Blaise Cruz


Mabuhay! 👋

I’m a PhD student at MBZUAI, supervised by Dr. Alham Fikri Aji, specializing in problems at the intersection of multilinguality and low-resource languages.

In particular, I am interested in understanding how models behave when constrained to low-resource multilingual domains. I’ve collaborated with many talented colleagues on various topics under this umbrella.

Prior to my PhD, I was Lead Research Engineer at Samsung Research in the Philippines where I worked on low-resource machine translation and dialogue generation. I have also been previously affiliated with the University of the Philippines, De La Salle University, and Senti AI.

If you’re interested in collaborating or want to chat about low-resource languages, feel free to get in touch! You can reach me by email at me (at) blaisecruz (dot) com.


News

Jun 17, 2024 The preprint for our paper SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages is out!
Jun 12, 2024 The preprint for our paper CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark is out!
May 15, 2024 I’ll be joining the Mohammed bin Zayed University of Artificial Intelligence as a PhD student this Fall 2024!
Mar 06, 2024 The SEACrowd Data Catalogue – the main consolidated repository for all datasets collected by the SEACrowd Project – is now live!

Latest Posts

Jun 12, 2024 Welcome!

Selected Publications

  1. EMNLP
    Multilingual Large Language Models Are Not (Yet) Code-Switchers
    Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, Genta Indra Winata, and Alham Fikri Aji
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  2. LREC
    Improving Large-scale Language Models and Resources for Filipino
    Jan Christian Blaise Cruz and Charibeth Cheng
    In Proceedings of the 13th Language Resources and Evaluation Conference, 2022
  3. WMT
    Data Processing Matters: SRPH-Konvergen AI’s Machine Translation System for WMT’21
    Lintang Sutawika and Jan Christian Blaise Cruz
    In Proceedings of the Sixth Conference on Machine Translation, 2021