Blaise Cruz

Mabuhay! 👋

I’m a PhD student at MBZUAI, supervised by Dr. Alham Fikri Aji, working on novel approaches to Modeling Multilinguality. In particular, my work centers on methods that reframe multilinguality in an efficient and linguistically motivated manner, beyond simply cramming hundreds of unique languages into the same billion-ish parameters.

Besides this, I’ve worked on various other topics under the multilinguality and low-resource umbrella.

Prior to my PhD, I was Lead Research Engineer at Samsung Research where I worked on low-resource machine translation and dialogue generation. I’ve also previously been affiliated with Mila - Quebec AI Institute and McGill University, the University of the Philippines, De La Salle University, and Senti AI.

If you’re interested in collaborating or want to chat about low-resource languages, feel free to get in touch! You can reach me by email at me (at) blaisecruz (dot) com.


News

Jan 20, 2026 Our new paper on algorithm-focused benchmarking for competitive programming, Idea First, Code Later, is finally out!
Jan 16, 2026 Proud to release my new work, Multilinguality as Sense Adaptation! Many thanks to Mila - Quebec AI Institute and McGill NLP for hosting me in Montréal and making the work possible.
Aug 21, 2025 Three papers accepted for EMNLP 2025!
Aug 12, 2025 We’re proud to release FilBench, the first Open LLM Evaluation Suite and Leaderboard for Filipino!
Jul 09, 2025 We’re excited to announce MoMentS, a new comprehensive multimodal benchmark for theory of mind in large language models!

Latest posts

Jun 12, 2024 Welcome!

Selected Publications

  1. arXiv
    Multilinguality as Sense Adaptation
    Jan Christian Blaise Cruz, David Ifeoluwa Adelani, and Alham Fikri Aji
    2026
  2. EMNLP
    FilBench: Can LLMs Understand and Generate Filipino?
    Lester James V Miranda*, Elyanah Aco*, Conner Manuel*, Jan Christian Blaise Cruz†, and Joseph Marvin Imperial†
    In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
  3. EMNLP
    Oral Presentation
    Multilingual Large Language Models Are Not (Yet) Code-Switchers
    Ruochen Zhang*, Samuel Cahyawijaya*, Jan Christian Blaise Cruz*, Genta Indra Winata*, and Alham Fikri Aji*
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023