Qian Yang | Mila - Quebec AI Institute

I am a third-year Ph.D. candidate (2023.09 – ) at Mila and Université de Montréal, advised by Prof. Aishwarya Agrawal. My research centers on faithful and efficient multimodal foundation models. My current work focuses on visual reasoning for spatial understanding with unified multimodal modeling. I have also worked on online data selection for MLLM training (CVPR 2026), modality alignment for resource-efficient MLLMs (CVPR 2025 Highlight), benchmarking code hallucination (AAAI 2025), and task decomposition for faithful multimodal reasoning (EMNLP 2024). I co-organized the VLMs4All workshop at CVPR 2025, centered on diversity in vision-language models. Before that, I obtained my Master's degree in Computer Science from Harbin Institute of Technology, Shenzhen, China (2020-2023) under the supervision of Prof. Baotian Hu, and my Bachelor's degree in Computer Science from the University of Electronic Science and Technology of China (2016-2020). I was also a research intern at Alibaba DAMO Academy.

News

2026.04: Awarded the FRQNT Doctoral Scholarship (25,000 CAD/year, 2026-2028)!
2026.03: Awarded the AI Scholarship from Université de Montréal (10,000 CAD)!
2026.02: One paper accepted to CVPR 2026!
2025.03: One paper accepted to CVPR 2025 as a Highlight!
2024.12: One paper accepted to AAAI 2025!
2024.09: Two papers accepted to EMNLP 2024 and EMNLP 2024 Findings!
2023.09: Started my Ph.D. at Mila!
2023.06: One paper accepted to ACM MM 2023!
2023.03: Obtained my Master's degree from Harbin Institute of Technology, Shenzhen!
2022.06: One paper accepted to ACM MM 2022!
2022.05: Joined Alibaba DAMO Academy as a research intern!
2022.01: One paper accepted to IEEE Transactions on Multimedia!

Publications (^* denotes equal contribution)

2026

How and What to Imagine? Visual Thinking in Unified Multimodal Models for Cross-View Spatial Reasoning

Qian Yang, Ankur Sikarwar, Huy Le, Le Zhang, Zhuan Shi, Perouz Taslakian, Aishwarya Agrawal

Preprint.

arXiv

Learning What Matters: Prioritized Concept Learning via Relative Error-driven Sample Selection

Qian Yang^*, Shivam Chandhok^*, Oscar Manas, Kanishk Jain, Leonid Sigal, Aishwarya Agrawal

The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026.

CVPR Project Website

Discovering Failure Modes in Vision-Language Models using RL

Kanishk Jain, Qian Yang, Shravan Nayak, Parisa Kordjamshidi, Nishanth Anand, Aishwarya Agrawal

Preprint.

arXiv

2025

Assessing and Learning Alignment of Unimodal Vision and Language Models

Le Zhang, Qian Yang, Aishwarya Agrawal

The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025.

CVPR Highlight Project Website

2024

Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison

Qian Yang, Weixiang Yan, Aishwarya Agrawal

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

EMNLP

MuGI: Enhancing Information Retrieval through Multi-Text Generation Integration with Large Language Models

Le Zhang, Yihong Wu, Qian Yang, Jian-Yun Nie

Findings of the Association for Computational Linguistics: EMNLP 2024.

EMNLP Findings

CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

Yuchen Tian^*, Weixiang Yan^*, Qian Yang, Xuandong Zhao, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma, Dawn Song

The 39th Annual AAAI Conference on Artificial Intelligence.

AAAI

2023

Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation

Qian Yang, Qian Chen, Wen Wang, Baotian Hu, Min Zhang

Proceedings of the 31st ACM International Conference on Multimedia, 2023.

ACM MM

2022

Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations

Qian Yang^*, Yunxin Li^*, Baotian Hu, Lin Ma, Yuxin Ding, Min Zhang

Proceedings of the 30th ACM International Conference on Multimedia, 2022.

ACM MM

Fast and Robust Online Handwritten Chinese Character Recognition with Deep Spatial & Contextual Information Fusion Network

Yunxin Li, Qian Yang, Qingcai Chen, Baotian Hu, Xiaolong Wang, Yuxin Ding, Lin Ma

IEEE Transactions on Multimedia, 2022.

IEEE Transactions on Multimedia

Awards

2026: FRQNT Doctoral Scholarship, Fonds de recherche du Québec – Nature et technologies (25,000 CAD/year, 2026-2028)
2026: AI Scholarship, Université de Montréal (10,000 CAD)

Service

Reviewer: AAAI 2024, ECCV 2024, CVPR 2024, ACM Multimedia 2024, ACM Multimedia 2023, COLING 2022.

Education

2023.09 - Present: Ph.D. in Computer Science, Mila & Université de Montréal, Quebec, Canada
2020.09 - 2023.03: Master of Science in Computer Science, Harbin Institute of Technology, Shenzhen, China
2016.09 - 2020.06: B.Eng., University of Electronic Science and Technology of China, Chengdu, China

Internships

2022.05 - 2022.10: Alibaba DAMO Academy, Hangzhou, China