Wenhu Chen

Wenhu Chen [陈文虎 in Chinese]

Researcher in LLM/Agents/Multimodality

Email: wenhuchen [at] uwaterloo [dot] ca | hustchenwenhu [at] gmail [dot] com

Google Scholar / CV (updated in July 25) / Github / Twitter

Biography

Wenhu Chen is an assistant professor at the University of Waterloo (on leave). He obtained the Canada CIFAR AI Chair Award in 2022. He worked at Google DeepMind from 2021 to 2025, where he contributed to the Gemini multimodal and evaluation efforts. Before that, he obtained his PhD from the CS department of the University of California, Santa Barbara. His research interests lie in natural language processing, deep learning, and multimodal learning. He aims to design models that handle complex reasoning scenarios, such as math problem-solving and knowledge grounding. He is also interested in building more powerful multimodal models to bridge different modalities. He won the prestigious Golden Jubilee Research Excellence Award at the University of Waterloo in 2025. He won the best paper award at TMLR 2025. He received the Area Chair Award at AACL-IJCNLP 2023, the Best Paper Honorable Mention at WACV 2021, and the UCSB CS Outstanding Dissertation Award in 2021.

Research Interests

My research interests cover the following areas:

Reasoning
Information Retrieval
Benchmarks and Evaluation
Generative AI

Research Highlights

You might have heard of me because of the following work I conducted.

1. Natural Language Processing (LLMs)

Program-of-Thoughts: A prompting strategy to use tools to solve complex reasoning tasks
MAmmoTH/MAmmoTH2/General Reasoner: Advancing reasoning models to solve complex reasoning tasks
OpenCodeInterpreter/AceCoder: Advanced coding language models for complex tasks

2. Multimodal Understanding (Image & Video)

MuRAG/UniIR/MagicLens/VLM2Vec/VLM2Vec-V2: frameworks that enable unified and compositional multimodal information retrieval
Mantis/MAmmoTH-VL/MAmmoTH-VL2: advanced vision-language models with better reasoning skills
VL-Rethinker/Pixel-Reasoner: advanced vision-language models with better reasoning skills

3. Multimodal Generation (Image & Video)

Re-Imagen/SuTI/Kosmos-G/Instruct-Imagen: effective, efficient, and controllable image generation models
T2V-Turbo/T2V-Turbo-v2: efficient text-to-video generation models
MagicBrush/OmniEdit/AnyV2V: powerful image and video editing models

4. Benchmarks & Evaluation

MMMU/MMLU-Pro/TheoremQA/MEGA-Bench: commonly used evaluation suites for language models and vision-language models
TabFact/HybridQA/OTT-QA: Table and text reasoning evaluation benchmarks

5. Others

MERT/ChatMusician/YuE: Foundation models for understanding and composing music
MAP-Neo/Fine-FineWeb: Fully open-source language models with high-quality pre-training datasets
KB-BINDER/TableCoT/StructLM: Grounding foundation models on structured knowledge
TheoremExplainAgent: Building agents for composing educational videos

TIGER Lab

I direct the Text and Image GEnerative Research (TIGER) lab. My lab studies generative models across modalities, including text, images, videos, and music. We are committed to building powerful state-of-the-art models for various domains. Our lab is always looking for talented and self-motivated students.

Awards

2025: Outstanding Paper Award at TMLR 2025
2025: Math Golden Jubilee Award
2024: CVPR Best Paper Finalist
2023: AACL-IJCNLP23 Area Chair Award
2022: Canada CIFAR AI Chair
2021: UCSB CS Outstanding Dissertation Award
2021: WACV21 Best Student Paper Honorable Mention

Funding and Grants

CIFAR AI Chair Award
- Title: Accessing Diverse Web Knowledge with Natural Language Interface
- Years: 2022 - 2027
- Amount: 725,000 CAD
NSERC Discovery
- Title: Building Semiparametric Models to Decouple Knowledge from Computation
- Years: 2023 - 2028
- Amount: 12,500 CAD
Mitacs Accelerate
- Title: Question Answering over Long Clinical Documents
- Years: 2024 - 2026
- Amount: 90,000 CAD
CIFAR AI Catalyst
- Title: Generating Images with Multimodal Instruction
- Years: 2024 - 2026
- Amount: 100,000 CAD
National Research Council Canada - AI4D Funding
- Title: Accelerating Scientific Discovery with Foundation Models
- Years: 2024 - 2026
- Amount: 118,000 CAD
National Research Council Canada - New Beginning
- Title: Building More Efficient Visual Generative Models
- Years: 2025 - 2026
- Amount: 25,000 CAD
CIFAR Solution Network - Safer AI for the Global South
- Title: A Citizen-Centered Co-Creating Dialect Bias Benchmarks, Mitigation Tools, and Policy Solutions
- Years: 2026 - 2028
- Amount: 700,000 CAD
CFI-JELF
- Title: Enriching the linguistic diversity of open language models
- Years: 2026 - 2030
- Amount: 336,000 CAD