Yunze Man 「满运泽」

I am a Ph.D. student in Computer Science at the University of Illinois Urbana-Champaign, advised by Yuxiong Wang and Liangyan Gui. My research is generously supported by the NVIDIA PhD Fellowship. I received my M.S. in Robotics from Carnegie Mellon University, advised by Kris Kitani, and my B.S. in Computer Science from Zhejiang University.

My research interests lie at the intersection of vision, machine learning, and robotics. I am working on developing vision-centric reasoning models for multimodal and embodied AI agents, with a focus on object-centric perception systems in dynamic scenes, vision foundation models for open-world scene understanding and generation, and large multimodal models for embodied reasoning and robotics planning.

Email:
[Google Scholar] [Github] [Twitter]

News

[12/2024] Received the NVIDIA Graduate Fellowship 2025.
[11/2024] Selected as one of the Top Reviewers in NeurIPS 2024.
[09/2024] Lexicon3D accepted to NeurIPS 2024!
[09/2024] SceneCraft accepted to NeurIPS 2024!
[05/2024] Selected as one of the Outstanding Reviewers in CVPR 2024.
[05/2024] Started my internship at NVIDIA Research. Looking forward to seeing you in the Bay Area!
[01/2024] LLM4Vision accepted to ICLR 2024 (Spotlight)!
[02/2023] I passed the qualifying exam and officially became a Ph.D. candidate!

Selected Publications

Please refer to my Google Scholar profile for the full list of publications.

(* indicates equal contribution)

GR00T N1.5: An Improved Open Foundation Model for Generalist Humanoid Robots

GR00T Team

Blog Post / Project / Model Card

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Yunze Man, De-An Huang, Guilin Liu, Shiwei Sheng, Shilong Liu, Liangyan Gui, Jan Kautz, Yu-Xiong Wang, Zhiding Yu

CVPR 2025 / Paper / Project

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Ziqi Pang*, Tianyuan Zhang*, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang

CVPR 2025 / Paper / Project

Oral presentation

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2025 / Paper / Project

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Reasoning

Yunze Man, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Liang-Yan Gui, Yu-Xiong Wang

NeurIPS 2024 / Paper / Project

SceneCraft: Layout-Guided 3D Scene Generation

Xiuyu Yang*, Yunze Man*, Junkun Chen, Yu-Xiong Wang

NeurIPS 2024 / Paper / Project

LLM4Vision: Frozen Transformers from Language Models are Effective Visual Encoder Layers

Ziqi Pang, Ziyang Xie*, Yunze Man*, Yu-Xiong Wang

ICLR 2024 / Paper / Project

Spotlight presentation

SituationVLM: Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 / Paper / Project

DualCross: Cross-Modality Cross-Domain Adaptation for Monocular BEV Perception

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

IROS 2023 / Paper / Project

BEV-Guided Multi-Modality Fusion for Driving Perception

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2023 / Paper / Project

Internship Experience

[05/2024 ~ 08/2024], NVIDIA Research, Research Intern, hosted by Zhiding Yu, De-An Huang, and Guilin Liu
[05/2022 ~ 01/2023], Adobe Research, Research Intern, hosted by Jianming Zhang

Professional Service

Reviewer for CVPR, ECCV, ICCV, ICLR, NeurIPS, ICML, AAAI, IROS, ICRA, TMLR
2021 - 2024

Teaching Assistant

Learning to Learn (CS598), UIUC
Fall 2022
Efficient & Predictive Vision (CS598), UIUC
Spring 2022
Machine Learning (CS446), UIUC
Fall 2021
Computer Vision Capstone (16-621), CMU
Spring 2020, 2021

Contact

Computer Science Department, University of Illinois Urbana-Champaign
201 N Goodwin Ave
Urbana, IL 61801