Yunze Man

News

[06/2026] Defended my PhD thesis “Visually Grounded Multimodal Models for Spatial Intelligence”!
[05/2026] Joined NVIDIA GEAR as a Research Scientist.
[03/2026] Gave a Lightning Talk on LocateAnything3D at NVIDIA GTC 2026.
[02/2026] LocateAnything3D and Fast-ThinkAct accepted to CVPR 2026; Capturing accepted to ICLR 2026.
[08/2025] Started internship at NVIDIA GEAR. Excited to work on Generalist Robotics!
[12/2024] Received the NVIDIA Graduate Fellowship 2025.
[11/2024] Selected as one of the Top Reviewers in NeurIPS 2024.
[09/2024] Lexicon3D accepted to NeurIPS 2024!
[09/2024] SceneCraft accepted to NeurIPS 2024!
[05/2024] Selected as one of the Outstanding Reviewers in CVPR 2024.
[05/2024] Started my internship at NVIDIA Research. Looking forward to seeing you in the Bay Area!
[01/2024] LLM4Vision accepted to ICLR 2024 (Spotlight)!
[02/2023] I passed the qualifying exam and officially became a Ph.D. candidate!

Selected Publications

Please refer to my Google Scholar profile for the full list of publications.

(* indicates equal contribution)

Vesta: A Generalist Embodied Reasoning Model

Johan Bjorck*, Zhiqi Li*, Yunze Man*, Jing Wang*, An-Chieh Cheng, Sifei Liu, Shihao Wang, Zhiding Yu, et al.

Tech Report / Paper / Project

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Shihao Wang, Shilong Liu, Yuanguo Kuang, Xinyu Wei, Yangzhou Liu, Zhiqi Li, Yunze Man, Guo Chen, Andrew Tao, Guilin Liu, Jan Kautz, Lei Zhang, Zhiding Yu

ECCV 2026 / Paper / Project

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Chi-Pin Huang, Yunze Man, Zhiding Yu, Min-Hung Chen, Jan Kautz, Yu-Chiang Frank Wang, Fu-En Yang

CVPR 2026 / Paper / Project

LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight

Yunze Man, Shihao Wang, Guowen Zhang, Johan Bjorck, Zhiqi Li, Liang-Yan Gui, Jim Fan, Jan Kautz, Yu-Xiong Wang, Zhiding Yu

CVPR 2026 / Paper / Project

Capturing Visual Environment Structure Correlates with Control Performance

Jiahua Dong, Yunze Man, Pavel Tokmakov, Yu-Xiong Wang

ICLR 2026 / Paper / Project

GR00T N1.6: An Improved Open Foundation Model for Generalist Humanoid Robots

GR00T Team

Blog Post / Project / Model Card

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Yunze Man, De-An Huang, Guilin Liu, Shiwei Sheng, Shilong Liu, Liang-Yan Gui, Jan Kautz, Yu-Xiong Wang, Zhiding Yu

CVPR 2025 / Paper / Project

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Ziqi Pang*, Tianyuan Zhang*, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang

CVPR 2025 / Paper / Project

Oral presentation

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2025 / Paper / Project

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Reasoning

Yunze Man, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Liang-Yan Gui, Yu-Xiong Wang

NeurIPS 2024 / Paper / Project

SceneCraft: Layout-Guided 3D Scene Generation

Xiuyu Yang*, Yunze Man*, Junkun Chen, Yu-Xiong Wang

NeurIPS 2024 / Paper / Project

LLM4Vision: Frozen Transformers from Language Models are Effective Visual Encoder Layers

Ziqi Pang, Ziyang Xie*, Yunze Man*, Yu-Xiong Wang

ICLR 2024 / Paper / Project

Spotlight presentation

SituationVLM: Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 / Paper / Project

Industry Experience

[05/2026 ~ Present], NVIDIA Generalist Embodied Agent Research (GEAR), Research Scientist
[12/2024 ~ 03/2026], OpenAGI Foundation, Founding Researcher
[08/2025 ~ 03/2026], NVIDIA Generalist Embodied Agent Research (GEAR), Research Scientist Intern
[05/2024 ~ 12/2024], NVIDIA Learning and Perception Research, Research Scientist Intern
[05/2022 ~ 08/2022], Adobe Research, Research Scientist Intern

Professional Service

Reviewer for CVPR, ECCV, ICCV, ICLR, NeurIPS, ICML, AAAI, IROS, ICRA, TMLR, WACV
2021 - 2026
Teaching Assistant

Learning to Learn (CS598), UIUC
Fall 2022
Efficient & Predictive Vision (CS598), UIUC
Spring 2022
Machine Learning (CS446), UIUC
Fall 2021
Computer Vision Capstone (16-621), CMU
Spring 2020, 2021

Yunze Man 「满运泽」