

Zhongang Cai   蔡中昂


Hi there! I’m a Staff Research Scientist at SenseTime Research, working with Dr. Lei Yang. My research focuses on spatial intelligence, for which I lead the development of SenseNova-SI and EASI, advancing both scalable training and holistic evaluation for spatially capable models. As a side project, I also lead DLP3D, an open-source framework for real-time autonomous 3D characters powered by large language models. I received my Ph.D. from MMLab@NTU, advised by Prof. Ziwei Liu and Prof. Chen Change Loy, where I spent wonderful years exploring virtual humans.

Google Scholar · X · GitHub · HuggingFace · YouTube · LinkedIn

News

[2026-02] Release of A Very Big Video Reasoning Suite (VBVR).

[2026-02] SenseNova-SI, ConsistCompose, and VLM-Guided HMR have been accepted to CVPR 2026.

[2026-01] ViMoGen has been accepted to ICLR 2026.

[2025-12] Invited talk on SenseNova-SI (slides and recording) at Plutons.

[2025-12] Invited talk on Embodied Intelligence (slides) at TriFusion Workshop.

[2025-11] Release of SenseNova-SI: Scaling Spatial Intelligence with Multimodal Foundation Models.

[2025-10] Release of the source code of DLP3D. Try it now at dlp3d.ai!

[2025-10] Digital Life Project 2 (DLP3D) has been accepted to SIGGRAPH Asia 2025 (Real-Time Live!).

[2025-10] SMPLest-X has been accepted to TPAMI 2025.

[2025-09] PoseFuse3D-KI has been accepted to NeurIPS 2025.

[2025-08] Release of EASI: Holistic Evaluation of Multimodal LLMs on Spatial Intelligence.

[2025-06] DPoser-X (Oral) has been accepted to ICCV 2025.

[2025-05] ADHMR has been accepted to ICML 2025.

[2025-02] SOLAMI, Disco4D, and EgoLife have been accepted to CVPR 2025.

[2025-01] MeshAnything has been accepted to ICLR 2025.

[2024-12] I have started a new role as a Staff Research Scientist at SenseTime Research.

[2024-09] Release of HuMMan v1.0: Motion Generation Subset and GTA-Human II Dataset.

[2024-08] Release of HuMMan v1.0: 3D Vision Subset.

[2024-08] GTA-Human has been accepted to TPAMI 2024 after two years of review!

[2024-07] WHAC and Large Motion Model have been accepted to ECCV 2024.

[2024-06] I have defended my Ph.D. thesis, "Scaling Up Parametric Human Recovery"! 🎓

[2024-04] Invited talk (slides) at China3DV 2024.

[2024-02] Digital Life Project, AiOS, and GaussianEditor have been accepted to CVPR 2024.

[2024-02] Invited talk on SMPLer-X (recording) at International Digital Economy Academy (IDEA).

My Three Favorite Works [Full List]



A Very Big Video Reasoning Suite

Maijunxian Wang*, Ruisi Wang*, Juyi Lin*, Ran Ji*, Thaddäus Wiedemer, Qingying Gao, Dezhi Luo, Yaoyao Qian, Lianyu Huang, Zelong Hong, Jiahui Ge, Qianli Ma, Hang He, Yifan Zhou, Lingzi Guo, Lantao Mei, Jiachen Li, Hanwen Xing, Tianqi Zhao, Fengyuan Yu, Weihang Xiao, Yizheng Jiao, Jianheng Hou, Danyang Zhang, Pengcheng Xu, Boyang Zhong, Zehong Zhao, Gaoyun Fang, John Kitaoka, Yile Xu, Hua Xu, Kenton Blacutt, Tin Nguyen, Siyuan Song, Haoran Sun, Shaoyue Wen, Linyang He, Runming Wang, Yanzhi Wang, Mengyue Yang, Ziqiao Ma, Raphaël Millière, Freda Shi, Nuno Vasconcelos, Daniel Khashabi, Alan Yuille, Yilun Du, Ziming Liu, Bo Li, Dahua Lin, Ziwei Liu, Vikash Kumar, Yijiang Li, Lei Yang, Zhongang Cai✉, Hokin Deng✉.
Technical Report, 2026

Homepage · PDF · Data · Model · EvalKit · Leaderboard

Scaling Spatial Intelligence with Multimodal Foundation Models

Zhongang Cai*, Ruisi Wang*, Chenyang Gu*, Fanyi Pu*, Junxiang Xu*, Yubo Wang*, Wanqi Yin*, Zhitao Yang*, Chen Wei*, Qingping Sun*, Tongxi Zhou*, Jiaqi Li*, Hui En Pang*, Oscar Qian*, Yukun Wei, Zhiqian Lin, Xuanke Shi, Kewang Deng, Xiaoyang Han, Zukai Chen, Xiangyu Fan, Hanming Deng, Lewei Lu, Liang Pan, Bo Li, Ziwei Liu✉, Quan Wang✉, Dahua Lin✉, Lei Yang*✉.
Computer Vision and Pattern Recognition (CVPR), 2026

PDF · Code · HuggingFace · ModelScope

Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
Previously: Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Zhongang Cai*, Yubo Wang*, Qingping Sun*, Ruisi Wang*, Chenyang Gu*, Wanqi Yin*, Zhiqian Lin*, Zhitao Yang*, Chen Wei*, Oscar Qian*, Hui En Pang*, Xuanke Shi, Kewang Deng, Xiaoyang Han, Zukai Chen, Jiaqi Li, Xiangyu Fan, Hanming Deng, Lewei Lu, Bo Li, Ziwei Liu, Quan Wang✉, Dahua Lin✉, Lei Yang*✉.
Technical Report, 2025

PDF · Code · Leaderboard