Weihang Li
M.Sc. Weihang Li
- Phone: -
- E-mail: weihang.li@tum.de
- Personal Website: https://colin-de.github.io/
- Room: STC
Research Interests
- World Model
- Embodied AI / Robotics
- 3D / 4D Vision (Object / Scene-level Pose, Depth)
Feel free to contact me about collaborating on the above topics.
Some recent work: OPT-Pose (CVPR 2026), TRICKY-Housecat (ICCV 2025W), GCE-Pose (CVPR 2025), AFT-CT (in submission to TRO), DynSUP (in submission to TIP), SCRREAM (NeurIPS 2024), Texture2LoD (CVPR 2025W), Kb-PbD (IROS 2024)
Professional Services
- Reviewer for CVPR, ICCV, NeurIPS, ECCV, ICRA, IROS, BMVC, RAL, TRO, ACL
- Challenges and Workshops:
- Organizer of the 1st ICCV Workshop and Challenge on Category-Level Object Pose Estimation in the Wild @ ICCV 2025
- Organizer of the HouseCat-Tricky Challenge with the Workshop on Transparent & Reflective Objects in the Wild @ ICCV 2025
Curriculum Vitae
Hi, I'm a PhD student at TUM & MCML, supervised by Prof. Benjamin Busam and Prof. Nassir Navab. During my Master's studies in Robotics, Cognition, Intelligence at TUM, I conducted research at CAMP, fortiss, Photogrammetry and Remote Sensing with Prof. Olaf Wysocki, HKUST-GZ with Prof. Haoang Li, and CVG with Prof. Daniel Cremers.
Student Projects
If you are interested in Embodied AI or 3D foundation models and want to do a research project with us (Master's Thesis / IDP / Guided Research / HiWi), feel free to reach out at any time with your CV and transcript :) We always welcome motivated students aiming for top-tier conferences and journals such as CVPR/ICCV/NeurIPS/IJCV/TIP/ICRA, and we offer support with computation resources (A100/H100/RTX 5090) and robotics hardware (Franka).
Open topics:
[0] Long-term 4D Dynamic Object Reconstruction and Pose Estimation
The current feed-forward geometric foundation model supports reasoning about object geometry and pose, and it generalizes well to novel objects across several benchmarks. This project targets a more realistic, everyday application: long-term tracking and pose understanding.
Reference: arxiv.org/abs/2603.23370
[1] Structured Semantics 3D Reconstruction
Standard 3D reconstruction methods produce raw geometry: point clouds or meshes. Such output is not a representation from which measurements can be directly extracted. This project targets a more useful output: given ground-level imagery of a building, produce a structured wireframe from which meaningful quantities can be read off directly, such as roof dimensions, slope angles, and edge classifications. Such a representation has clear practical value, for example in estimating roof geometry and automatically identifying optimal solar panel placement.
Reference: https://arxiv.org/abs/2503.08208
[2] SLAM Foundation Models (in collaboration with Cambridge)
Our goal is to investigate how emerging 3D foundation models can be extended to support adaptive and continuous online scene mapping.
This research aims to bridge large-scale 3D representation learning with real-time SLAM and robotic perception in dynamic environments.
Some related references:
arxiv.org/abs/2512.25008
any-4d.github.io
https://arxiv.org/pdf/2601.09499
We are open to remote collaboration, a visit to Cambridge, or a Master’s Thesis in our group.
[3] Uncertainty-aware 4D World Models (in collaboration with Cambridge)
Our goal is to investigate how to introduce uncertainty modeling under dynamic and incomplete observation conditions to build more robust 4D world models. This research aims to extend existing 4D reconstruction frameworks to achieve more reliable state estimation and future prediction in real-world scenarios with noise, occlusion, and sensor degradation.
Some relevant references:
https://www.robots.ox.ac.uk/~vgg/research/vdpm/
https://st4rtrack.github.io/
any-4d.github.io
Teaching
Awards
- Honorable Mention Award in the S23DR Challenge at CVPR 2024.
- Mentored student Haoliang Huang, who won first place in the WLCOP Challenge at ICCV 2025.
Publications
See Google Scholar