BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation
Yuanhong Yu1,2 • Xingyi He1 • Chen Zhao3 • Junhao Yu4 • Jiaqi Yang5 • Ruizhen Hu6 • Yujun Shen2 • Xing Zhu2 • Xiaowei Zhou1 • Sida Peng1
1State Key Lab of CAD & CG, Zhejiang University 2Ant Group 3EPFL 4Chongqing University 5Northwestern Polytechnical University 6Shenzhen University
This interactive demo lets you estimate 6DoF object poses from in-the-wild images.
Try an Example Video
🎨 Annotation Tools
Reconstructor
- Point Mode: Click to add foreground points
- Bounding Box Mode: The first click sets the top-left corner; the second click sets the bottom-right corner
- Only one object is supported for annotation and prediction
- Camera intrinsics are estimated from reference images using DUSt3R
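The two annotation modes above can be sketched in a few lines of Python. This is only an illustration of the click logic described in the notes, not the demo's actual code: the names `AnnotationState`, `add_click`, and `bbox_from_clicks` are hypothetical, and the reordering of reversed clicks is an assumption added for robustness.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def bbox_from_clicks(first: Point, second: Point) -> Box:
    """Build a box from two clicks.

    The demo expects top-left first, bottom-right second. Sorting the
    coordinates (an assumption of this sketch) also tolerates the
    reverse order.
    """
    (x0, y0), (x1, y1) = first, second
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


@dataclass
class AnnotationState:
    """Hypothetical per-image annotation state for a single object."""
    mode: str = "point"  # "point" or "bbox"
    points: List[Point] = field(default_factory=list)
    pending_corner: Optional[Point] = None
    box: Optional[Box] = None

    def add_click(self, x: int, y: int) -> None:
        if self.mode == "point":
            # Point mode: every click adds one foreground point.
            self.points.append((x, y))
        elif self.pending_corner is None:
            # Bounding box mode, first click: remember the corner.
            self.pending_corner = (x, y)
        else:
            # Bounding box mode, second click: close the box.
            self.box = bbox_from_clicks(self.pending_corner, (x, y))
            self.pending_corner = None


state = AnnotationState(mode="bbox")
state.add_click(40, 30)
state.add_click(200, 180)
print(state.box)
```

Keeping a single `box` and a single list of `points` mirrors the note that only one object is supported per annotation.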
🎨 Reference Annotation Tools
🎨 Query Annotation Tools
Try an Example (Reference Images + Query Video)
| Upload Reference Images | Upload Query Video |
|---|---|
Reconstructor
- Only one object is supported for annotation and object pose prediction
- Camera intrinsics are assumed to differ between the reference views and the query video
- Intrinsics for query video are estimated from its first frame using DUSt3R
- Use the same annotation approach (points or bbox) for both reference and query
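To see why the reference views and the query video each need their own intrinsics, the sketch below projects the same 3D point with two different pinhole cameras. It does not call DUSt3R and does not reflect how the demo estimates intrinsics; the helper names `make_intrinsics` and `project`, the focal lengths, and the assumption that the principal point sits at the image center are all illustrative.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def make_intrinsics(focal_px: float, width: int, height: int) -> Matrix3:
    """Pinhole intrinsics with square pixels and a centered principal
    point (both are simplifying assumptions of this sketch)."""
    return [
        [focal_px, 0.0, width / 2.0],
        [0.0, focal_px, height / 2.0],
        [0.0, 0.0, 1.0],
    ]


def project(K: Matrix3, point_cam: Tuple[float, float, float]) -> Tuple[float, float]:
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return (u, v)


# Hypothetical values: the reference images and the query video come
# from different cameras, so each gets its own matrix.
K_reference = make_intrinsics(focal_px=500.0, width=640, height=480)
K_query = make_intrinsics(focal_px=800.0, width=1280, height=720)

corner = (0.5, 0.25, 2.0)  # one 3D box corner, in camera coordinates
print(project(K_reference, corner))
print(project(K_query, corner))
```

The same corner lands at different pixels under the two matrices, so reusing the reference intrinsics for the query video would bias the recovered pose.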
📦✨ BoxDreamer Demo - Estimating 6DoF object pose in the wild!
© 2025 - Built with Gradio | Powered by ZJU3DV