Floating No More: Object-Ground Reconstruction from a Single Image


University of Illinois Urbana-Champaign   Purdue University   Adobe

Our proposed ORG (Object Reconstruction with Ground) model simultaneously reconstructs a 3D object, estimates camera parameters, and models the object-ground relationship from a monocular image. In shadow and reflection generation, prior depth-based object geometry estimation can produce floating objects or unnatural shadows on the ground, as shown in the red boxes. Our method, in contrast, yields significantly more realistic editing and generation, as shown in the blue boxes.

Overview

Recent advancements in 3D object reconstruction from single images have primarily focused on improving the accuracy of object shapes. Yet, these techniques often fail to accurately capture the inter-relation between the object, ground, and camera. As a result, the reconstructed objects often appear floating or tilted when placed on flat surfaces. This limitation significantly affects 3D-aware image editing applications like shadow rendering and object pose manipulation. To address this issue, we introduce ORG (Object Reconstruction with Ground), a novel task aimed at reconstructing 3D object geometry in conjunction with the ground surface. Our method uses two compact pixel-level representations to depict the relationship between camera, object, and ground. In this work, we

  • Propose a novel framework for in-the-wild single-view object-ground 3D geometry estimation. To the best of our knowledge, this is the first method to jointly model the object, camera, and ground plane from a single image.
  • Propose a perspective-field-guided pixel height re-projection module that efficiently converts our estimated representations into common depth maps and point clouds.
  • Show that ORG achieves outstanding shadow generation and reconstruction performance on unseen real-world images, demonstrating strong robustness and generalization ability.
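
For readers unfamiliar with the last step of such a conversion, the sketch below shows how a depth map is commonly lifted to a point cloud with a pinhole camera model. It is not the ORG re-projection module: it does not use pixel height or perspective fields, and the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions introduced only for illustration.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy, mask=None):
    """Back-project an (H, W) depth map to an (N, 3) point cloud.

    Assumes a standard pinhole camera (hypothetical intrinsics fx, fy, cx, cy)
    and returns points in camera coordinates: x right, y down, z forward.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape

    # Pixel grid: u is the column index, v is the row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))

    # Invert the pinhole projection u = fx * x / z + cx, v = fy * y / z + cy.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1)

    # Keep only pixels with a finite, positive depth (and inside the mask).
    valid = np.isfinite(depth) & (depth > 0)
    if mask is not None:
        valid &= np.asarray(mask, dtype=bool)
    return points[valid]
```

Passing an object mask as `mask` would restrict the output to foreground pixels, which is how an object-only point cloud could be extracted from a full-image depth map.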

Results

ORG preserves the object-ground relationship better than prior methods, leading to much more realistic shadows and reflections, as shown in the blue boxes. ORG also readily outputs representations such as depth maps and point clouds.

More Results in Reconstruction

Visualization of ORG outputs, including pixel height, (foreground) perspective fields, depth map, and object-ground reconstruction. ORG generalizes to various object categories.

BibTeX

If you find our code and paper helpful, please consider citing our work:

@article{man2024org,
      title={Floating No More: Object-Ground Reconstruction from a Single Image},
      author={Man, Yunze and Sheng, Yichen and Zhang, Jianming and Gui, Liang-Yan and Wang, Yu-Xiong},
      journal={arXiv preprint arXiv:2407.18914},
      year={2024}
    }