CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout

Haotian Bai1, Yuanhuiyi Lyu1, Lutao Jiang1, Sijia Li2, Haonan Lu2, Xiaodong Lin2, Lin Wang1,3
1Hong Kong University of Science and Technology (Guangzhou), 2OPPO, 3Hong Kong University of Science and Technology

CompoNeRF parses and regenerates consistent multi-object environments from text prompts and layouts. Individual NeRFs, each denoted by a unique prompt color, can be composed, decomposed, and recomposed with ease. (a) shows the composed results. (b), (c), (d), and (e) show recomposition results after the manipulations demonstrated above, each applied separately: duplication, transformation, loading decomposed NeRFs, and semantic editing.
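The layout edits named in the caption can be illustrated with a small sketch. This is not the authors' code: the `LayoutBox` class, the `translate` and `edit_prompt` helpers, and the example prompts are all hypothetical, and the sketch assumes each object is described by a text prompt plus an axis-aligned 3D box. It covers only the bookkeeping for duplication, transformation, and semantic editing, not NeRF training or rendering.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class LayoutBox:
    """Hypothetical layout entry: one object prompt and its 3D box."""
    prompt: str
    center: Vec3
    size: Vec3


def translate(box: LayoutBox, offset: Vec3) -> LayoutBox:
    """Return a copy of the box moved by `offset` (a transformation edit)."""
    cx, cy, cz = box.center
    dx, dy, dz = offset
    return replace(box, center=(cx + dx, cy + dy, cz + dz))


def edit_prompt(box: LayoutBox, prompt: str) -> LayoutBox:
    """Return a copy of the box with a new prompt (a semantic edit)."""
    return replace(box, prompt=prompt)


# Illustrative scene; prompts and coordinates are made up for this sketch.
scene: List[LayoutBox] = [
    LayoutBox("a wooden table", (0.0, 0.0, 0.0), (1.0, 0.5, 1.0)),
    LayoutBox("a glass of wine", (0.0, 0.5, 0.0), (0.2, 0.4, 0.2)),
]

# Duplication: copy an existing object and place the copy elsewhere.
scene.append(translate(scene[1], (0.5, 0.0, 0.0)))

# Semantic editing: keep the box, change what it should contain.
scene[0] = edit_prompt(scene[0], "a marble table")
```

Because each box is immutable, every edit yields a new layout entry and leaves the original untouched, which mirrors the idea of recomposing a scene from independently stored objects.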

Motivation

The guidance collapse issue. Multi-object scenes are generated under the guidance of a frozen Stable Diffusion model. When the global text prompt is used directly as guidance, instances of guidance collapse are observed.

Comparison

Composition

Recomposition

Video

BibTeX

@misc{bai2024componerftextguidedmultiobjectcompositional,
  title={CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout},
  author={Haotian Bai and Yuanhuiyi Lyu and Lutao Jiang and Sijia Li and Haonan Lu and Xiaodong Lin and Lin Wang},
  year={2024},
  eprint={2303.13843},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2303.13843},
}