Authors: Amit Raj, Srinivas Kaza, Ben Poole, Michael Niemeyer, Nataniel Ruiz, Ben Mildenhall, Shiran Zada, Kfir Aberman, Michael Rubinstein, Jonathan Barron, Yuanzhen Li, Varun Jampani
We present DreamBooth3D, an approach to personalize text-to-3D generative models from as few as 3-6 casually captured images of a subject. Our approach combines recent advances in personalizing text-to-image models (DreamBooth) with text-to-3D generation (DreamFusion). We find that naively combining these methods fails to yield satisfactory subject-specific 3D assets due to personalized text-to-image models overfitting to the input viewpoints of the subject. We overcome this through a 3-stage optimization strategy where we jointly leverage the 3D consistency of neural radiance fields together with the personalization capability of text-to-image models. Our method can produce high-quality, subject-specific 3D assets with text-driven modifications such as novel poses, colors and attributes that are not seen in any of the input images of the subject.
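The 3-stage optimization can be sketched as a pipeline skeleton. This is a minimal illustrative sketch, not the authors' implementation: all function names, the stage boundaries, and the placeholder return values are assumptions made for exposition, with the heavy lifting (diffusion fine-tuning, NeRF optimization) replaced by stubs.

```python
# Hypothetical skeleton of a DreamBooth3D-style 3-stage pipeline.
# All names and stage details below are illustrative assumptions; the real
# method fine-tunes a diffusion model and optimizes a NeRF at each stage.

def stage1_partial_personalization(subject_images):
    """Stage 1 (assumed): partially fine-tune a text-to-image model on the
    subject images, stopping early so it does not overfit to the input
    viewpoints, then use it to optimize an initial NeRF."""
    return {"model": "partially_personalized_t2i", "nerf": "initial_nerf"}

def stage2_multiview_augmentation(stage1_out, num_views=4):
    """Stage 2 (assumed): render novel viewpoints from the initial NeRF to
    obtain multi-view pseudo-images of the subject."""
    return [f"pseudo_view_{i}" for i in range(num_views)]

def stage3_final_optimization(subject_images, pseudo_views, prompt):
    """Stage 3 (assumed): fine-tune on real plus multi-view pseudo images,
    then optimize the final subject-specific NeRF with the text prompt."""
    training_set = list(subject_images) + list(pseudo_views)
    return {"asset": f"3d_asset({prompt})", "num_train_images": len(training_set)}

def dreambooth3d(subject_images, prompt):
    stage1_out = stage1_partial_personalization(subject_images)
    pseudo_views = stage2_multiview_augmentation(stage1_out)
    return stage3_final_optimization(subject_images, pseudo_views, prompt)

asset = dreambooth3d(["img1.jpg", "img2.jpg", "img3.jpg"],
                     "a photo of the subject in a new pose")
print(asset["num_train_images"])  # 3 real + 4 pseudo multi-view images = 7
```

The point of the structure is the one the abstract states: the partially personalized model of stage 1 preserves 3D consistency, while the multi-view pseudo-images let the final stage personalize fully without collapsing to the few input viewpoints.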
Paper link: http://arxiv.org/pdf/2303.13508v1