ViewCrafter takes advantage of the powerful generation capabilities of video diffusion models and the coarse 3D clues offered by a point-based representation to generate high-quality video frames with precise camera pose control.
ViewCrafter can generate high-fidelity novel views of generic scenes from a single image or sparse image sets using video diffusion models.
ViewCrafter uses an explicit point-based representation to provide coarse 3D clues, which is what enables precise camera pose control over the generated frames.
ViewCrafter adopts an iterative view synthesis strategy that alternates between moving the camera, generating novel views, and updating the point cloud, enabling long-range novel view synthesis (see the loop sketch after this list).
ViewCrafter can facilitate more consistent 3D Gaussian Splatting (3D-GS) optimization by progressively completing the initial point cloud and synthesizing novel views.
ViewCrafter can facilitate immersive experiences with real-time rendering by efficiently optimizing a 3D-GS representation using the reconstructed 3D points and the generated novel views.
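To make the iterative strategy above concrete, here is a minimal sketch of the loop it describes: render the current point cloud into the next camera poses, let the video diffusion model turn the coarse renders into clean frames, and lift those frames back into the point cloud before moving on. The function and parameter names (`render_point_cloud`, `refine_with_diffusion`, `back_project`, `chunk_size`) are assumed interfaces for illustration, not ViewCrafter's actual API.

```python
from typing import Callable, Sequence

def iterative_view_synthesis(
    point_cloud,                      # current (possibly incomplete) scene point cloud
    camera_poses: Sequence,           # target camera poses along the desired trajectory
    render_point_cloud: Callable,     # (point_cloud, poses) -> coarse renders (assumed interface)
    refine_with_diffusion: Callable,  # (coarse renders) -> high-fidelity frames (assumed interface)
    back_project: Callable,           # (frames, poses, point_cloud) -> updated cloud (assumed interface)
    chunk_size: int = 25,             # poses handled per iteration (hypothetical value)
):
    """Alternate between moving the camera, generating novel views,
    and updating the point cloud, as described above."""
    all_frames = []
    for start in range(0, len(camera_poses), chunk_size):
        poses = camera_poses[start:start + chunk_size]

        # 1. Move the camera: render the current point cloud into the next poses.
        coarse = render_point_cloud(point_cloud, poses)

        # 2. Generate novel views: the video diffusion model refines the coarse renders.
        frames = refine_with_diffusion(coarse)

        # 3. Update the point cloud: lift the generated frames back into 3D.
        point_cloud = back_project(frames, poses, point_cloud)

        all_frames.extend(frames)
    return all_frames, point_cloud
```

Each pass extends how far the camera can travel from the reference view, which is what the long-range synthesis and progressive point cloud completion above rely on.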
Immersive experiences with real-time rendering
Scene-level text-to-3D generation
More imaginative content creation
High-fidelity novel view synthesis for generic scenes
Point cloud rendering and completion
Input a single reference image or sparse image sets
Build a point cloud representation using a dense stereo model (the back-projection geometry is sketched after this list)
Train a point-conditioned video diffusion model to enhance the coarse point cloud renders into high-fidelity frames (one possible conditioning pattern is sketched after this list)
Adopt an iterative view synthesis strategy for long-range novel view synthesis
Use the completed dense point cloud to initialize 3D-GS and employ the synthesized novel views to supervise 3D-GS training (a minimal training-loop sketch follows)
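As a concrete illustration of the point cloud construction and rendering steps, the snippet below shows the standard pinhole-camera geometry for lifting a predicted depth map into a world-space point cloud and projecting it into a target camera. It is a self-contained toy example with placeholder intrinsics, depth, and poses, not code from the project; a real pipeline would take the depth maps and poses from the dense stereo model.

```python
import numpy as np

def backproject_depth(depth: np.ndarray, K: np.ndarray, cam_to_world: np.ndarray) -> np.ndarray:
    """Lift a per-pixel depth map (H, W) into world-space 3D points (H*W, 3)."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)
    # Camera-space points: X_cam = depth * K^-1 [u, v, 1]^T
    cam_pts = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)
    # World-space points: apply the camera-to-world rigid transform.
    cam_pts_h = np.concatenate([cam_pts, np.ones((cam_pts.shape[0], 1))], axis=1)
    return (cam_to_world @ cam_pts_h.T).T[:, :3]

def project_points(points: np.ndarray, K: np.ndarray, world_to_cam: np.ndarray) -> np.ndarray:
    """Project world-space points into pixel coordinates of a target camera."""
    pts_h = np.concatenate([points, np.ones((points.shape[0], 1))], axis=1)
    cam_pts = (world_to_cam @ pts_h.T).T[:, :3]
    pix = (K @ cam_pts.T).T
    return pix[:, :2] / pix[:, 2:3]   # perspective division

# Toy usage with placeholder values (not real data).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
depth = np.full((480, 640), 2.0)   # pretend the stereo model predicted 2 m everywhere
points = backproject_depth(depth, K, np.eye(4))
pixels = project_points(points, K, np.eye(4))
```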
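For the point-conditioned diffusion step, one common conditioning pattern is to concatenate the (encoded) point cloud renders with the noisy video latents along the channel dimension, so every denoising step sees the coarse 3D clues. The toy module below only illustrates that wiring; the layer sizes, shapes, and the `PointConditionedDenoiser` class are made up for the example and are not ViewCrafter's actual architecture.

```python
import torch
import torch.nn as nn

class PointConditionedDenoiser(nn.Module):
    """Toy denoiser: noisy video latents are concatenated channel-wise with
    latents of the coarse point cloud renders, so denoising is guided by the
    explicit 3D clues (illustrative only, not the real model)."""
    def __init__(self, latent_channels: int = 4, hidden: int = 64):
        super().__init__()
        # Input = noisy latents + point-render latents, hence 2 * latent_channels.
        self.net = nn.Sequential(
            nn.Conv3d(2 * latent_channels, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv3d(hidden, latent_channels, kernel_size=3, padding=1),
        )

    def forward(self, noisy_latents: torch.Tensor, point_render_latents: torch.Tensor) -> torch.Tensor:
        # (B, C, T, H, W) video latents; condition on the point cloud renders
        # by channel-wise concatenation.
        x = torch.cat([noisy_latents, point_render_latents], dim=1)
        return self.net(x)  # predicted noise / denoised latents

# Placeholder tensors standing in for encoded frames and point cloud renders.
noisy = torch.randn(1, 4, 8, 32, 32)
point_cond = torch.randn(1, 4, 8, 32, 32)
pred = PointConditionedDenoiser()(noisy, point_cond)
```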
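Finally, the last step amounts to a fairly standard supervision loop: initialize Gaussian attributes from the completed dense point cloud and optimize them against the diffusion-generated views with a photometric loss. The sketch below assumes a differentiable 3D-GS rasterizer is available behind the `render_gaussians` callable and simplifies the parameter set; it is an outline of the idea, not the project's training code.

```python
import torch
import torch.nn.functional as F

def fit_3dgs(point_cloud_xyz, point_cloud_rgb, generated_views, view_poses,
             render_gaussians, num_iters=3000, lr=1e-3):
    """Optimize a simplified 3D-GS model: positions/colors start from the dense
    point cloud, and the synthesized novel views act as ground truth."""
    # Initialize learnable Gaussian attributes from the completed point cloud.
    xyz = torch.nn.Parameter(point_cloud_xyz.clone())                     # (N, 3) centers
    rgb = torch.nn.Parameter(point_cloud_rgb.clone())                     # (N, 3) colors
    log_scale = torch.nn.Parameter(torch.zeros_like(point_cloud_xyz))     # (N, 3) anisotropic scales
    opacity = torch.nn.Parameter(torch.zeros(point_cloud_xyz.shape[0]))   # (N,) opacity logits

    optimizer = torch.optim.Adam([xyz, rgb, log_scale, opacity], lr=lr)

    for it in range(num_iters):
        # Pick one synthesized view (and its camera pose) as the supervision target.
        idx = it % len(generated_views)
        target, pose = generated_views[idx], view_poses[idx]

        # Differentiable rasterization of the Gaussians into the target camera
        # (render_gaussians is an assumed interface for a 3D-GS renderer).
        pred = render_gaussians(xyz, rgb, log_scale.exp(), torch.sigmoid(opacity), pose)

        # Photometric supervision against the generated novel view.
        loss = F.l1_loss(pred, target)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    return xyz.detach(), rgb.detach(), log_scale.detach(), opacity.detach()
```

A full 3D-GS pipeline also uses an SSIM term and adaptive density control (splitting and pruning Gaussians), which this sketch omits for brevity.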