Particle-based representations of radiance fields, such as 3D Gaussian Splatting, have found great success in reconstructing and re-rendering complex scenes. Most existing methods render particles via rasterization, projecting them to screen-space tiles for processing in a sorted order. This work instead considers ray tracing the particles, building a bounding volume hierarchy and casting a ray for each pixel using high-performance GPU ray tracing hardware. To efficiently handle large numbers of semi-transparent particles, we describe a specialized rendering algorithm that encapsulates particles in bounding meshes to leverage fast ray-triangle intersections, and shades batches of intersections in depth order. The benefits of ray tracing are well known in computer graphics: processing incoherent rays for secondary lighting effects such as shadows and reflections, rendering from the highly distorted cameras common in robotics, stochastically sampling rays, and more. With our renderer, this flexibility comes at little cost compared to rasterization. Experiments demonstrate the speed and accuracy of our approach, as well as several applications in computer graphics and vision. We further propose related improvements to the basic Gaussian representation, including a simple use of generalized kernel functions that significantly reduces particle hit counts.
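To make the depth-ordered, batched shading concrete, here is a minimal Python sketch, assuming simple isotropic Gaussian particles. It is an illustration only, not our CUDA/OptiX implementation: a brute-force global sort stands in for BVH traversal, and the dictionary-based particle layout is hypothetical.

```python
import numpy as np

K = 16        # hits shaded per pass (batch size)
T_MIN = 1e-3  # stop once transmittance falls below this threshold

def gaussian_alpha(p, origin, direction):
    """Opacity of an isotropic Gaussian particle along a ray, evaluated
    at the point of maximum response (closest approach to the mean).
    `direction` is assumed to be unit length."""
    to_mean = p["mean"] - origin
    t_peak = np.dot(to_mean, direction)           # depth of peak response
    d2 = np.dot(to_mean, to_mean) - t_peak ** 2   # squared distance to ray
    return p["opacity"] * np.exp(-0.5 * d2 / p["scale"] ** 2), t_peak

def render_ray(origin, direction, particles):
    """Composite particle hits in depth order, K at a time. Brute-force
    stand-in for the BVH traversal of the real renderer."""
    hits = []
    for p in particles:
        alpha, t = gaussian_alpha(p, origin, direction)
        if t > 0.0 and alpha > 1e-4:
            hits.append((t, alpha, np.asarray(p["color"])))
    hits.sort(key=lambda h: h[0])                 # global sort replaces BVH
    radiance, transmittance = np.zeros(3), 1.0
    for batch_start in range(0, len(hits), K):    # shade one batch per pass
        for t, alpha, color in hits[batch_start:batch_start + K]:
            radiance += transmittance * alpha * color
            transmittance *= 1.0 - alpha
        if transmittance <= T_MIN:                # early termination per batch
            break
    return radiance

# Example: a red particle in front of a blue one along +z.
parts = [dict(mean=np.array([0.0, 0.0, 2.0]), scale=0.3, opacity=0.8,
              color=[1.0, 0.0, 0.0]),
         dict(mean=np.array([0.0, 0.0, 4.0]), scale=0.3, opacity=0.8,
              color=[0.0, 0.0, 1.0])]
print(render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), parts))
```

In the actual renderer, the global sort is replaced by BVH traversal that returns the next K closest hits per pass, so the early-termination check saves real traversal work.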
For the task of novel view synthesis, our method performs on par with or slightly better than 3D Gaussian Splatting and other state-of-the-art methods in terms of quality.
Real Scenes
As mentioned throughout our paper, unlike rasterization, ray tracing does not require perfect pinhole cameras for supervising scenes; it supports training directly with distorted camera models. This lets us eliminate the rectification step used in the preprocessing stage of 3D Gaussian Splatting, yielding higher-quality outputs by supervising on all available pixels rather than just the subset visible in the rectified cameras. Below, we compare training our model on two scenes from the ZipNeRF dataset: once using the undistorted (pinhole) views, and once using the original fisheye cameras. The results clearly show that training with the original distorted views produces higher-quality outputs.
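As a small illustration of supervising on distorted pixels directly, the sketch below generates per-pixel rays for an idealized equidistant fisheye model (r = f·θ). This model is an assumption for clarity; the actual scenes use calibrated OpenCV-style fisheye models with distortion coefficients, but any pixel-to-ray mapping drives training and rendering the same way.

```python
import numpy as np

def fisheye_ray(px, py, cx, cy, f):
    """Camera-space ray direction for pixel (px, py) under an
    equidistant fisheye model: pixel radius r maps to angle theta = r / f.
    (cx, cy) is the principal point, f the focal length in pixels."""
    dx, dy = px - cx, py - cy
    r = np.hypot(dx, dy)
    if r < 1e-8:                       # principal point looks straight ahead
        return np.array([0.0, 0.0, 1.0])
    theta = r / f                      # angle from the optical axis
    s = np.sin(theta) / r
    return np.array([dx * s, dy * s, np.cos(theta)])  # unit length by construction

# Example: corner pixel of a 1000x1000 image with f = 300 pixels.
print(fisheye_ray(999.0, 999.0, 500.0, 500.0, 300.0))
```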
Efficient ray tracing enables many advanced techniques in computer graphics and vision, such as secondary ray effects like mirrors, refractions, and shadows. It also supports highly distorted cameras with rolling shutter effects and even allows for stochastic sampling of rays.
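As one example of a secondary-ray effect, the sketch below casts a shadow ray from a primary hit toward a point light and attenuates Lambertian shading by the particle transmittance along that segment. The `transmittance_fn` callback is a placeholder for a transmittance-only variant of the ray tracer, not an API from our implementation.

```python
import numpy as np

def shade_with_shadow(hit_point, normal, light_pos, base_color, transmittance_fn):
    """Lambertian shading with a shadow ray toward a point light.
    `transmittance_fn(origin, direction, t_max)` should return the
    accumulated particle transmittance along the segment (supplied by
    the caller, e.g. a stripped-down primary-ray tracer)."""
    to_light = light_pos - hit_point
    dist = np.linalg.norm(to_light)
    wi = to_light / dist
    origin = hit_point + 1e-4 * normal        # offset to avoid self-intersection
    visibility = transmittance_fn(origin, wi, dist)
    return base_color * max(np.dot(normal, wi), 0.0) * visibility

# Example with a dummy, fully transparent scene (visibility = 1 everywhere).
c = shade_with_shadow(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                      np.array([0.0, 0.0, 2.0]), np.array([0.8, 0.2, 0.2]),
                      lambda o, d, t: 1.0)
print(c)
```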
Compared to rasterization-based approaches, our ray-tracing-based method naturally supports complex camera models, such as distorted fisheye lenses. Scenes captured with such lenses can then be re-rendered with different camera models, like regular perspective cameras, at high reconstruction quality relative to unseen references. Ray tracing also naturally compensates for time-dependent effects like rolling shutter distortions caused by sensor motion. This effect is shown in the example below of a single solid box rendered by a left-and-right-panning rolling shutter camera with a top-to-bottom shutter direction. By incorporating time-dependent per-pixel poses in the reconstruction, our method accurately recovers the true undistorted geometry; a sketch of the per-row pose interpolation follows the figure.
Ground Truth
Ours
Ground Truth (rolling shutter affected)
Ours (rolling shutter compensated)
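The sketch below illustrates one simple way to realize such time-dependent per-pixel (here, per-row) poses for a top-to-bottom shutter: interpolating between start- and end-of-frame poses. The SLERP/lerp scheme is an assumption for illustration, not necessarily the exact interpolation used in our experiments.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def row_pose(row, n_rows, rot0, rot1, pos0, pos1):
    """Camera pose for a given image row under a top-to-bottom rolling
    shutter: interpolate between the start-of-frame pose (rot0, pos0)
    and the end-of-frame pose (rot1, pos1)."""
    s = row / max(n_rows - 1, 1)                       # shutter time in [0, 1]
    slerp = Slerp([0.0, 1.0], Rotation.concatenate([rot0, rot1]))
    rotation = slerp(s)                                # SLERP of orientation
    position = (1.0 - s) * pos0 + s * pos1             # lerp of position
    return rotation, position

# Example: camera pans 10 degrees about the y-axis during one frame.
r0 = Rotation.identity()
r1 = Rotation.from_euler("y", 10, degrees=True)
rot, pos = row_pose(540, 1080, r0, r1, np.zeros(3), np.array([0.1, 0.0, 0.0]))
print(rot.as_euler("xyz", degrees=True), pos)
```

Rays for each row are then generated from that row's interpolated pose, so the optimization sees the same time-varying projection that produced the training images.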
Real-world AV and robotics applications often need to account for distorted intrinsic camera models and time-dependent effects like rolling shutter distortions caused by fast sensor motion. Our ray-tracing-based reconstruction is well suited to handle both challenges simultaneously.
Left Camera
Middle Camera
Right Camera
@article{3dgrt2024,
author = {Nicolas Moenne-Loccoz and Ashkan Mirzaei and Or Perel and Riccardo de Lutio and Janick Martinez Esturo and Gavriel State and Sanja Fidler and Nicholas Sharp and Zan Gojcic},
title = {3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes},
journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia)},
year = {2024},
}