We all know that a traditional camera captures a 3D scene as a 2D image. What about the reverse? Is there a way to convert 2D images into a realistic 3D scene? AI engineers with NVIDIA Research are working on inverse rendering, a process that uses artificial intelligence to approximate how light behaves and then reconstruct a 3D scene from a ‘handful of 2D images taken at different angles.’ The NVIDIA Research team says that it has developed an approach that performs this task almost instantly. It’s one of the first models of its kind to combine fast neural network training with rapid rendering.
NVIDIA has applied this approach to neural radiance fields, also known as NeRF. NVIDIA says that its new approach, called Instant NeRF, is the fastest NeRF technique so far. In some cases, it’s around 1,000 times faster than other methods. The model can train on a few dozen still photos in ‘minutes,’ and Instant NeRF can render a resulting 3D scene in ‘tens of milliseconds.’
NeRFs use neural networks to represent and render 3D scenes from 2D image inputs. For example, suppose you want to photograph a person from every angle. In practice, you capture a few dozen angles, which of course doesn’t cover every possible view of the subject. From this collection of 2D images, a NeRF fills in the blanks by training a neural network to reconstruct the overall scene in 3D. The trained network predicts the color of light radiating in any direction from any point in 3D space.
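To make the idea of predicting light along a viewing direction more concrete, here is a minimal sketch of how per-point predictions could be combined into one pixel. It assumes the compositing rule commonly associated with the original NeRF formulation, where each sample along a camera ray has a density and a color. The article itself does not describe this step, so the function name `composite_ray` and its parameters are hypothetical, and this is not presented as NVIDIA’s implementation.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one camera ray, front to back.

    densities: non-negative density at each sample (assumed network output)
    colors:    (r, g, b) tuple at each sample (assumed network output)
    deltas:    distance between consecutive samples along the ray
    Returns the pixel color and the light fraction that passed through.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity contributed by this short segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * color[k]
        # Whatever this segment absorbed cannot reach later samples.
        transmittance *= 1.0 - alpha
    return pixel, transmittance


if __name__ == "__main__":
    # Hypothetical ray: empty space first, then a semi-opaque red region.
    print(composite_ray([0.0, 2.0, 2.0],
                        [(0, 0, 0), (1, 0, 0), (1, 0, 0)],
                        [0.5, 0.5, 0.5]))
```

In this sketch, samples in empty space contribute nothing, and a sufficiently dense sample hides everything behind it, which is why a model trained this way can reproduce occlusion from views it never saw.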
What makes Instant NeRF different? NVIDIA writes, ‘While estimating the depth and appearance of an object based on a partial view is a natural skill for humans, it’s a demanding task for AI.’ Because the task is so demanding, early NeRF models took hours to train. Instant NeRF cuts rendering time by ‘several orders of magnitude’ using a technique developed by NVIDIA called multi-resolution hash grid encoding, which is optimized to run efficiently on NVIDIA GPUs.
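For readers curious what a multi-resolution hash grid encoding might look like, the following is a simplified, CPU-only sketch based on the publicly described idea: a 3D point is looked up in several grids of increasing resolution, each grid corner is hashed into a small table of feature values, and the interpolated features from all levels are concatenated. The function names, table sizes and the choice to hash every level are assumptions made for illustration. This is not NVIDIA’s GPU implementation.

```python
import math
import random

# Per-dimension multipliers for the spatial hash (the first is 1).
PRIMES = (1, 2654435761, 805459861)


def hash_index(ix, iy, iz, table_size):
    """Map an integer grid corner to a slot in [0, table_size).

    Python integers do not wrap at 32 bits, so slots may differ from a
    uint32 implementation; the structure of the hash is the same.
    """
    h = (ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])
    return h % table_size


def level_resolutions(n_levels, n_min, n_max):
    """Grid resolutions spaced geometrically between n_min and n_max."""
    if n_levels == 1:
        return [n_min]
    growth = math.exp((math.log(n_max) - math.log(n_min)) / (n_levels - 1))
    return [int(math.floor(n_min * growth ** lvl)) for lvl in range(n_levels)]


def make_tables(n_levels, table_size, n_features, seed=0):
    """One table of small random feature vectors per level (untrained)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1e-4, 1e-4) for _ in range(n_features)]
             for _ in range(table_size)]
            for _ in range(n_levels)]


def encode(point, tables, resolutions):
    """Encode a point in [0, 1]^3 as concatenated per-level features."""
    features = []
    for table, res in zip(tables, resolutions):
        scaled = [c * res for c in point]
        base = [int(math.floor(s)) for s in scaled]
        frac = [s - b for s, b in zip(scaled, base)]
        level_out = [0.0] * len(table[0])
        # Trilinear interpolation over the 8 corners of the enclosing cell.
        for dx in (0, 1):
            for dy in (0, 1):
                for dz in (0, 1):
                    w = ((frac[0] if dx else 1.0 - frac[0]) *
                         (frac[1] if dy else 1.0 - frac[1]) *
                         (frac[2] if dz else 1.0 - frac[2]))
                    slot = hash_index(base[0] + dx, base[1] + dy,
                                      base[2] + dz, len(table))
                    for k, value in enumerate(table[slot]):
                        level_out[k] += w * value
        features.extend(level_out)
    return features


if __name__ == "__main__":
    res = level_resolutions(n_levels=4, n_min=16, n_max=512)
    tables = make_tables(n_levels=4, table_size=1024, n_features=2)
    print(res, len(encode((0.3, 0.6, 0.9), tables, res)))
```

In a full system, the table entries would be trained together with a small neural network that consumes the encoded features. Because each lookup touches only a handful of table slots, this kind of encoding is a plausible reason the approach maps well onto GPUs, though the sketch above makes no performance claims.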
‘If traditional 3D representations like polygonal meshes are akin to vector images, NeRFs are like bitmap images: they densely capture the way light radiates from an object or within a scene,’ says David Luebke, vice president for graphics research at NVIDIA. ‘In that sense, Instant NeRF could be as important to 3D as digital cameras and JPEG compression have been to 2D photography — vastly increasing the speed, ease and reach of 3D capture and sharing.’
NVIDIA says that Instant NeRF ‘could be used to create avatars or scenes for virtual worlds, to capture video conference participants and their environments in 3D, or to reconstruct scenes for 3D digital maps.’ The Instant NeRF technology could also be used to train robots and self-driving cars to better understand surrounding real-world objects.
NVIDIA showcased Instant NeRF during its GTC 2022 keynote. If you’d like to watch NVIDIA CEO Jensen Huang’s entire keynote address, check it out above.