Carnegie Mellon University
April 22, 2025

3D Scene Reconstruction Without Camera Motion

By Ashlyn Lacovara

Researchers at Carnegie Mellon University and Fujitsu Research have released a new approach that dramatically improves 3D scene reconstruction from 2D images, even in highly dynamic environments with minimal camera movement. Instead of relying on large datasets and learned priors to infer how things move, the team adapted a classic tracking method, Lucas-Kanade, to dynamic Gaussian Splatting, letting them track motion and scene changes analytically by computing how every point in the scene moves.
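
For readers unfamiliar with the classic method, the minimal NumPy sketch below shows the core Lucas-Kanade idea: assume the motion is constant within a small window around a pixel and solve a least-squares system built from image gradients. The function name, window handling, and single-level (non-pyramidal) formulation are illustrative choices, not the team's implementation.

```python
import numpy as np

def lucas_kanade_flow(I0, I1, y, x, win=7):
    """Estimate the 2D motion (u, v) of the pixel at (y, x) between grayscale
    frames I0 and I1 using the classic Lucas-Kanade assumption: the flow is
    constant inside a small window, so the brightness-constancy equations of
    all window pixels form an overdetermined linear system."""
    Iy, Ix = np.gradient(I0.astype(float))     # spatial gradients (rows, cols)
    It = I1.astype(float) - I0.astype(float)   # temporal gradient

    r = win // 2
    sl = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)  # N x 2 system
    b = -It[sl].ravel()

    # Least-squares solution of A @ [u, v] = b.
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

Real trackers add image pyramids and iterative refinement; the point here is only the least-squares structure that the researchers carry over to the 3D setting.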

The researchers built on a foundational concept from the Lucas-Kanade method, which estimates how pixels move between video frames—a process known as optical flow. They extended this idea to the warp field used in Gaussian Splatting, a mathematical function that determines how individual Gaussians move through 3D space. By analyzing this warp field, they derived a velocity field that describes both the direction and speed of each point in the scene. Through time integration, they were able to track these points across multiple frames, resulting in a precise estimate of scene flow—the 3D equivalent of optical flow. This approach enables accurate modeling of motion in dynamic scenes, even when the camera itself doesn’t move significantly.
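
As a rough illustration of that pipeline (a sketch under assumed interfaces, not the paper's code), suppose the warp field is available as a black-box function `warp(points, t)` that maps canonical 3D points to their positions at time t. The velocity field is then the time derivative of the warp, and integrating it from one frame time to the next yields each point's 3D displacement, i.e., an estimate of scene flow. The finite-difference derivative and forward-Euler integrator below are stand-ins for the paper's analytical derivation.

```python
import numpy as np

def velocity_field(warp, points, t, dt=1e-3):
    """Velocity v(x, t) = dW/dt, approximated here by a finite difference.
    `warp(points, t)` is a hypothetical interface for the warp field that
    moves Gaussians through 3D space."""
    return (warp(points, t + dt) - warp(points, t)) / dt

def scene_flow(warp, points, t0, t1, steps=100):
    """Integrate the velocity field from t0 to t1 (forward Euler) to obtain
    each point's 3D displacement, the scene-flow estimate."""
    h = (t1 - t0) / steps
    disp = np.zeros_like(points, dtype=float)
    for k in range(steps):
        disp += h * velocity_field(warp, points, t0 + k * h)
    return disp
```

As a sanity check, for a rigid-translation warp such as `lambda p, t: p + t * np.array([1.0, 0.0, 0.0])`, the sketch recovers a displacement of (t1 - t0) along x for every point, matching W(x, t1) - W(x, t0) as expected.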

This approach is exciting because it overcomes two of the biggest limitations in 3D scene reconstruction: the need for a moving camera and the reliance on large, data-driven models trained on specific types of scenes. Traditional methods depend heavily on motion parallax, the effect where objects closer to the camera appear to move faster than those farther away, which helps systems estimate depth as the camera moves. When the camera stays relatively still, those depth cues disappear, making reconstruction much harder. Instead of depending on motion parallax or learned priors, this method analytically computes how objects move in 3D space, enabling it to handle highly dynamic environments where people, vehicles, or other objects are in motion. The results show strong performance across both synthetic and real-world datasets, highlighting the method's versatility and its potential for applications like robotics, AR/VR, and video analysis.
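
To make the parallax point concrete, here is a toy pinhole-camera calculation (the numbers and the helper name are illustrative): under a sideways camera translation, a point at depth Z shifts in the image by roughly focal_length * baseline / Z pixels, so nearby objects move much more than distant ones, and the cue collapses to zero when the camera barely moves.

```python
import numpy as np

def parallax_shift_px(depth_m, baseline_m=0.1, focal_px=800.0):
    """Approximate image shift (in pixels) of a point at depth `depth_m`
    under a sideways camera translation of `baseline_m`, for a pinhole
    camera: shift ~ focal * baseline / depth."""
    return focal_px * baseline_m / np.asarray(depth_m, dtype=float)

print(parallax_shift_px([2.0, 10.0, 50.0]))                  # [40.   8.   1.6]
print(parallax_shift_px([2.0, 10.0, 50.0], baseline_m=0.0))  # all zeros: no depth cue
```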

[Animation: gslk_results.gif — qualitative reconstruction results]

The significance of this work lies not only in its technical innovation but also in its potential to expand 3D scene understanding in real-world conditions, where camera motion is limited and environments are dynamic. By removing the dependence on learned priors in favor of a purely analytical approach, the method opens new possibilities for robotics, autonomous systems, and immersive technologies. The research will be formally presented at the International Conference on Learning Representations (ICLR) 2025, where the team will share their methodology, experimental results, and potential applications in dynamic scene reconstruction.