Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict scalability to arbitrary output resolutions and hinder the recovery of geometric detail. This paper introduces InfiniDepth, which represents depth as a neural implicit field. Through a simple yet effective local implicit decoder, depth can be queried at continuous 2D coordinates, enabling arbitrary-resolution and fine-grained depth estimation. To better assess the method's capabilities, we curate a high-quality 4K synthetic benchmark from five different games, spanning diverse scenes with rich geometric and appearance details. Experiments demonstrate that InfiniDepth achieves state-of-the-art performance on both synthetic and real-world benchmarks across relative and metric depth estimation tasks, particularly excelling in fine-detail regions. It also benefits novel view synthesis under large viewpoint shifts, producing high-quality results with fewer holes and artifacts.
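To make the idea of querying depth at continuous 2D coordinates concrete, here is a minimal sketch in plain Python. It is not the InfiniDepth decoder: the feature grid, the bilinear interpolation, and the single linear layer with a softplus output are all stand-in assumptions, and the names `bilinear_sample`, `query_depth`, and `render_depth` are hypothetical. It only illustrates how a coordinate-conditioned decoder decouples output resolution from the resolution of the feature grid.

```python
import math


def bilinear_sample(feat, x, y):
    """Interpolate feat[row][col] (a list of channels) at continuous pixel coords (x, y)."""
    h, w = len(feat), len(feat[0])
    # Clamp so that queries on or beyond the border stay defined.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = x - x0, y - y0
    out = []
    for c in range(len(feat[0][0])):
        top = (1 - tx) * feat[y0][x0][c] + tx * feat[y0][x1][c]
        bottom = (1 - tx) * feat[y1][x0][c] + tx * feat[y1][x1][c]
        out.append((1 - ty) * top + ty * bottom)
    return out


def query_depth(feat, u, v, weights, bias):
    """Decode depth at normalized coords (u, v) in [0, 1] x [0, 1].

    Assumption: a single linear layer stands in for the learned decoder.
    """
    h, w = len(feat), len(feat[0])
    local = bilinear_sample(feat, u * (w - 1), v * (h - 1))
    z = sum(wi * fi for wi, fi in zip(weights, local)) + bias
    # Numerically stable softplus keeps the predicted depth positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def render_depth(feat, out_h, out_w, weights, bias):
    """Query a dense out_h x out_w depth map from the same feature grid."""
    def norm(i, n):
        return i / (n - 1) if n > 1 else 0.5
    return [
        [query_depth(feat, norm(j, out_w), norm(i, out_h), weights, bias)
         for j in range(out_w)]
        for i in range(out_h)
    ]


# A 2x2 single-channel feature grid queried at a 3x5 output resolution.
features = [[[0.0], [1.0]], [[2.0], [3.0]]]
depth_map = render_depth(features, 3, 5, weights=[1.0], bias=0.0)
```

Because the output size is only a property of the query grid, the same `features` can be rendered at any resolution by changing `out_h` and `out_w`.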
Pipeline of InfiniDepth:
Interactive Depth Map Visualization: Hover over the RGB image (left) to explore details in the 8K-resolution depth map (right). Use the mouse scroll wheel to zoom in or out on a specific patch.
Point Cloud Visualization: Point clouds from predicted depth maps. Use the mouse to zoom and rotate and use "Alt + mouse" to pan.
Novel View Synthesis (NVS): Single-view NVS results. The rendered video demonstrates the view synthesis quality under large viewpoint shifts.
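A common way to turn a predicted depth map into a point cloud is to back-project each pixel through a pinhole camera model. The sketch below assumes that model and assumes metric depth along the camera z-axis. The function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative and are not taken from the InfiniDepth release.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (list of rows) into camera-space 3D points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. Non-positive depths are skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # treat non-positive depth as invalid
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel.
toy_depth = [[2.0, 2.0], [0.0, 4.0]]
cloud = depth_to_points(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Holes in views rendered from such a point cloud typically appear where neighboring pixels back-project far apart, so a denser depth map yields denser points.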
We thank Yuanhong Yu, Gangwei Xu and Haoyu Guo for their insightful discussions and valuable suggestions, and Zhen Xu for his dedicated efforts in curating the synthetic data.
@article{yu2026infinidepth,
title={InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields},
author={Hao Yu and Haotong Lin and Jiawei Wang and Jiaxin Li and Yida Wang and Xueyang Zhang and Yue Wang and Xiaowei Zhou and Ruizhen Hu and Sida Peng},
journal={arXiv preprint},
year={2026}
}