Conference Paper: Grid-guided Neural Radiance Fields for Large Urban Scenes

Title: Grid-guided Neural Radiance Fields for Large Urban Scenes
Authors: Xu, Linning; Xiangli, Yuanbo; Peng, Sida; Pan, Xingang; Zhao, Nanxuan; Theobalt, Christian; Dai, Bo; Lin, Dahua
Keywords: 3D from multi-view and sensors
Issue Date: 2023
Citation: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2023, v. 2023-June, p. 8296-8306
Abstract: Purely MLP-based neural radiance fields (NeRF) often suffer from underfitting, producing blurred renderings on large-scale scenes due to limited model capacity. Recent approaches propose to geographically divide the scene and adopt multiple sub-NeRFs to model each region individually, leading to a linear scale-up in training cost and in the number of sub-NeRFs as the scene expands. An alternative is a feature grid representation, which is computationally efficient and can naturally scale to large scenes with increased grid resolutions. However, the feature grid tends to be less constrained and often reaches suboptimal solutions, producing noisy artifacts in renderings, especially in regions with complex geometry and texture. In this work, we present a new framework that realizes high-fidelity rendering on large urban scenes while remaining computationally efficient. We propose a compact multi-resolution ground feature plane representation to coarsely capture the scene, complemented by positional encoding inputs through a separate NeRF branch, with the two learned jointly. We show that this integration combines the advantages of the two alternatives: under the guidance of the feature grid representation, a lightweight NeRF suffices to render photorealistic novel views with fine details; meanwhile, the jointly optimized ground feature planes gain further refinement, forming a more accurate and compact feature space and producing much more natural rendering results. (An illustrative code sketch of this design follows the record below.)
Persistent Identifier: http://hdl.handle.net/10722/352386
ISSN: 1063-6919
2023 SCImago Journal Rankings: 10.331
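As a rough illustration of the design described in the abstract, the sketch below (PyTorch) queries multi-resolution 2D ground feature planes at each 3D sample's (x, y) footprint, concatenates the result with a standard sinusoidal positional encoding of the full coordinate, and feeds a lightweight MLP that predicts color and density. This is a minimal sketch only: the class names, plane resolutions, feature dimensions, and layer widths are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a grid-guided NeRF: multi-resolution ground feature
# planes + positional-encoded coordinates -> lightweight MLP.
# All hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def positional_encoding(x, num_freqs=6):
    # Standard NeRF-style sinusoidal encoding of coordinates in [-1, 1].
    freqs = 2.0 ** torch.arange(num_freqs, device=x.device)
    angles = x[..., None] * freqs                        # (N, 3, num_freqs)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.flatten(-2)                               # (N, 3 * 2 * num_freqs)

class GridGuidedNeRF(nn.Module):
    def __init__(self, plane_resolutions=(64, 128, 256), plane_dim=8,
                 pe_freqs=6, hidden=128):
        super().__init__()
        # One learnable 2D feature plane per resolution, spanning the
        # ground (x, y) extent of the scene.
        self.planes = nn.ParameterList(
            nn.Parameter(0.1 * torch.randn(1, plane_dim, r, r))
            for r in plane_resolutions
        )
        in_dim = plane_dim * len(plane_resolutions) + 3 * 2 * pe_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                        # RGB + density
        )
        self.pe_freqs = pe_freqs

    def forward(self, xyz):
        # xyz: (N, 3) sample points, normalized to [-1, 1] per axis.
        # Bilinearly sample every ground plane at the (x, y) footprint.
        grid = xyz[None, None, :, :2]                    # (1, 1, N, 2)
        plane_feats = [
            F.grid_sample(p, grid, align_corners=True)   # (1, C, 1, N)
             .squeeze(0).squeeze(1).t()                  # (N, C)
            for p in self.planes
        ]
        feats = torch.cat(
            plane_feats + [positional_encoding(xyz, self.pe_freqs)], dim=-1)
        out = self.mlp(feats)
        rgb = torch.sigmoid(out[:, :3])
        sigma = F.relu(out[:, 3])
        return rgb, sigma

# Usage: query 1024 random sample points.
model = GridGuidedNeRF()
rgb, sigma = model(torch.rand(1024, 3) * 2 - 1)
print(rgb.shape, sigma.shape)  # torch.Size([1024, 3]) torch.Size([1024])
```

Because the planes and the MLP share one loss, the grid features and the NeRF branch are optimized jointly, which is the coupling the abstract attributes the refinement of the feature space to.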

 

DC Field | Value
dc.contributor.author | Xu, Linning
dc.contributor.author | Xiangli, Yuanbo
dc.contributor.author | Peng, Sida
dc.contributor.author | Pan, Xingang
dc.contributor.author | Zhao, Nanxuan
dc.contributor.author | Theobalt, Christian
dc.contributor.author | Dai, Bo
dc.contributor.author | Lin, Dahua
dc.date.accessioned | 2024-12-16T03:58:37Z
dc.date.available | 2024-12-16T03:58:37Z
dc.date.issued | 2023
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2023, v. 2023-June, p. 8296-8306
dc.identifier.issn | 1063-6919
dc.identifier.uri | http://hdl.handle.net/10722/352386
dc.description.abstract | Purely MLP-based neural radiance fields (NeRF) often suffer from underfitting, producing blurred renderings on large-scale scenes due to limited model capacity. Recent approaches propose to geographically divide the scene and adopt multiple sub-NeRFs to model each region individually, leading to a linear scale-up in training cost and in the number of sub-NeRFs as the scene expands. An alternative is a feature grid representation, which is computationally efficient and can naturally scale to large scenes with increased grid resolutions. However, the feature grid tends to be less constrained and often reaches suboptimal solutions, producing noisy artifacts in renderings, especially in regions with complex geometry and texture. In this work, we present a new framework that realizes high-fidelity rendering on large urban scenes while remaining computationally efficient. We propose a compact multi-resolution ground feature plane representation to coarsely capture the scene, complemented by positional encoding inputs through a separate NeRF branch, with the two learned jointly. We show that this integration combines the advantages of the two alternatives: under the guidance of the feature grid representation, a lightweight NeRF suffices to render photorealistic novel views with fine details; meanwhile, the jointly optimized ground feature planes gain further refinement, forming a more accurate and compact feature space and producing much more natural rendering results.
dc.language | eng
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
dc.subject | 3D from multi-view and sensors
dc.title | Grid-guided Neural Radiance Fields for Large Urban Scenes
dc.type | Conference_Paper
dc.description.nature | link_to_subscribed_fulltext
dc.identifier.doi | 10.1109/CVPR52729.2023.00802
dc.identifier.scopus | eid_2-s2.0-85173966963
dc.identifier.volume | 2023-June
dc.identifier.spage | 8296
dc.identifier.epage | 8306
