MetroGS: Efficient and Stable Reconstruction of Geometrically
Accurate High-Fidelity Large-Scale Scenes

Kehua Chen1,2, Tianlu Mao1,2, Zhuxin Ma3, Hao Jiang1,2†, Zehao Li1,2, Zihan Liu1,2,
Shuqi Gao1, Honglong Zhao1, Feng Dai1, Yucheng Zhang1, Zhaoqi Wang1,2
1Institute of Computing Technology, Chinese Academy of Sciences (ICT)
2University of Chinese Academy of Sciences (UCAS)   3Beihang University

Abstract

Recently, 3D Gaussian Splatting and its derivatives have achieved significant breakthroughs in large-scale scene reconstruction. However, achieving high geometric fidelity both efficiently and stably remains a core challenge. To address this issue, we introduce MetroGS, a novel Gaussian Splatting framework for efficient and robust reconstruction of complex urban environments. Our method is built upon a distributed 2D Gaussian Splatting representation, which serves as a unified backbone for the subsequent modules. To handle potentially sparse regions in complex scenes, we propose a structured dense enhancement scheme that exploits SfM priors and a pointmap model to obtain a denser initialization, together with a sparsity compensation mechanism that improves reconstruction completeness. Furthermore, we design a progressive hybrid geometric optimization strategy that organically integrates monocular and multi-view optimization for efficient and accurate geometric refinement. Finally, to address the appearance inconsistency commonly observed in large-scale scenes, we introduce a depth-guided appearance modeling approach that learns spatial features with 3D consistency, enabling effective decoupling of geometry and appearance and further enhancing reconstruction stability. Experiments on large-scale urban datasets demonstrate that MetroGS achieves superior geometric accuracy and rendering quality, offering a unified solution for high-fidelity large-scale scene reconstruction.
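To give an intuition for the progressive hybrid geometric optimization, the sketch below shows one plausible way to blend a monocular geometry loss with a multi-view consistency loss under a progressive schedule. All function names, the warmup fraction, and the schedule shape are our own illustrative assumptions, not MetroGS's actual implementation.

```python
def progressive_weights(step: int, total_steps: int, warmup_frac: float = 0.3):
    """Hypothetical schedule: lean on cheap monocular supervision early,
    then progressively shift weight onto multi-view geometric consistency."""
    t = min(step / (warmup_frac * total_steps), 1.0)
    w_multiview = t              # ramps 0 -> 1 during warmup, then stays at 1
    w_monocular = 1.0 - 0.5 * t  # decays 1 -> 0.5, keeping a stabilizing prior
    return w_monocular, w_multiview

def hybrid_geometry_loss(mono_loss: float, mv_loss: float,
                         step: int, total_steps: int) -> float:
    """Weighted combination of the two geometric terms at a given step."""
    w_mono, w_mv = progressive_weights(step, total_steps)
    return w_mono * mono_loss + w_mv * mv_loss

# Early training is dominated by the monocular term; late training
# weights the multi-view term fully while retaining the monocular prior.
early = hybrid_geometry_loss(mono_loss=1.0, mv_loss=1.0, step=0, total_steps=1000)
late = hybrid_geometry_loss(mono_loss=1.0, mv_loss=1.0, step=1000, total_steps=1000)
```

The key design idea this illustrates is that monocular cues give dense but noisy supervision from the start, while multi-view constraints become reliable only once the geometry is roughly correct, so their weight is ramped in gradually.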

Overview
Illustration of the superiority of our method. (a) Our method accurately reconstructs the geometric structure of large-scale urban scenes, faithfully restoring fine details such as buildings, vegetation, and roads. (b) Compared with the SOTA method CityGSV2, our results are more complete and geometrically precise. (c) Benefiting from a well-designed training framework, our method achieves superior convergence speed and geometric quality. On four RTX 3090 GPUs, our method reaches better performance in less than 25% of the training time required by CityGSV2.
Qualitative Analysis
GauU-Scene
Comparison with other advanced reconstruction methods on the GauU-Scene dataset. P and R denote Precision and Recall with respect to the ground-truth point cloud. "NaN" means no results due to a NaN error; "FAIL" means failure to extract a meaningful mesh. MetroGS achieves state-of-the-art performance across all metrics.
MatrixCity
Left: Overall comparison with CityGSV2 on the GauU-Scene dataset. Right: Comparison with other methods on the MatrixCity dataset. MetroGS demonstrates excellent training efficiency while achieving strong performance on synthetic scenes.
Rendering & Depth
Depth
Qualitative results on the GauU-Scene dataset. MetroGS produces more accurate and complete depth, with improved robustness to appearance inconsistencies and fewer artifacts.
Meshing
Mesh
Compared with other state-of-the-art approaches, MetroGS converges more rapidly and produces higher-quality mesh results overall.

BibTeX

@article{chen2025metrogs,
  title={MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes},
  author={Chen, Kehua and Mao, Tianlu and Ma, Zhuxin and Jiang, Hao and Li, Zehao and Liu, Zihan and Gao, Shuqi and Zhao, Honglong and Dai, Feng and Zhang, Yucheng and others},
  journal={arXiv preprint arXiv:2511.19172},
  year={2025}
}