MetroGS: Efficient and Stable Reconstruction of Geometrically
Accurate High-Fidelity Large-Scale Scenes

Kehua Chen1,2, Tianlu Mao1,2, Zhuxin Ma3, Hao Jiang1,2†, Zehao Li1,2, Zihan Liu1,2,
Shuqi Gao1, Honglong Zhao1, Feng Dai1, Yucheng Zhang1, Zhaoqi Wang1,2
1Institute of Computing Technology, Chinese Academy of Sciences (ICT)
2University of Chinese Academy of Sciences (UCAS)   3Beihang University

Abstract

Recently, 3D Gaussian Splatting and its derivatives have achieved significant breakthroughs in large-scale scene reconstruction. However, achieving high geometric fidelity both efficiently and stably remains a core challenge. To address this issue, we introduce MetroGS, a novel Gaussian Splatting framework for efficient and robust reconstruction of complex urban environments. Our method is built upon a distributed 2D Gaussian Splatting representation, which serves as a unified backbone for the subsequent modules. To handle potentially sparse regions in complex scenes, we propose a structured dense enhancement scheme that exploits SfM priors and a pointmap model to obtain a denser initialization, together with a sparsity compensation mechanism that improves reconstruction completeness. Furthermore, we design a progressive hybrid geometric optimization strategy that organically integrates monocular and multi-view optimization for efficient and accurate geometric refinement. Finally, to address the appearance inconsistency commonly observed in large-scale scenes, we introduce a depth-guided appearance modeling approach that learns spatial features with 3D consistency, enabling effective decoupling of geometry and appearance and further enhancing reconstruction stability. Experiments on large-scale urban datasets demonstrate that MetroGS achieves superior geometric accuracy and rendering quality, offering a unified solution for high-fidelity large-scale scene reconstruction.
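To give an intuition for the progressive hybrid geometric optimization, the sketch below shows one plausible way to blend a monocular geometry loss with a multi-view consistency loss under a progressive schedule. All function names, the warmup fraction, and the schedule shape are our own illustrative assumptions, not MetroGS's actual implementation.

```python
def progressive_weights(step: int, total_steps: int, warmup_frac: float = 0.3):
    """Hypothetical schedule: lean on cheap monocular supervision early,
    then progressively shift weight onto multi-view geometric consistency."""
    t = min(step / (warmup_frac * total_steps), 1.0)
    w_multiview = t              # ramps 0 -> 1 during warmup, then stays at 1
    w_monocular = 1.0 - 0.5 * t  # decays 1 -> 0.5, keeping a stabilizing prior
    return w_monocular, w_multiview

def hybrid_geometry_loss(mono_loss: float, mv_loss: float,
                         step: int, total_steps: int) -> float:
    """Weighted combination of the two geometric terms at a given step."""
    w_mono, w_mv = progressive_weights(step, total_steps)
    return w_mono * mono_loss + w_mv * mv_loss

# Early training is dominated by the monocular term; late training
# weights the multi-view term fully while retaining the monocular prior.
early = hybrid_geometry_loss(mono_loss=1.0, mv_loss=1.0, step=0, total_steps=1000)
late = hybrid_geometry_loss(mono_loss=1.0, mv_loss=1.0, step=1000, total_steps=1000)
```

The key design idea this illustrates is that monocular cues give dense but noisy supervision from the start, while multi-view constraints become reliable only once the geometry is roughly correct, so their weight is ramped in gradually.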

Overview
Illustration of the superiority of our method. (a) Our method accurately reconstructs the geometric structure of large-scale urban scenes, faithfully restoring fine details such as buildings, vegetation, and roads. (b) Compared with the SOTA method CityGSV2, our results are more complete and geometrically precise. (c) Benefiting from a well-designed training framework, our method achieves superior convergence speed and geometric quality. On four RTX 3090 GPUs, our method reaches better performance in less than 25% of the training time required by CityGSV2.
Qualitative Analysis
GauU-Scene
Comparison with other advanced reconstruction methods on the GauU-Scene dataset. P and R denote Precision and Recall with respect to the ground-truth point cloud. "NaN" means no results due to a NaN error; "FAIL" means failure to extract a meaningful mesh. MetroGS achieves state-of-the-art performance across all metrics.
MatrixCity
Left: Overall comparison with CityGSV2 on the GauU-Scene dataset. Right: Comparison with other methods on the MatrixCity dataset. MetroGS demonstrates excellent training efficiency while achieving strong performance on synthetic scenes.
Rendering & Depth
Depth
Qualitative results on the GauU-Scene dataset. MetroGS produces more accurate and complete depth, with improved robustness to appearance inconsistencies and fewer artifacts.
Meshing
Mesh
Compared with other state-of-the-art approaches, MetroGS converges more rapidly and produces higher-quality mesh results overall.

BibTeX

@article{chen2025metrogs,
  title={MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes},
  author={Chen, Kehua and Mao, Tianlu and Ma, Zhuxin and Jiang, Hao and Li, Zehao and Liu, Zihan and Gao, Shuqi and Zhao, Honglong and Dai, Feng and Zhang, Yucheng and others},
  journal={arXiv preprint arXiv:2511.19172},
  year={2025}
}