Chen Guanzhou


Main Positions: Associate Research Fellow
Gender: Male
Status: Employed
School/Department: State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
  • Discipline: Photogrammetry and Remote Sensing



    MSSDF: Modality-shared self-supervised distillation for high-resolution multi-modal remote sensing image learning


    Impact Factor: 15.5

    DOI Number: 10.1016/j.inffus.2025.104006

    Journal: Information Fusion

    Abstract: High-resolution multi-modal remote sensing (RS) images provide rich complementary information for Earth observation, yet the scarcity of high-quality annotated data remains a major obstacle to effective model training. To address this challenge, we propose a Modality-Shared Self-supervised Distillation Framework (MSSDF) that learns discriminative multi-modal representations with minimal reliance on labeled data. Specifically, MSSDF integrates information-aware and cross-modal masking strategies with multi-objective self-supervised learning, enabling the model to capture modality-shared semantics and compensate for missing or weakly labeled modalities. This design substantially reduces the dependence on large-scale annotations and enhances robustness under limited-label regimes. Extensive experiments on scene classification, semantic segmentation, and change detection tasks demonstrate that MSSDF consistently outperforms state-of-the-art methods, particularly when labeled data are scarce. In particular, on the Potsdam and Vaihingen semantic segmentation tasks, our method achieves mIoU scores of 78.30% and 76.50% using only 50% of the training set. For the US3D depth estimation task, the RMSE is reduced to 0.182, and for the binary change detection task on the SECOND dataset, our method achieves an mIoU of 47.51%, surpassing the second-best method by 3 percentage points. In addition, we construct a high-resolution multi-modal remote sensing image dataset named HR-Pairs, which contains 640,000 DOM (Digital Orthophoto Map)-DSM (Digital Surface Model) pairs with a spatial resolution of 0.05 m, providing a new high-quality dataset for multi-modal remote sensing research. Our pretraining code, checkpoints, and the HR-Pairs dataset are available at https://github.com/CVEO/MSSDF.
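    The abstract describes the cross-modal masking objective only at a high level. As a reading aid, below is a minimal sketch of what such an objective can look like; it is not the paper's implementation. The toy encoders, the single Transformer mixing layer, the 50% masking ratio, and the MSE distillation loss are all assumptions chosen for brevity; the authors' actual pretraining code is at https://github.com/CVEO/MSSDF.

```python
# Illustrative sketch of a cross-modal masked distillation objective for
# DOM-DSM pairs. All module names, dimensions, and loss choices here are
# assumptions, not the MSSDF implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEncoder(nn.Module):
    """Toy per-modality encoder: patchify an image and embed each patch."""
    def __init__(self, in_ch: int, patch: int = 16, dim: int = 128):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # (B, C, H, W)
        tokens = self.proj(x)                              # (B, dim, H/p, W/p)
        return tokens.flatten(2).transpose(1, 2)           # (B, N, dim)

def cross_modal_masks(n_tokens: int, ratio: float = 0.5):
    """Complementary masks: a patch hidden in one modality stays visible in
    the other, so it can only be explained via modality-shared semantics."""
    mask_dom = torch.rand(n_tokens) < ratio
    return mask_dom, ~mask_dom

enc_dom = PatchEncoder(in_ch=3)   # DOM branch: RGB orthophoto
enc_dsm = PatchEncoder(in_ch=1)   # DSM branch: single-channel surface model
mixer = nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True)
head = nn.Linear(128, 128)        # projection head for masked-token prediction

dom = torch.randn(2, 3, 64, 64)   # dummy DOM batch
dsm = torch.randn(2, 1, 64, 64)   # dummy DSM batch, same ground footprint

tok_dom, tok_dsm = enc_dom(dom), enc_dsm(dsm)       # (B, N, 128) each
mask_dom, mask_dsm = cross_modal_masks(tok_dom.shape[1])

# Zero out masked tokens (a stand-in for dropping them), then let the mixer
# propagate context from visible patches into the masked slots.
ctx_dom = mixer(tok_dom * (~mask_dom).view(1, -1, 1).float())
ctx_dsm = mixer(tok_dsm * (~mask_dsm).view(1, -1, 1).float())

# Distillation target: at each masked position, predict the (detached) token
# of the *other* modality, where that same patch is visible.
loss = F.mse_loss(head(ctx_dom[:, mask_dom]), tok_dsm[:, mask_dom].detach()) \
     + F.mse_loss(head(ctx_dsm[:, mask_dsm]), tok_dom[:, mask_dsm].detach())
loss.backward()   # gradients reach both encoders, the mixer, and the head
```

    The complementary masks are what make the objective cross-modal: every masked patch is visible in the other modality, so no labels are needed and the supervisory signal is exactly the modality-shared structure the abstract refers to.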

    Co-authors: Chenxi Liu, Jiaqi Wang, Xiaoliang Tan, Wenchao Guo, Qingyuan Yang, Kaiqi Zhang

    Indexed by: Journal paper

    Corresponding Authors: Guanzhou Chen, Xiaodong Zhang

    Document Type: J

    Volume: 129

    Page Number: 104006

    ISSN No.: 1566-2535

    Translation or Not: No

    Date of Publication: 2026-05-01