Bio


My research interests are Earth Vision and AI4Earth, especially multi-modal and multi-temporal remote sensing image analysis and its real-world applications.

First-author representative works:
- Our Change family: ChangeStar (single-temporal learning, ICCV 2021), ChangeMask (many-to-many architecture, ISPRS P&RS 2022), ChangeOS (one-to-many architecture, RSE 2021), Changen (generative change modeling, ICCV 2023)
- Geospatial object segmentation: FarSeg (CVPR 2020) and FarSeg++ (TPAMI 2023), LoveDA dataset (NeurIPS Datasets and Benchmarks 2021)
- Missing-modality all-weather mapping: Deep Multisensor Learning (first work on this topic, ISPRS P&RS 2021)
- Hyperspectral image classification: FPGA (first fully end-to-end patch-free method for HSI, TGRS 2020)

All Publications


  • Single-Temporal Supervised Learning for Universal Remote Sensing Change Detection INTERNATIONAL JOURNAL OF COMPUTER VISION Zheng, Z., Zhong, Y., Ma, A., Zhang, L. 2024
  • FarSeg++: Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE Zheng, Z., Zhong, Y., Wang, J., Ma, A., Zhang, L. 2023; PP

    Abstract

    Geospatial object segmentation, a fundamental Earth vision task, always suffers from scale variation, the larger intra-class variance of background, and foreground-background imbalance in high spatial resolution (HSR) remote sensing imagery. Generic semantic segmentation methods mainly focus on the scale variation in natural scenarios. However, the other two problems are insufficiently considered in large area Earth observation scenarios. In this paper, we propose a foreground-aware relation network (FarSeg++) from the perspectives of relation-based, optimization-based, and objectness-based foreground modeling, alleviating the above two problems. From the perspective of the relations, the foreground-scene relation module improves the discrimination of the foreground features via the foreground-correlated contexts associated with the object-scene relation. From the perspective of optimization, foreground-aware optimization is proposed to focus on foreground examples and hard examples of the background during training to achieve a balanced optimization. Besides, from the perspective of objectness, a foreground-aware decoder is proposed to improve the objectness representation, alleviating the objectness prediction problem that is the main bottleneck revealed by an empirical upper bound analysis. We also introduce a new large-scale high-resolution urban vehicle segmentation dataset to verify the effectiveness of the proposed method and push the development of objectness prediction further forward. The experimental results suggest that FarSeg++ is superior to the state-of-the-art generic semantic segmentation methods and can achieve a better trade-off between speed and accuracy. The code and model are available at: https://github.com/Z-Zheng/FarSeg.

    DOI: 10.1109/TPAMI.2023.3296757; PubMed ID: 37467086

  • ChangeMask: Deep multi-task encoder-transformer-decoder architecture for semantic change detection ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Zheng, Z., Zhong, Y., Tian, S., Ma, A., Zhang, L. 2022; 183: 228-239
  • Building damage assessment for rapid disaster response with a deep object-based semantic change detection framework: From natural disasters to man-made disasters REMOTE SENSING OF ENVIRONMENT Zheng, Z., Zhong, Y., Wang, J., Ma, A., Zhang, L. 2021; 265
  • Deep multisensor learning for missing-modality all-weather mapping ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Zheng, Z., Ma, A., Zhang, L., Zhong, Y. 2021; 174: 254-264
  • FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Zheng, Z., Zhong, Y., Ma, A., Zhang, L. 2020; 58 (8): 5612-5626
  • HyNet: Hyper-scale object detection network framework for multiple spatial resolution remote sensing imagery ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Zheng, Z., Zhong, Y., Ma, A., Han, X., Zhao, J., Liu, Y., Zhang, L. 2020; 166: 1-14
  • Global road extraction using a pseudo-label guided framework: from benchmark dataset to cross-region semi-supervised learning GEO-SPATIAL INFORMATION SCIENCE Lu, X., Zhong, Y., Zheng, Z., Wang, J., Chen, D., Su, Y. 2024
  • EarthVQANet: Multi-task visual question answering for remote sensing image understanding ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Wang, J., Ma, A., Chen, Z., Zheng, Z., Wan, Y., Zhang, L., Zhong, Y. 2024; 212: 422-439
  • LoveNAS: Towards multi-scene land-cover mapping via hierarchical searching adaptive network ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Wang, J., Zhong, Y., Ma, A., Zheng, Z., Wan, Y., Zhang, L. 2024; 209: 265-278
  • Temporal-agnostic change region proposal for semantic change detection ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Tian, S., Tan, X., Ma, A., Zheng, Z., Zhang, L., Zhong, Y. 2023; 204: 306-320
  • Explicable Fine-Grained Aircraft Recognition Via Deep Part Parsing Prior Framework for High-Resolution Remote Sensing Imagery IEEE TRANSACTIONS ON CYBERNETICS Chen, D., Zhong, Y., Ma, A., Zheng, Z., Zhang, L. 2023; PP

    Abstract

    Aircraft recognition is crucial in both civil and military fields, and high-spatial resolution remote sensing has emerged as a practical approach. However, existing data-driven methods fail to locate discriminative regions for effective feature extraction due to limited training data, leading to poor recognition performance. To address this issue, we propose a knowledge-driven deep learning method called the explicable aircraft recognition framework based on a part parsing prior (APPEAR). APPEAR explicitly models the aircraft's rigid structure as a pixel-level part parsing prior, dividing it into five parts: 1) the nose; 2) left wing; 3) right wing; 4) fuselage; and 5) tail. This fine-grained prior provides reliable part locations to delineate aircraft architecture and imposes spatial constraints among the parts, effectively reducing the search space for model optimization and identifying subtle interclass differences. A knowledge-driven aircraft part attention (KAPA) module uses this prior to achieve a geometric-invariant representation for identifying discriminative features. Part features are generated by part indexing in a specific order and sequentially embedded into a compact space to obtain a fixed-length representation for each part, invariant to aircraft orientation and scale. The part attention module then takes the embedded part features, adaptively reweights their importance to identify discriminative parts, and aggregates them for recognition. The proposed APPEAR framework is evaluated on two aircraft recognition datasets and achieves superior performance. Moreover, experiments with few-shot learning methods demonstrate the robustness of our framework in different tasks. Ablation analysis illustrates that the fuselage and wings of the aircraft are the most effective parts for recognition.

    DOI: 10.1109/TCYB.2023.3293033; PubMed ID: 37552595

  • Large-scale agricultural greenhouse extraction for remote sensing imagery based on layout attention network: A case study of China ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Chen, D., Ma, A., Zheng, Z., Zhong, Y. 2023; 200: 73-88
  • Large-scale deep learning based binary and semantic change detection in ultra high resolution remote sensing imagery: From benchmark datasets to urban application ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Tian, S., Zhong, Y., Zheng, Z., Ma, A., Tan, X., Zhang, L. 2022; 193: 164-186
  • Cross-sensor domain adaptation for high spatial resolution urban land-cover mapping: From airborne to spaceborne imagery REMOTE SENSING OF ENVIRONMENT Wang, J., Ma, A., Zhong, Y., Zheng, Z., Zhang, L. 2022; 277
  • GRE AND BEYOND: A GLOBAL ROAD EXTRACTION DATASET Lu, X., Zhong, Y., Zheng, Z., Chen, D. IEEE. 2022: 3035-3038
  • Cascaded Multi-Task Road Extraction Network for Road Surface, Centerline, and Edge Extraction IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Lu, X., Zhong, Y., Zheng, Z., Chen, D., Su, Y., Ma, A., Zhang, L. 2022; 60
  • A Supervised Progressive Growing Generative Adversarial Network for Remote Sensing Image Scene Classification IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Ma, A., Yu, N., Zheng, Z., Zhong, Y., Zhang, L. 2022; 60
  • National-scale greenhouse mapping for high spatial resolution remote sensing imagery using a dense object dual-task deep learning framework: A case study of China ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Ma, A., Chen, D., Zhong, Y., Zheng, Z., Zhang, L. 2021; 181: 279-294
  • Cross-domain road detection based on global-local adversarial learning framework from very high resolution satellite imagery ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Lu, X., Zhong, Y., Zheng, Z., Wang, J. 2021; 180: 296-312
  • FactSeg: Foreground Activation-Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Ma, A., Wang, J., Zhong, Y., Zheng, Z. 2022; 60
  • Urban road mapping based on an end-to-end road vectorization mapping network framework ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Chen, D., Zhong, Y., Zheng, Z., Ma, A., Lu, X. 2021; 178: 345-365
  • A Spectral-Spatial-Dependent Global Learning Framework for Insufficient and Imbalanced Hyperspectral Image Classification IEEE TRANSACTIONS ON CYBERNETICS Zhu, Q., Deng, W., Zheng, Z., Zhong, Y., Guan, Q., Lin, W., Zhang, L., Li, D. 2022; 52 (11): 11709-11723

    Abstract

    Deep learning techniques have been widely applied to hyperspectral image (HSI) classification and have achieved great success. However, the deep neural network model has a large parameter space and requires a large number of labeled data. Deep learning methods for HSI classification usually follow a patchwise learning framework. Recently, a fast patch-free global learning (FPGA) architecture was proposed for HSI classification according to global spatial context information. However, FPGA has difficulty in extracting the most discriminative features when the sample data are imbalanced. In this article, a spectral-spatial-dependent global learning (SSDGL) framework based on the global convolutional long short-term memory (GCL) and global joint attention mechanism (GJAM) is proposed for insufficient and imbalanced HSI classification. In SSDGL, the hierarchically balanced (H-B) sampling strategy and the weighted softmax loss are proposed to address the imbalanced sample problem. To effectively distinguish similar spectral characteristics of land cover types, the GCL module is introduced to extract the long short-term dependency of spectral features. To learn the most discriminative feature representations, the GJAM module is proposed to extract attention areas. The experimental results obtained with three public HSI datasets show that the SSDGL has powerful performance in insufficient and imbalanced sample problems and is superior to other state-of-the-art methods.

    DOI: 10.1109/TCYB.2021.3070577; Web of Science ID: 000732244900001; PubMed ID: 34033562

  • GAMSNet: Globally aware road detection network with multi-scale residual learning ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING Lu, X., Zhong, Y., Zheng, Z., Zhang, L. 2021; 175: 340-352
  • RSNet: The Search for Remote Sensing Deep Neural Networks in Recognition Tasks IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Wang, J., Zhong, Y., Zheng, Z., Ma, A., Zhang, L. 2021; 59 (3): 2520-2534
  • COLOR: Cycling, Offline Learning, and Online Representation Framework for Airport and Airplane Detection Using GF-2 Satellite Images IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Zhong, Y., Zheng, Z., Ma, A., Lu, X., Zhang, L. 2020; 58 (12): 8438-8449
  • Edge-Reinforced Convolutional Neural Network for Road Detection in Very-High-Resolution Remote Sensing Imagery PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING Lu, X., Zhong, Y., Zheng, Z., Zhao, J., Zhang, L. 2020; 86 (3): 153-160
  • A NOVEL GLOBAL-AWARE DEEP NETWORK FOR ROAD DETECTION OF VERY HIGH RESOLUTION REMOTE SENSING IMAGERY Lu, X., Zhong, Y., Zheng, Z. IEEE. 2020: 2579-2582
  • Multi-Scale and Multi-Task Deep Learning Framework for Automatic Road Extraction IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING Lu, X., Zhong, Y., Zheng, Z., Liu, Y., Zhao, J., Ma, A., Yang, J. 2019; 57 (11): 9362-9377
  • S3NET: TOWARDS REAL-TIME HYPERSPECTRAL IMAGERY CLASSIFICATION Zheng, Z., Zhong, Y. IEEE. 2019: 3293-3296
  • POP-NET: ENCODER-DUAL DECODER FOR SEMANTIC SEGMENTATION AND SINGLE-VIEW HEIGHT ESTIMATION Zheng, Z., Zhong, Y., Wang, J. IEEE. 2019: 4963-4966
  • Deep Salient Feature Based Anti-Noise Transfer Network for Scene Classification of Remote Sensing Imagery REMOTE SENSING Gong, X., Xie, Z., Liu, Y., Shi, X., Zheng, Z. 2018; 10 (3)

    DOI: 10.3390/rs10030410; Web of Science ID: 000428280100056

  • COLOR: CYCLING OFFLINE LEARNING AND ONLINE REPRESENTING FOR REMOTE SENSING DATAFLOW Zheng, Z., Zhong, Y. IEEE. 2018: 4093-4096
  • Multi-channel Pose-aware Convolution Neural Networks for Multi-view Facial Expression Recognition Liu, Y., Zeng, J., Shan, S., Zheng, Z. IEEE. 2018: 458-465