Abstract
Rapid urban growth and globalization affect land use in cities, and the need for automatic interpretation of remote sensing images is constantly increasing. Deep neural networks are becoming widespread for processing high-resolution aerial and satellite imagery in Earth observation missions. Various convolutional neural network (CNN) architectures have been applied to building extraction, but it remains challenging to distinguish the building class from other man-made classes in public datasets. Here, we present a comparative study of automatic building extraction on different data sources using the DeepLabV3+ architecture with ResNet-18, ResNet-50, Xception, and MobileNetV2 backbones. The CNNs are evaluated on the Inria Aerial Image Labeling, Massachusetts Buildings, and Wuhan University Building Extraction datasets in terms of evaluation metrics and training and testing time. Our implementation of DeepLabV3+ with ResNet-50 achieved an F1-score of 97.44% on the Massachusetts Buildings dataset and an intersection over union (IoU) of 80.75% on the Inria dataset, at least 3% higher than previous studies.
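The abstract reports per-building-class F1-score and intersection over union (IoU). As a minimal sketch, not the authors' pipeline, the snippet below shows how such a comparison could be set up and how IoU and F1 are typically computed from a predicted binary building mask. torchvision's `deeplabv3_resnet50` (plain DeepLabV3, without the "+" decoder) is used here as a stand-in model, and the input tile and ground truth are placeholders, since the exact training and evaluation code is not given in the abstract.

```python
# Hypothetical sketch, not the paper's implementation: torchvision ships
# DeepLabV3 (without the "+" decoder); it stands in for DeepLabV3+ here.
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights=None, num_classes=2)  # background vs. building
model.eval()

def iou_and_f1(pred_mask: torch.Tensor, gt_mask: torch.Tensor, eps: float = 1e-7):
    """Binary IoU and F1 for the building class from 0/1 masks."""
    pred, gt = pred_mask.bool(), gt_mask.bool()
    tp = (pred & gt).sum().item()
    fp = (pred & ~gt).sum().item()
    fn = (~pred & gt).sum().item()
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return iou, f1

# Example on a random 512x512 tile (placeholder for an Inria/Massachusetts/WHU patch).
with torch.no_grad():
    image = torch.rand(1, 3, 512, 512)
    logits = model(image)["out"]         # shape (1, 2, 512, 512)
    pred_mask = logits.argmax(dim=1)[0]  # per-pixel class index
gt_mask = torch.zeros(512, 512, dtype=torch.long)  # dummy ground-truth mask
print(iou_and_f1(pred_mask, gt_mask))
```

In practice the same metric function would be run per test tile for each backbone (ResNet-18, ResNet-50, Xception, MobileNetV2) and averaged over the dataset to reproduce a comparison of this kind.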
Original language | English |
---|---|
Article number | 024510 |
Journal | Journal of Applied Remote Sensing |
Volume | 16 |
Issue number | 2 |
DOIs | |
Publication status | Published - 1 Apr 2022 |
Bibliographical note
Publisher Copyright: © 2022 Society of Photo-Optical Instrumentation Engineers (SPIE).
Keywords
- building extraction
- convolutional neural network
- land-use classification
- semantic segmentation