Abstract
Floods are devastating natural disasters that cause significant fatalities, property damage, and economic challenges, and their impact is often exacerbated by heavy rainfall, population growth, urban development, and climate change. Recently, machine learning (ML) has emerged as an effective tool for developing accurate flood maps. This study investigates diverse pre-processing schemes for flood susceptibility mapping in the San Joaquin River Basin in California, US, using the eXtreme Gradient Boosting (XGBoost) algorithm as the predictive model. The methodology considers 22 flood conditioning factors identified within the study area. In a two-stage pre-processing examination covering a total of 18 scenarios, the XGBoost model with robust scaling and a 70/30 train-test split achieved the highest performance in the first stage (AUROC of 0.851), and the 10× class imbalance ratio with random undersampling (RUS) yielded the most accurate outcomes in the second stage (AUROC of 0.835), both measured on the testing sets. The flood susceptibility mapping results indicated that more than 20% of the basin is classified as being at high or very high flood hazard. SHapley Additive exPlanations (SHAP) further emphasized the pivotal role of the presence of alluvium in the region, along with distance to faults, distance to roads, and elevation, in determining flood susceptibility. Collectively, these findings contribute to the existing literature on flood susceptibility mapping and inform the precautions to be taken before flood events occur in the region.
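The pre-processing steps named in the abstract (robust scaling, a 70/30 train-test split, random undersampling to a fixed class ratio, and AUROC scoring) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names are hypothetical, the XGBoost model itself is omitted, and reading the "10×" ratio as ten non-flood samples per flood sample is an assumption.

```python
import random
import statistics


def robust_scale(column):
    """Centre a feature on its median and divide by its interquartile range."""
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # guard against a constant column
    return [(v - median) / iqr for v in column]


def train_test_split_indices(n, test_fraction, rng):
    """Shuffle indices 0..n-1 and return (train, test), e.g. 70/30 for 0.3."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def random_undersample(labels, ratio, rng):
    """Keep every positive and at most ratio * n_positive random negatives.

    Assumption: label 1 marks flood (minority) and 0 marks non-flood.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), ratio * len(pos)))
    return sorted(pos + keep_neg)


def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In a workflow of this kind, the scaled features would typically be split first, with undersampling applied to the training portion only, so that the test-set AUROC reflects the original class balance. Whether the study ordered the steps this way is not stated in the abstract.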
| Original language | English |
|---|---|
| Article number | 199 |
| Journal | Natural Hazards |
| Volume | 122 |
| Issue number | 5 |
| Publication status | Published - Mar 2026 |
Bibliographical note
Publisher Copyright: © The Author(s) 2026.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 8 Decent Work and Economic Growth
- SDG 11 Sustainable Cities and Communities
- SDG 13 Climate Action
Keywords
- Artificial intelligence
- Floods
- Model interpretability
- Pre-processing
- Resampling
- Susceptibility
Fingerprint
Dive into the research topics of 'Unveiling the performance of pre-processing approaches in machine learning based flood susceptibility mapping'. Together they form a unique fingerprint.