
Unveiling the performance of pre-processing approaches in machine learning based flood susceptibility mapping

  • Gebze Technical University

Research output: Contribution to journal › Article › peer-review

Abstract

Floods are devastating natural disasters that cause significant fatalities, property damage, and economic losses, and they are often exacerbated by heavy rainfall, population growth, urban development, and climate change. Recently, machine learning (ML) has emerged as an effective tool for developing accurate flood maps. This study investigates diverse pre-processing schemes for flood susceptibility mapping in the San Joaquin River Basin in California, US, using the eXtreme Gradient Boosting (XGBoost) algorithm as the predictive model. The methodology considers 22 flood conditioning factors identified within the study area. In a two-stage pre-processing examination covering 18 scenarios in total, the XGBoost model with robust scaling and a 70/30 train-test split achieved the highest first-stage performance (AUROC of 0.851 on the testing set), and random under-sampling (RUS) at a 10× class imbalance ratio yielded the most accurate second-stage outcomes (AUROC of 0.835 on the testing set). The flood susceptibility mapping results indicated that more than 20% of the basin is classified as being at high or very high flood hazard. The SHapley Additive exPlanations (SHAP) analysis further emphasized the pivotal role of the presence of alluvium in the region, as well as distance to faults, distance to roads, and elevation, in determining flood susceptibility. Collectively, these findings contribute to the existing literature on flood susceptibility mapping and can inform the precautions to be taken before flood events occur in the region.
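The pre-processing steps named in the abstract (robust scaling, random under-sampling to a fixed class ratio, a 70/30 split, and AUROC scoring) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the function names, the synthetic labels, the order of the steps, and the reading of "10× class imbalance ratio" as ten non-flood samples per flood sample are all assumptions made for illustration.

```python
import random
import statistics


def robust_scale(column):
    """Scale values as (x - median) / IQR, the usual definition of robust scaling."""
    q1, med, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # guard against a zero interquartile range
    return [(x - med) / iqr for x in column]


def random_undersample(labels, ratio, rng):
    """Keep every positive and at most ratio * n_pos randomly chosen negatives.

    Assumption: 'ratio' means negatives per positive (e.g. 10 for 10x).
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), ratio * len(pos)))
    return sorted(pos + keep_neg)


def train_test_split_indices(indices, train_fraction, rng):
    """Shuffle a copy of the indices and cut it into train and test parts."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic example (hypothetical data): 5 flood points, 200 non-flood points.
rng = random.Random(0)
labels = [1] * 5 + [0] * 200
kept = random_undersample(labels, ratio=10, rng=rng)   # 5 positives + 50 negatives
train_idx, test_idx = train_test_split_indices(kept, 0.7, rng)
```

In a real workflow the scaling statistics would normally be computed on the training portion only and then applied to the test portion, and a library implementation (for example a gradient-boosting classifier) would produce the scores passed to `auroc`.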

Original language: English
Article number: 199
Journal: Natural Hazards
Volume: 122
Issue number: 5
Publication status: Published - Mar 2026

Bibliographical note

Publisher Copyright:
© The Author(s) 2026.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 8 - Decent Work and Economic Growth
  2. SDG 11 - Sustainable Cities and Communities
  3. SDG 13 - Climate Action

Keywords

  • Artificial intelligence
  • Floods
  • Model interpretability
  • Pre-processing
  • Resampling
  • Susceptibility
