Exploring the additional value of class imbalance distributions on interpretable flash flood susceptibility prediction in the Black Warrior River basin, Alabama, United States

Ömer Ekmekcioğlu*, Kerim Koc, Mehmet Özger, Zeynep Işık

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)

Abstract

This study proposes a novel flash flood susceptibility prediction framework with particular emphasis on the extent of imbalance between the numbers of flooding and non-flooding events, as the majority of events result in non-flooding. The class imbalance issue and the magnitude of the imbalance were explored to highlight the uncertain nature of the flooding phenomenon. To this end, the Random Forest (RF) was initially adopted to evaluate five class imbalance scenarios (i.e., 1x, 10x, 25x, 50x, and 100x non-flood events for every x flood events). Parameter configurations of the developed models were determined with a state-of-the-art metaheuristic, the Cuckoo Search (CS) algorithm. The CS-RF model showed the highest prediction capability, with an area under the receiver operating characteristic curve (AUROC) of 0.8455, when the extent of imbalance was set to 50x. The CS-RF model was then benchmarked against another bagging algorithm, Extra Trees, and two boosting algorithms, Adaptive Boosting (AdaBoost) and eXtreme Gradient Boosting (XGBoost), all integrated with the CS technique. The results showed that CS-RF was the most promising tree-based machine learning technique for flash flood susceptibility projection in the selected study area. Based on the predictions, a flash flood susceptibility map was generated, in which 9.35% of the basin was under very high flash flood risk. A recently developed model-agnostic game-theoretical method, SHapley Additive exPlanations (SHAP), was used to anatomize the flash flood conditioning factors and to highlight the contribution of each feature to the incident outcome prediction, ensuring the transparency of the model findings. Overall, this study contributes to both theory and practice, with a particular focus on model interpretability and the existence of imbalance in the occurrence of flash flood events, assisting decision-makers in enhancing strategies to combat the hazardous impacts of floods.
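The imbalance scenarios described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `build_scenario` and `auroc`, the synthetic one-feature data, and the use of the raw feature as a stand-in score are all assumptions made for illustration. The Random Forest model and Cuckoo Search tuning are not reproduced here. Only the five ratios (1x to 100x) come from the abstract.

```python
import random

# Non-flood samples drawn per flood sample (ratios taken from the abstract).
RATIOS = (1, 10, 25, 50, 100)


def build_scenario(flood, non_flood, ratio, seed=0):
    """Return (samples, labels) with `ratio` non-flood samples per flood sample.

    Hypothetical helper: all flood samples are kept and the non-flood
    samples are randomly subsampled without replacement.
    """
    rng = random.Random(seed)
    n_neg = min(len(non_flood), ratio * len(flood))
    negatives = rng.sample(non_flood, n_neg)
    samples = list(flood) + negatives
    labels = [1] * len(flood) + [0] * n_neg
    return samples, labels


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUROC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic data (assumption): flood samples tend to have higher values.
    flood = [rng.gauss(1.0, 1.0) for _ in range(20)]
    non_flood = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    for ratio in RATIOS:
        samples, labels = build_scenario(flood, non_flood, ratio, seed=ratio)
        # Placeholder "model": the raw feature value is used as the score.
        print(f"{ratio}x: n={len(samples)}, AUROC={auroc(labels, samples):.4f}")
```

In a fuller workflow, the placeholder score would be replaced by the predicted flood probability of a trained classifier evaluated on held-out data, and the scenario with the highest AUROC would be selected.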

Original language: English
Article number: 127877
Journal: Journal of Hydrology
Volume: 610
Publication status: Published - Jul 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier B.V.

Keywords

  • Artificial intelligence
  • Flash flood susceptibility
  • Flood risk management
  • Geographic information system (GIS)
  • Imbalance data
  • Machine learning
  • SHapley Additive exPlanations (SHAP)
