Abstract
This study presents an explainable machine learning framework to forecast groundwater storage dynamics, quantified as the Lake Water Equivalent (LWE), in the Urmia Lake Basin from 2003 to 2023. Satellite-based observations (GRACE, GLDAS) and climatic variables were integrated to model LWE variability. An ensemble learning approach was employed, combining Ridge Regression and Random Forest enhanced through feature re-weighting based on XGBoost-derived importance scores. Model interpretability was addressed using SHapley Additive exPlanations (SHAP), offering transparent insights into the contributions of climatic drivers. Results demonstrated that the Random Forest model achieved superior performance (RMSE = 3.27; R² = 0.89), with SHAP analysis highlighting the dominant influence of recent LWE values, temperature, and soil moisture. The proposed framework outperformed baseline models including Persistence, Standard Ridge Regression, and XGBoost in terms of both accuracy and explainability. The objectives of this study are (i) to forecast the LWE in the Urmia Lake Basin using an ensemble-based machine learning framework, (ii) to enhance predictive modeling through XGBoost-guided feature weighting, and (iii) to improve model transparency and interpretation using SHAP-based explainability techniques. By integrating ensemble learning with explainable AI, this work advances the transparent data-driven forecasting essential for sustainable groundwater management under climatic uncertainty.
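As a rough illustration of the evaluation terms used in the abstract, the sketch below computes RMSE and the coefficient of determination for a lag-1 Persistence baseline, where each forecast is simply the previous observation. It is a minimal example under stated assumptions, not the authors' code: the function names are hypothetical, the abstract does not specify the lag used for Persistence, and the sample series is made up for illustration.

```python
import math


def persistence_forecast(series):
    """Lag-1 persistence: the forecast for step t is the observation at t-1.

    Returns forecasts aligned with series[1:]. The lag of 1 is an assumption.
    """
    return series[:-1]


def rmse(y_true, y_pred):
    """Root mean squared error between observations and forecasts."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up monthly LWE anomalies, for illustration only (not data from the study).
lwe = [2.0, 1.0, -1.0, -2.0, -4.0, -3.0]
y_true = lwe[1:]                    # observations that have a predecessor
y_pred = persistence_forecast(lwe)  # previous observation used as forecast

print(f"Persistence RMSE: {rmse(y_true, y_pred):.3f}")
print(f"Persistence R^2:  {r2(y_true, y_pred):.3f}")
```

A learned model is only useful for forecasting if it beats this kind of baseline on held-out data, which is why Persistence is a common reference point for time series such as LWE.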
| Original language | English |
|---|---|
| Article number | 1431 |
| Journal | Water (Switzerland) |
| Volume | 17 |
| Issue number | 10 |
| DOIs | |
| Publication status | Published - May 2025 |
Bibliographical note
Publisher Copyright: © 2025 by the authors.
Keywords
- Mann–Kendall test
- Urmia Lake
- groundwater storage
- machine learning
- remote sensing
- time series analysis