TY - JOUR
T1 - An innovative approach for predicting groundwater TDS using optimized ensemble machine learning algorithms at two levels of modeling strategy
AU - Elzain, Hussam Eldin
AU - Abdalla, Osman
AU - Ahmed, Hamdi A.
AU - Kacimov, Anvar
AU - Al-Maktoumi, Ali
AU - Al-Higgi, Khalifa
AU - Abdallah, Mohammed
AU - Yassin, Mohamed A.
AU - Senapathi, Venkatramanan
N1 - Copyright © 2023 Elsevier Ltd. All rights reserved.
PY - 2024/2/1
Y1 - 2024/2/1
AB - Groundwater salinization in coastal aquifers is a major socioeconomic challenge in Oman and many other regions worldwide due to several anthropogenic activities and natural drivers. Assessing the salinization of groundwater resources is therefore crucial for their protection and sustainable management. This study applies a novel approach using optimized ensemble tree-based (ETB) machine learning models, namely CatBoost regression (CBR), Extra Trees regression (ETR), and Bagging regression (BA), at two levels of modeling strategy to predict groundwater TDS as an indicator of seawater intrusion in a coastal aquifer in Oman. At level 1, the ETR and CBR models were used as base models (inputs) for the BA model at level 2. The level 1 models (ETR and CBR) yielded satisfactory results using a limited number of inputs (Cl, K, and Sr) from a small set of 40 groundwater wells. The BA model at level 2 improved overall modeling performance by extracting more information from the ETR and CBR models at level 1. On the testing dataset, the BA model at level 2 achieved a significant improvement in accuracy (MSE = 0.0002, RSR = 0.062, R2 = 0.995, and NSE = 0.996) compared with the individual level 1 models, ETR (MSE = 0.0007, RSR = 0.245, R2 = 0.98, and NSE = 0.94) and CBR (MSE = 0.0035, RSR = 0.258, R2 = 0.933, and NSE = 0.934). The BA model at level 2 outperformed all other models in predictive accuracy, generalization to new data, and matching the locations of polluted and unpolluted wells. The approach predicts groundwater TDS with high accuracy and thus provides early warnings of water quality deterioration along coastal aquifers, which will improve water resources sustainability.
KW - Coastal aquifer
KW - Ensemble trees models
KW - Groundwater
KW - Machine learning
KW - Modeling at two levels
KW - TDS
KW - Environmental Monitoring/methods
KW - Water Pollutants, Chemical/analysis
KW - Water Resources
KW - Salinity
KW - Seawater
UR - http://www.scopus.com/inward/record.url?scp=85181808370&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85181808370&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/bf89ee85-0a0e-320e-bcce-4f291ece9128/
U2 - 10.1016/j.jenvman.2023.119896
DO - 10.1016/j.jenvman.2023.119896
M3 - Article
C2 - 38171121
AN - SCOPUS:85181808370
SN - 0301-4797
VL - 351
JO - Journal of Environmental Management
JF - Journal of Environmental Management
M1 - 119896
ER -