Application of the group method of data handling and variable importance analysis for prediction and modelling of saltwater intrusion processes in coastal aquifers

Journal Publication ResearchOnline@JCU
Lal, Alvin; Datta, Bithin
Abstract

Data-driven mathematical models are powerful prediction tools that are used to approximate the solution responses of numerical saltwater intrusion simulation models. Employing data-driven prediction models as a replacement for complex groundwater flow and transport models enables prediction of future scenarios. Most importantly, it also saves computational time, effort and resources when developing optimal coastal aquifer management methodologies using complex, large-scale coupled simulation-optimization models. In this study, a new class of data-driven mathematical models, based on the group method of data handling (GMDH), is developed and used to predict salinity concentration in a coastal aquifer by mimicking the responses of a variable-density flow and solute transport numerical simulation model. For comparison and evaluation purposes, the prediction performance of the GMDH models is compared with that of well-established support vector machine regression and genetic programming based models. In addition, one important characteristic of the GMDH models is explored and evaluated: the ability to identify the set of most influential input predictor variables (pumping rates), i.e. those with the most significant impact on the outcomes (salinity concentration at monitoring locations). To confirm variable importance, three tests are conducted in which new GMDH models are constructed using subsets of the original datasets. In TEST 1, new GMDH models are constructed using only the most influential variables (pumping rates at selected locations). In TEST 2, a subset of 20 variables (the 10 most and 10 least influential) is used to develop new GMDH models. In TEST 3, a subset of the least influential variables is used to develop GMDH models. The performance evaluation results demonstrate that GMDH models developed using the entire dataset had reasonable prediction accuracy and efficiency.
The comparative performance evaluation results for the three test scenarios highlight the importance of appropriately selecting relevant input pumping rates when developing accurate prediction models. The results suggest that incorporating the least influential variables deteriorates the accuracy of the prediction models; thus, by considering only the most influential pumping rates, it is possible to develop more accurate and efficient salinity prediction models. Overall, the evaluation results from this study establish that GMDH models, with their inherent input variable ranking capability, can serve as accurate and efficient coastal saltwater intrusion prediction models. Hence, GMDH models are viable saltwater intrusion modelling tools that can be employed in future regional-scale saltwater intrusion prediction and management investigations.
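
The general GMDH idea referred to in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it fits a single layer of quadratic (Ivakhnenko-type) neurons, one per pair of inputs, ranks them by validation error, and counts how often each input survives the selection as a crude stand-in for variable importance. The function names (`fit_gmdh_layer`, `input_usage`), the `keep` parameter, and the synthetic data are all assumptions made for illustration; no values from the study are used.

```python
import itertools
import numpy as np


def _design(a, b):
    # Terms of the quadratic polynomial 1, a, b, a*b, a^2, b^2 for one input pair.
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])


def fit_gmdh_layer(X_train, y_train, X_val, y_val, keep=4):
    """Fit one quadratic neuron per input pair; keep the best by validation RMSE."""
    candidates = []
    for i, j in itertools.combinations(range(X_train.shape[1]), 2):
        # Least-squares fit of the polynomial coefficients on the training split.
        coef, *_ = np.linalg.lstsq(
            _design(X_train[:, i], X_train[:, j]), y_train, rcond=None
        )
        # External criterion: error on data not used for fitting.
        pred = _design(X_val[:, i], X_val[:, j]) @ coef
        rmse = float(np.sqrt(np.mean((pred - y_val) ** 2)))
        candidates.append((rmse, i, j, coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]


def input_usage(neurons, n_inputs):
    """Count how often each input appears in the surviving neurons."""
    counts = np.zeros(n_inputs, dtype=int)
    for _, i, j, _ in neurons:
        counts[i] += 1
        counts[j] += 1
    return counts


# Synthetic stand-in data (hypothetical): six "pumping rate" inputs, of which
# only the first two actually drive the "salinity" response.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 6))
y = 3.0 * X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=200)

neurons = fit_gmdh_layer(X[:140], y[:140], X[140:], y[140:], keep=3)
best_rmse, best_i, best_j, _ = neurons[0]
usage = input_usage(neurons, X.shape[1])
print("best pair:", (best_i, best_j), "validation RMSE:", best_rmse)
print("input usage counts:", usage)
```

A full GMDH network would stack further layers, feeding the outputs of the surviving neurons forward until the validation error stops improving; the single layer above is only meant to show the pairwise fit-and-select step and how a selection-based input ranking can fall out of it.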

Journal

Neural Computing and Applications

Volume

33

ISBN/ISSN

1433-3058

Pages Count

12

Publisher

Springer

DOI

10.1007/s00521-020-05232-8