Importance of data preprocessing for neural networks modeling: The case of estimating the compaction parameters of soils

Fatih Isik*, Gurkan Ozden, Mehmet Kuntalp

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

In recent years, artificial neural networks (ANNs) have been successfully applied to a variety of engineering problems in order to uncover unknown behavior in the problem at hand. In the majority of these applications, ANNs were used to predict the non-linear relationship between the input variables and the corresponding target(s). Although ANNs have undeniable advantages, they are not faultless. One of their shortcomings arises at the preprocessing stage of modeling: the data preprocessing methodologies (i.e., data transformation and data division) have a significant effect on the performance of ANN models. This study examines the effect of four data transformation methods (statistical normalization, min-max normalization, non-linear transformation and whitening transformation) and two data division methods (random division and fuzzy c-means clustering) on the performance of ANN prediction models, using as a case study the prediction of the compaction parameters of both coarse-grained and fine-grained soils at the standard Proctor compaction energy level. The findings show that the raw data should be transformed by a data transformation method. They also show that the main data set should be subjected to clustering analysis and divided into training, testing and validation subsets by a systematic approach. The success of preprocessing methods may vary for other neural network applications. However, this study shows the importance of data preprocessing to neural network modelers.
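
As an illustration of the four data transformation methods named in the abstract, a minimal NumPy sketch follows. The abstract does not give the formulas used in the study, so everything here is an assumption: the function names are hypothetical, the non-linear transformation is taken to be a logarithm, and whitening is taken to be PCA whitening. Rows of `X` are samples and columns are input variables.

```python
import numpy as np


def statistical_normalization(X):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def min_max_normalization(X, lo=0.0, hi=1.0):
    """Rescale each column linearly so its values span [lo, hi]."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return lo + (X - col_min) * (hi - lo) / (col_max - col_min)


def nonlinear_transformation(X):
    """Assumed form of the non-linear transformation: log(1 + x), valid for x > -1."""
    return np.log1p(X)


def whitening_transformation(X, eps=1e-12):
    """PCA whitening: decorrelate the columns and scale each to unit variance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)            # sample covariance of the columns
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigendecomposition of a symmetric matrix
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)
```

Whichever transformation is chosen, its parameters (means, ranges, eigenvectors) would normally be estimated on the training subset only and then reused unchanged on the testing and validation subsets.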

Original language: English
Pages (from-to): 871-882
Number of pages: 12
Journal: Energy Education Science and Technology Part A: Energy Science and Research
Volume: 29
Issue number: 2
Publication status: Published - Jul 2012
Externally published: Yes

Keywords

  • Artificial neural networks
  • Clustering analysis
  • Data division
  • Data transformation
  • Fuzzy c-means clustering
