Integrating feature engineering, genetic algorithm and tree-based machine learning methods to predict the post-accident disability status of construction workers

Kerim Koc*, Ömer Ekmekcioğlu, Asli Pelin Gurgun

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

70 Citations (Scopus)

Abstract

The construction industry is among the riskiest industries in the world. Hence, preliminary studies exploring the consequences of occupational accidents have received considerable attention in the research community. This study aims to develop a comprehensive framework to predict the post-accident disability status of construction workers. A dataset comprising 47,938 construction accidents recorded in Turkey was subjected to a detailed multi-step feature engineering approach, including data encoding, data scaling, dimension reduction, and data resampling. Predictions were performed with four tree-based ensemble machine learning models (Random Forest, XGBoost, AdaBoost, and Extra Trees), with a Genetic Algorithm (GA), a state-of-the-art optimization method, used for hyperparameter tuning. GA-XGBoost achieved the best predictive performance, with an accuracy of 0.8292 and an AUROC of 0.8120. The findings may aid in predicting construction workers' post-accident disability status, supporting a safer working environment and better productivity planning in construction projects.
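As a rough illustration of the GA-based hyperparameter tuning mentioned in the abstract, the following minimal sketch evolves a population of candidate settings using tournament selection, uniform crossover, mutation, and elitism. It is not the authors' implementation. The search space, the parameter names, the GA settings, and the `toy_fitness` function are all assumptions made for illustration; in practice the fitness would be something like the cross-validated accuracy or AUROC of a tree-based model trained with the candidate settings.

```python
import random

# Hypothetical search space for a boosted-tree model. The names and values
# are illustrative assumptions, not settings reported in the paper.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [3, 5, 7, 9],
    "learning_rate": [0.01, 0.05, 0.1, 0.3],
}


def toy_fitness(params):
    """Stand-in for a cross-validated score (assumption, for illustration).

    Returns the fraction of genes that match an arbitrary target setting,
    so the value always lies in [0, 1].
    """
    target = {"n_estimators": 200, "max_depth": 7, "learning_rate": 0.1}
    return sum(1.0 for k in target if params[k] == target[k]) / len(target)


def random_individual(rng):
    # One candidate = one value chosen per hyperparameter.
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is inherited from either parent.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}


def mutate(ind, rng, rate=0.2):
    # With probability `rate`, resample a gene from its allowed values.
    return {
        k: (rng.choice(SEARCH_SPACE[k]) if rng.random() < rate else v)
        for k, v in ind.items()
    }


def tournament(pop, scores, rng, k=3):
    # Pick k random candidates and return the fittest of them.
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def genetic_search(fitness, pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_score:
            best, best_score = dict(pop[top]), scores[top]
        # Elitism: the best candidate seen so far survives unchanged.
        next_pop = [dict(best)]
        while len(next_pop) < pop_size:
            parent_a = tournament(pop, scores, rng)
            parent_b = tournament(pop, scores, rng)
            next_pop.append(mutate(crossover(parent_a, parent_b, rng), rng))
        pop = next_pop
    return best, best_score


if __name__ == "__main__":
    best_params, best_score = genetic_search(toy_fitness)
    print(best_params, best_score)
```

Replacing `toy_fitness` with a function that trains a model and returns a validation metric would turn this sketch into an actual tuning loop, at the cost of one model fit per candidate per generation.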

Original language: English
Article number: 103896
Journal: Automation in Construction
Volume: 131
Publication status: Published - Nov 2021

Bibliographical note

Publisher Copyright:
© 2021 Elsevier B.V.

Funding

The authors would like to thank the Republic of Turkey, Social Security Institution for its support and for providing the dataset. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Keywords

  • Artificial intelligence
  • Construction safety
  • Genetic algorithm
  • Machine learning
  • Occupational accident
  • Safety management
  • Tree-based ensemble models
  • Worker disability
