Generating synthetic data with variational autoencoder to address class imbalance of graph attention network prediction model for construction management

Fatemeh Mostofi, Onur Behzat Tokdemir, Vedat Toğan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

The predictive performance of machine learning (ML) models degrades when they are trained on class-imbalanced real-world construction datasets, which reduces the accuracy of the decisions that rely on them. In construction projects, collecting a balanced dataset is not always feasible. Integrating generative and prediction models holds potential here: the generative model synthesizes samples of the underrepresented class to produce a balanced input dataset. This study improves the performance of construction prediction models by integrating a generative model that augments the dataset for the underrepresented class. To this end, a variational autoencoder (VAE) was integrated into a multi-head graph attention network (GAT), and a comprehensive construction productivity dataset was collected across different projects covering different construction activities, each with a particular structure and level of class imbalance. Balancing the class distribution led to a significant increase in the predictive performance of the GAT model: accuracy rose from 90.6% to 92.5%, from 81.1% to 94.4%, and from 92.2% to 95.4% when trained on the finishing, concrete, and insulation activity networks, respectively.

Original language: English
Article number: 102606
Journal: Advanced Engineering Informatics
Volume: 62
Publication status: Published - Oct 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Ltd

Keywords

  • Class imbalance
  • Construction productivity prediction
  • Data augmentation
  • Generative model
  • Graph attention network (GAT)
  • Machine learning (ML)
  • Variational autoencoder (VAE)
