Abstract
Machine learning (ML) models predict less reliably when they are trained on class-imbalanced real-world construction datasets, which reduces the accuracy of the decisions that depend on them. In construction projects, collecting a balanced dataset is not always feasible. Integrating generative and prediction models offers a remedy: the generative model synthesizes samples of the underrepresented class to produce a balanced input dataset. This study improves the performance of construction prediction models by integrating a generative model that augments the dataset for the underrepresented class. To this end, a variational autoencoder (VAE) was integrated into a multi-head graph attention network (GAT). A comprehensive construction productivity dataset was collected across projects covering different construction activities, each with its own structure and level of class imbalance. Balancing the class distribution improved the predictive performance of the GAT model: accuracy rose from 90.6 % to 92.5 %, from 81.1 % to 94.4 %, and from 92.2 % to 95.4 % when the model was trained on the finishing, concrete, and insulation activity networks, respectively.
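The balancing step described in the abstract can be illustrated with a minimal sketch. It only shows the bookkeeping: how many synthetic samples each class needs, and where a generative model would supply them. The function names, the `generate` hook, the jitter-based placeholder generator, and the "high"/"low" labels are all assumptions made for illustration; the study itself uses a VAE, and none of this reproduces its implementation or data.

```python
import random
from collections import Counter


def synthetic_counts(labels):
    """Return how many synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def balance(features, labels, generate, seed=0):
    """Append generated samples until every class is as large as the majority class.

    `generate(class_rows, rng)` is a hypothetical hook standing in for a trained
    generative model (a VAE decoder in the study); it returns one new feature row.
    """
    rng = random.Random(seed)
    out_x, out_y = list(features), list(labels)
    for cls, missing in synthetic_counts(labels).items():
        class_rows = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(missing):
            out_x.append(generate(class_rows, rng))
            out_y.append(cls)
    return out_x, out_y


def jitter_generator(class_rows, rng, scale=0.05):
    """Placeholder generator: copy a real row and add small Gaussian noise.

    This is NOT a VAE; it only marks where samples decoded from a learned
    latent space would be supplied.
    """
    base = rng.choice(class_rows)
    return [v + rng.gauss(0.0, scale) for v in base]


if __name__ == "__main__":
    # Toy, made-up data: three "high" productivity rows and one "low" row.
    X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.9, 0.8]]
    y = ["high", "high", "high", "low"]
    X_bal, y_bal = balance(X, y, jitter_generator)
    print(Counter(y_bal))
```

Keeping the generator behind a single callable means the placeholder can be swapped for a trained model without touching the balancing logic, and the balanced output can then be passed to whatever classifier is being trained.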
| Original language | English |
|---|---|
| Article number | 102606 |
| Journal | Advanced Engineering Informatics |
| Volume | 62 |
| Publication status | Published - Oct 2024 |
Bibliographical note
Publisher Copyright: © 2024 Elsevier Ltd
Keywords
- Class imbalance
- Construction productivity prediction
- Data augmentation
- Generative model
- Graph attention network (GAT)
- Machine learning (ML)
- Variational autoencoder (VAE)