Abstract
Recently developed fine-grained just-in-time software defect prediction (JIT-SDP) models predict, for each changed file in a commit, whether that file will induce a defect in the future (its defect-inducingness), in contrast to traditional JIT-SDP models, which make predictions only at the commit level. Fine-grained JIT-SDP models also cost-effectively reduce the risk of overlooking defect-inducing changes in effort-aware JIT-SDP, because developers can review only the defect-inducing changed files in a commit. However, building machine learning models is a data-dependent process, so data quality is crucial: low data quality negatively affects the predictive performance, interpretability, and scalability of machine learning models. In the context of JIT-SDP, no study in the literature focuses directly on data quality. Motivated by this gap, we propose a novel data quality improvement method for fine-grained JIT-SDP models that takes the software domain into account. We then demonstrate that this method increases predictive performance for both within-project and cross-project fine-grained JIT-SDP models. In doing so, we open the door to JIT-SDP models that combine good predictive performance, cost-effectiveness, and a low probability of overlooking defect-causing project components.
Original language | English |
---|---|
Title of host publication | 14th International Workshop on Computer Science and Engineering, WCSE 2024 |
Publisher | International Workshop on Computer Science and Engineering (WCSE) |
Pages | 319-324 |
Number of pages | 6 |
ISBN (Electronic) | 9789819411566 |
Publication status | Published - 2024 |
Event | 14th International Workshop on Computer Science and Engineering, WCSE 2024 - Phuket Island, Thailand; Duration: 19 Jun 2024 → 21 Jun 2024 |
Publication series
Name | 14th International Workshop on Computer Science and Engineering, WCSE 2024 |
---|---|
Conference
Conference | 14th International Workshop on Computer Science and Engineering, WCSE 2024 |
---|---|
Country/Territory | Thailand |
City | Phuket Island |
Period | 19/06/24 → 21/06/24 |
Bibliographical note
Publisher Copyright: © 2024 14th International Workshop on Computer Science and Engineering, WCSE 2024. All rights reserved.
Keywords
- dataset quality
- just-in-time software defect prediction
- software metrics