Dataset Quality Improvement for Fine-Grained Just-in-Time Software Defect Prediction

Irem Fidandan, Feza Buzluca*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Recently developed fine-grained just-in-time software defect prediction (JIT-SDP) models predict, for each changed file in a commit, whether that file will cause a defect in the future (that is, whether it is defect-inducing), in contrast to traditional JIT-SDP models, which make predictions only at the commit level. Fine-grained JIT-SDP models also cost-effectively reduce the risk of overlooking defect-inducing changes in effort-aware JIT-SDP models by allowing developers to review only the defect-inducing changed files in a commit. However, building machine learning models is a data-dependent process, so data quality is crucial: low data quality negatively affects the predictive performance, interpretability, and scalability of machine learning models. In the context of JIT-SDP, no study in the literature focuses directly on data quality. Motivated by this gap, we propose a novel data quality improvement method for fine-grained JIT-SDP models that takes the software domain into account. We then demonstrate that the method increases predictive performance for both within-project and cross-project fine-grained JIT-SDP models. In doing so, we open the door to JIT-SDP models that combine good predictive performance, cost-effectiveness, and a low probability of overlooking the project components that cause defects.

Original language: English
Title of host publication: 14th International Workshop on Computer Science and Engineering, WCSE 2024
Publisher: International Workshop on Computer Science and Engineering (WCSE)
Pages: 319-324
Number of pages: 6
ISBN (Electronic): 9789819411566
DOIs
Publication status: Published - 2024
Event: 14th International Workshop on Computer Science and Engineering, WCSE 2024 - Phuket Island, Thailand
Duration: 19 Jun 2024 to 21 Jun 2024

Publication series

Name: 14th International Workshop on Computer Science and Engineering, WCSE 2024

Conference

Conference: 14th International Workshop on Computer Science and Engineering, WCSE 2024
Country/Territory: Thailand
City: Phuket Island
Period: 19/06/24 to 21/06/24

Bibliographical note

Publisher Copyright:
© 2024 14th International Workshop on Computer Science and Engineering, WCSE 2024. All rights reserved.

Keywords

  • dataset quality
  • just-in-time software defect prediction
  • software metrics
