Abstract
Defect prediction studies have proposed several data-driven approaches, and the field has recently put more emphasis on whether the people factor is associated with software defects. Developer metrics can capture experience, code ownership, coding skills and techniques, and commit activities. So far, these metrics have been measured at a specified snapshot of the codebase, although a developer's knowledge of a source module can change over time. In this paper, we propose to measure periodic developer experience with regard to contextual knowledge of files and directories. We extract periodic experience metrics that capture developers' previous activities on source files and investigate the explanatory effect of these metrics on defects. We also use activity-based (churn) metrics to observe the performance of both metric types in defect prediction. We use two large-scale open source projects, Lucene and Jackrabbit, for model evaluation. We calculate periodic developer experience metrics and churn metrics at two granularity levels: file level and commit level. We build the models using five machine learning algorithms that are popular in the defect prediction literature. The models built with the two best-performing algorithms are assessed in terms of Precision, Recall, False Positive Rate, and F-measure. The set of metrics that best explains software defects is also identified using a correlation-based feature selection method. Results show that periodic developer experience metrics extracted at file level are well suited to defect prediction when accompanied by churn metrics. When there is not enough data to extract the contextual knowledge of developers on source files, churn metrics play an important role in defect prediction.
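The four evaluation measures named in the abstract can be illustrated with a minimal sketch that uses their standard confusion-matrix definitions. The function name `evaluation_measures` and the example counts are hypothetical and are not taken from the paper; the sketch only shows how such measures are commonly computed for a binary defective/non-defective classifier.

```python
def evaluation_measures(tp, fp, tn, fn):
    """Compute Precision, Recall, False Positive Rate, and F-measure.

    tp, fp, tn, fn are confusion-matrix counts for the "defective" class.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)            # flagged items that are truly defective
    recall = ratio(tp, tp + fn)               # defective items that were flagged
    false_positive_rate = ratio(fp, fp + tn)  # clean items wrongly flagged
    f_measure = ratio(2 * precision * recall, precision + recall)

    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
        "f_measure": f_measure,
    }


# Made-up counts, for illustration only.
scores = evaluation_measures(tp=30, fp=10, tn=50, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Reporting the False Positive Rate alongside Precision and Recall matters in this setting because defective files or commits are usually a minority, so a model can reach high Recall while still flagging many clean items.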
Original language | English |
---|---|
Title of host publication | Proceedings - 18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 72-81 |
Number of pages | 10 |
ISBN (Electronic) | 9781538682906 |
Publication status | Published - 9 Nov 2018 |
Event | 18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018 - Madrid, Spain; Duration: 23 Sept 2018 → 24 Sept 2018 |
Publication series
Name | Proceedings - 18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018 |
---|
Conference
Conference | 18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018 |
---|---|
Country/Territory | Spain |
City | Madrid |
Period | 23/09/18 → 24/09/18 |
Bibliographical note
Publisher Copyright: © 2018 IEEE.
Keywords
- Churn metrics
- Code ownership
- Periodic developer experience
- Software defect prediction