Enhancing Just-in-time Defect Prediction Using Change Request-based Metrics



Identifying defective software components as early as commit time can significantly reduce software development and maintenance costs. In recent years, several studies have proposed just-in-time (JIT) defect prediction techniques to identify, at check-in time, changes that could introduce defects. To predict defect-introducing changes, JIT defect prediction approaches use change metrics collected from software repositories. These change metrics, however, capture only code and code-change information. Information related to the change requests (e.g., the clarity of a change request and the difficulty of implementing the change), which could determine a change's proneness to introducing new defects, has not been studied. In this study, we propose to augment the publicly available change metrics dataset with six change request-based metrics collected from issue tracking systems. To build the prediction models, we used five machine learning algorithms: AdaBoost, XGBoost, Deep Neural Network, Random Forest and Logistic Regression. The proposed approach is evaluated using a dataset collected from four open source software systems: Eclipse Platform, Eclipse JDT, Bugzilla and Mozilla. The results show that the augmented dataset improves the performance of JIT defect prediction in 19 out of 20 cases. The F1-score of JIT defect prediction in the four systems improves by an average of 4.8%, 3.4%, 1.7%, 1.1% and 1.1% when using AdaBoost, XGBoost, Deep Neural Network, Random Forest and Logistic Regression, respectively. Among the five algorithms used for building the machine learning models, AdaBoost is found to be the best at enhancing the performance of JIT defect prediction. To see which features contributed to the improvement, we computed feature importance using the best-performing algorithm, AdaBoost. The result shows that number of comments (NC), severity and number of developers assigned (NDA) are among the most important features in the entire augmented dataset.
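
The comparison described above rests on the F1-score of a model trained on the baseline change metrics versus one trained on the augmented metrics. The following is a minimal sketch of that comparison, not the study's code: the labels and predictions are toy values invented for illustration, and the names `f1_score`, `pred_baseline` and `pred_augmented` are hypothetical. The gain is computed as an absolute difference, which is an assumption, since the abstract does not say whether its percentages are absolute or relative.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score for the positive class (label 1 = defect-introducing change)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not taken from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]          # 1 = change introduced a defect
pred_baseline = [1, 0, 0, 1, 1, 0, 0, 0]   # hypothetical model on change metrics only
pred_augmented = [1, 0, 1, 1, 1, 0, 0, 0]  # hypothetical model on augmented metrics

f1_base = f1_score(y_true, pred_baseline)
f1_aug = f1_score(y_true, pred_augmented)
gain = f1_aug - f1_base  # absolute difference in F1 (an assumption, see above)

print(f"baseline F1:  {f1_base:.3f}")
print(f"augmented F1: {f1_aug:.3f}")
print(f"gain:         {gain:.3f}")
```

In a real evaluation the two prediction lists would come from the same algorithm trained once per feature set, and the comparison would be repeated for each of the four systems and five algorithms to produce the 20 cases mentioned above.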
