Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/3956
Title: A combination of feature selection and data sampling techniques for software fault prediction
Authors: Ha, Thi Minh Phuong
Nguyen, Thanh Long
Nguyen, Thanh Binh
Keywords: Software fault prediction
Feature selection
Data sampling
PROMISE
Issue Date: Sep-2023
Publisher: Publishing House for Science and Technology
Abstract: Software fault prediction (SFP) is the process of building models that predict faults in the early stages of software development. Predicting fault-prone software modules helps developers allocate testing effort more effectively and reduce maintenance cost. However, the performance of SFP models depends on the quality of the software fault datasets. Irrelevant and redundant features can degrade both the training speed and the accuracy of the models. In addition, data imbalance, where faulty modules are far fewer than non-faulty modules, is a major challenge in SFP. This study applies three generative adversarial network (GAN) models (VanillaGAN, CTGAN and WGANGP) together with four feature selection ranking methods (Chi-Squared, Information Gain, Fisher and Relief) to four software fault datasets. The comparative analysis uses four different classifiers to predict software faults. Precision, recall, F1-score and the area under the receiver operating characteristic curve (AUC) serve as the performance evaluation metrics. The experimental results show that combinations of CTGAN and VanillaGAN with feature selection approaches outperformed SFP models built without data sampling and feature selection. The combination of CTGAN and Relief performed best among all combinations, with the highest average precision, recall, F1-score and AUC values of 0.857, 0.873, 0.856 and 0.767, respectively, on the Extra Tree classifier.
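Illustrative note: the sketch below shows, under stated assumptions, two ingredients named in the abstract: ranking features by a Fisher score, and computing precision, recall and F1-score for the faulty class. It is not the authors' implementation. The Fisher score used here (per-feature between-class scatter divided by within-class scatter) is one common formulation and is an assumption, as are the function names and the toy data. GAN-based sampling and the classifiers are not shown.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X for class labels y.

    Assumed formulation: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature that is constant within every class gets score 0.0 here.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k highest-scoring features, best first."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical imbalanced toy data: label 1 = faulty, label 0 = non-faulty.
# Feature 0 separates the classes; feature 1 has equal class means.
X_toy = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [1.1, 6.0], [3.0, 5.0], [3.2, 4.0]]
y_toy = [0, 0, 0, 0, 1, 1]

best = top_k_features(X_toy, y_toy, k=1)
metrics = precision_recall_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
```

In this toy setting, `best` holds the index of the feature retained by the ranking, and `metrics` holds the three scores for the faulty class. Reporting metrics for the faulty class matters on imbalanced data, where overall accuracy can look high even when few faulty modules are found.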
Description: Proceedings of the 16th National Scientific Conference on Fundamental and Applied IT Research (FAIR-2023); pp. 258-265.
URI: http://vap.ac.vn/Portals/0/TuyenTap/2024/2/21/64e13532907845ed9f5a2547dfec276f/33B_FAIR2023_paper_6739.pdf
https://elib.vku.udn.vn/handle/123456789/3956
ISBN: 978-604-357-201-8
Appears in Collections: NĂM 2023 (Year 2023)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.