ACO based comprehensive model for software fault prediction

Abstract

Comprehensible models can be used for software quality modelling, which involves predicting low-quality modules with interpretable rules. Such a model can guide the design and testing teams to focus on poor-quality modules, so that the limited resources allocated to software quality inspection are targeted only at modules that are likely to be defective. An Ant Colony Optimization (ACO) based learner is one way to obtain rules that classify software modules as faulty or not faulty. This paper investigates an ACO based mining approach that uses ROC based rule quality updating to construct a rule-based software fault prediction model from useful metrics. We also investigate the effect of feature selection on the ACO based learner and on the benchmark algorithms. We tested the proposed method on several publicly available software fault data sets and compared the performance of ACO based learning with that of three benchmark classifiers on the basis of the area under the receiver operating characteristic curve. The evaluation shows that the ACO based learner outperforms the benchmark techniques on most data sets.

Keywords

Software metric, fault prediction, ACO

1. Introduction

Software quality improvement methods include code inspections, design walkthroughs, prototype simulation, and measurement-based analysis. To improve the quality of high-assurance software, early identification of low-quality modules gives the design and testing teams enough time to focus on those modules. One task of quality models is to identify the faulty modules in the software and the risk category each module belongs to, e.g., fault-prone (fp) or not-fault-prone (nfp). Software quality prediction supports better use of resources and directs effort to the software modules with higher risk. Hence, it helps reduce cost through early assessment of faults, and it helps in planning ahead the testing of modules that are likely to be fault-prone during operation. Machine learning approaches have been used extensively for software fault prediction (SFP) in quality modeling, such as Logistic Regression, Naive Bayes, soft computing methods [1], and a hybrid of artificial neural network (ANN) and Quantum Particle Swarm Optimization (QPSO) [2].

Ant colony optimization (ACO) is a nature-inspired computing technique that solves a problem by simulating the behavior of real ant colonies. ACO can be used for classification and rule extraction in real-life problems. The goal of ACO based learning is to discover knowledge that is not only accurate but also expressed as rules the user can interpret.

Comprehensibility is important whenever discovered knowledge will be used to support a decision made by a human user. In this work, an ant colony based software fault prediction model is used for comprehensible rule generation. The proposed model achieves a higher accuracy rate than the benchmark classifiers considered, and its results are comprehensible and easily interpreted by software managers and developers.

The rule base generated by ACO is expressed in the form of IF-THEN rules, as follows: IF $<$ conditions $>$ THEN $<$ class $>$.

The rules generated by ACO classification can guide the user in an intuitive, comprehensible way [3].

The rest of the paper is organized as follows. Section 2 presents a brief overview of related work. Section 3 discusses the main characteristics of ACO algorithms. Section 4 introduces the software projects used in this study. Section 5 is devoted to the experimental design, and Section 6 reports the performance of the proposed algorithm in terms of the area under the Receiver Operating Characteristic curve (AUC), together with a comparative study against benchmark machine learning algorithms across nine data sets. Finally, conclusions and directions for future research are given in the closing section.

2. Related work

Machine learning techniques have been used extensively to build software fault prediction models in the domain of software quality. Researchers use software metrics as software quality indicators for predicting faulty and not faulty modules. Some recent studies in the field are summarized below.

An Eclipse-based SFP tool for Java programs, which uses Naive Bayes and is delivered as a plug-in, was developed to give a practical view of the software fault prediction problem and to show the effect of combining software metrics with software fault data [4].

Czibula et al. proposed a novel classification model based on relational association rule mining for software fault prediction [5]. Erturk and Sezer applied an Adaptive Neuro-Fuzzy Inference System (ANFIS) to the software fault prediction problem [1]. Abaei et al. automated software fault detection using a semi-supervised hybrid self-organizing map (HySOM), a model based on an artificial neural network and a self-organizing map. HySOM can predict defect-prone (dp) or not-defect-prone (ndp) modules in a semi-supervised manner, using software measurement threshold values in the absence of quality data. In semi-supervised HySOM, the role of the expert in identifying fault-prone modules becomes less critical and more supportive [6]. A software defect prediction model based on atomic class-association rule mining (ACAR) was developed using data preprocessing and rule model building for improved defect prediction [7]. Its authors demonstrated that ACAR is better than CBA2 in terms of AUC, G-mean, Balance, and understandability; in addition, the average AUC of ACAR is 2.9% higher than that of CBA2.

Various ant-based approaches have been developed for classification and rule generation in different applications. In particular, a new ant-based classification technique named AntMiner+ was proposed in [8] and compared with state-of-the-art classification techniques such as C4.5, RIPPER, and support vector machines. A study of the cAnt-Miner2 algorithm, which copes with continuous features in the way introduced by the cAnt-Miner algorithm, is presented in [9]. Its experiments highlight the influence of each parameter, and of parameter combinations, on cAnt-Miner2 in terms of accuracy, simplicity, and computational cost. Lessmann et al. [10] argued that fault prediction techniques should not be assessed by predictive performance alone. One should also take into account other factors such as computational complexity, ease of use, and, more importantly, comprehensibility of the predictor. Singh et al. [11] developed a framework for automatic extraction of human-understandable fuzzy rules for software fault detection and classification. It is an integrated framework that simultaneously identifies useful determinants (attributes) of faults and fuzzy rules that use those attributes.

In this work, an ACO based learner is trained using module metrics and fault data to generate a rule-based expert system that classifies software modules as faulty or not faulty. We demonstrate the application of ACO for rule extraction and its subsequent use for software fault classification.

The main contributions of our empirical study are highlighted in the following aspects:

1. An attempt to perform ACO based rule generation using the ROC measure as the objective function for rule quality, together with an empirical study of feature selection methods.
2. Extensive experiments on the PROMISE repository datasets and the AEEEM datasets, with a comparative study to demonstrate the effectiveness of the proposed model over other state-of-the-art techniques.

3. Ant colony optimization algorithm for fault prediction

An ACO system is based on imitating the natural behavior of ants, including their mechanisms of collaboration and adaptation. ACO is an optimization technique based on the study of the foraging behavior of ants [15].

Figure 1.

Ant colony-based learning algorithm.

Ant colony based optimization techniques are used to solve a number of problems, ranging from the Traveling Salesman problem to difficult data clustering and classification problems. Parpinelli et al. [3] proposed an algorithm called Ant-Miner for data mining tasks. In this work, the concept of the Ant-Miner algorithm is adopted to obtain comprehensible rules that software developers can use as a guideline for identifying fault-prone and not-fault-prone modules. A brief introduction to the Ant-Miner algorithm is given below.

In an ACO algorithm, each ant incrementally constructs/modifies a solution for the target problem. In our case, the target problem is the discovery of comprehensive rules for software fault prediction. As discussed in the introduction, each classification rule has the form

IF $<$ term 1 and term 2 and … $>$ THEN $<$ class $>$

Each term is a triple $<$ feature, op, value $>$, where value belongs to the domain of feature and the operator (op) is a relational operator. The Ant-Miner variant applied here handles continuous features by discretizing them first, so the operator in the triple is always equality. The discretization method of [12] is used for the continuous features in these models. The Ant-Miner used in this work deposits pheromone on edges instead of vertices. A concise description of the ant colony based learner is shown in Fig. 1.

Ant-Miner follows a sequential covering approach to discover a list of rules that covers all, or almost all, of the training cases. Initially, the list of discovered rules is empty. Each iteration of the covering loop discovers one rule. This rule is added to the list of discovered rules, and the training instances that it covers correctly are removed from the training set. The process is repeated while the number of uncovered training cases is greater than a user-specified threshold, called Max_allowed_uncovered_cases.

Within the covering loop, each iteration of the inner loop performs three tasks: rule generation, pheromone updating, and rule pruning. The algorithm starts with a rule that has no term in its antecedent and adds one term at a time to the current partial rule. The current partial rule constructed by an ant corresponds to the current partial path followed by that ant. Similarly, the choice of a term to be added to the current partial rule corresponds to the choice of the direction in which the current path will be extended. This choice depends on both a problem-dependent heuristic function ( $\eta$ ) and the amount of pheromone ( $\tau$ ) associated with each term, as discussed in detail below.

In order to remove extraneous terms, the rule $R_{i}$ constructed by Ant ${}_{i}$ is pruned. The amount of pheromone on the trail followed is then increased according to the quality of the rule, and the pheromone on the other trails is decreased, simulating pheromone evaporation [3]. Then another ant starts to construct its rule, using the new amounts of pheromone to guide its search. This process is repeated until one of the following two conditions is met.

1. The number of constructed rules $\geqslant$ the user-specified threshold No_of_ants.
2. No more features can be added to the rule antecedent.

Once this loop is completed, the best rule among those constructed by all ants is added to the list of discovered rules. All trails are then reinitialized with an equal amount of pheromone for the next iteration.
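The sequential covering loop described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: `discover_rule` is a hypothetical callback standing in for one complete ant colony search, rule antecedents are assumed to be dictionaries of feature/value equality terms, and falling back to the majority class of the whole training set when nothing is left uncovered is an assumption of the sketch.

```python
from collections import Counter


def covers(antecedent, case):
    """An antecedent is a dict {feature: value}; every term must match the case."""
    return all(case[feature] == value for feature, value in antecedent.items())


def sequential_covering(cases, labels, discover_rule, max_uncovered_cases=10):
    """Build an ordered rule list by repeatedly discovering one rule.

    discover_rule(cases, labels) is a hypothetical callback that stands in
    for one full ant colony search and returns (antecedent, predicted_class).
    """
    remaining = list(zip(cases, labels))
    rule_list = []
    while len(remaining) > max_uncovered_cases:
        antecedent, predicted = discover_rule(
            [c for c, _ in remaining], [y for _, y in remaining]
        )
        rule_list.append((antecedent, predicted))
        # Remove the training cases that this rule covers correctly.
        kept = [
            (c, y)
            for c, y in remaining
            if not (covers(antecedent, c) and y == predicted)
        ]
        if len(kept) == len(remaining):
            break  # guard (assumption): stop if the new rule removed nothing
        remaining = kept
    # Default rule: majority class of the cases left uncovered
    # (or of all cases if none are left; an assumption of this sketch).
    leftover = [y for _, y in remaining] or list(labels)
    default_class = Counter(leftover).most_common(1)[0][0]
    return rule_list, default_class
```

Keeping the rule search behind a callback separates the covering logic from the ant-based search itself, so the same loop could be reused with any rule learner.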

In the standard definition of ACO [15], a population is defined as the set of ants that build solutions between two pheromone updates. Next, we present the rule generation process.

Let $\textit{term}_{ij}$ be a rule condition of the form $R_{i}=V_{ij}$, where $R_{i}$ is the $i^{\text{th}}$ feature and $V_{ij}$ is the $j^{\text{th}}$ value of the domain of $R_{i}$. The probability that $\textit{term}_{ij}$ is selected for the current partial rule is

$\displaystyle P_{ij}=\frac{\eta_{ij}\cdot\tau_{ij}(t)}{\sum\limits_{i=1}^{a}x_{i}\sum\limits_{j=1}^{b_{i}}\left(\eta_{ij}\cdot\tau_{ij}(t)\right)}$ (1)

where $\eta_{ij}$ is the value of the heuristic function for $\textit{term}_{ij}$, $\tau_{ij}(t)$ is the amount of pheromone associated with $\textit{term}_{ij}$ at iteration $t$, $a$ is the total number of features, $b_{i}$ is the number of values in the domain of the $i^{\text{th}}$ feature, and $x_{i}$ is 1 if feature $i$ has not yet been used in the current partial rule and 0 otherwise. The heuristic function used by the ant-based mining algorithm is based on information theory. Every possible term has a heuristic value $\eta_{ij}$, for the $j^{\text{th}}$ value of feature $i$, that estimates the predictive accuracy of the term.
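As an illustration of Eq. (1), the following Python sketch computes the selection probabilities and draws one term by roulette-wheel selection. The dictionary layout, with `(feature, value)` tuples as keys, and the function names are assumptions made for this example only.

```python
import random


def term_probabilities(eta, tau, used_features):
    """Eq. (1): P_ij is proportional to eta_ij * tau_ij over unused features.

    eta and tau map (feature, value) -> float. used_features holds the
    features already in the partial rule, i.e. those with x_i = 0.
    """
    weights = {
        term: eta[term] * tau[term]
        for term in eta
        if term[0] not in used_features
    }
    total = sum(weights.values())
    if total == 0:
        return {}  # no term can be selected
    return {term: w / total for term, w in weights.items()}


def pick_term(probabilities, rng=random):
    """Roulette-wheel selection of one term according to P_ij."""
    terms = list(probabilities)
    return rng.choices(terms, weights=[probabilities[t] for t in terms], k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.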

The entropy of every term is calculated as follows

$\displaystyle H(W|R_{i}=V_{ij})=-\sum_{w=1}^{k}P(w|R_{i}=V_{ij})\cdot\log_{2}P(w|R_{i}=V_{ij})$ (2)

where $W$ is the class feature, $R_{i}$ is the $i^{\text{th}}$ feature, $V_{ij}$ is the $j^{\text{th}}$ value of the $i^{\text{th}}$ feature, and $k$ is the number of classes. $P(w|R_{i}=V_{ij})$ is the empirical probability of observing the class $w$, given that feature $R_{i}$ is equal to value $V_{ij}$.

After the entropy of the terms has been computed, the heuristic value $\eta_{ij}$ of each term is computed and normalized using the following equation.

$\displaystyle\eta_{ij}=\frac{\log_{2}k-H(W|R_{i}=V_{ij})}{\sum\limits_{i=1}^{a}x_{i}\sum\limits_{j=1}^{b_{i}}\left(\log_{2}k-H(W|R_{i}=V_{ij})\right)}$ (3)

where $x_{i}$ is set to 1 if feature $i$ has not been used in the current partial rule, and is set to 0 otherwise, $k$ is the total number of classes in the training data, and $H(W|R_{i}=V_{ij})$ is the entropy value of $\textit{term}_{ij}$.
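A small Python sketch of Eqs. (2) and (3) is given below, assuming that the features have already been discretized and that each training case is a dictionary from feature name to value. The function names are illustrative and not part of any library.

```python
import math
from collections import Counter, defaultdict


def term_entropy(cases, labels, feature, value):
    """Eq. (2): H(W | R_i = V_ij) in bits, from empirical class frequencies."""
    counts = Counter(y for c, y in zip(cases, labels) if c[feature] == value)
    n = sum(counts.values())
    if n == 0:
        return 0.0  # value never observed; convention chosen for this sketch
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def heuristic_values(cases, labels, used_features=frozenset()):
    """Eq. (3): eta_ij = log2(k) - H_ij, normalised over the unused features."""
    k = len(set(labels))
    domains = defaultdict(set)
    for case in cases:
        for feature, value in case.items():
            domains[feature].add(value)
    raw = {
        (f, v): math.log2(k) - term_entropy(cases, labels, f, v)
        for f, values in domains.items()
        if f not in used_features  # x_i = 0 for used features
        for v in values
    }
    total = sum(raw.values())
    return {term: (r / total if total else 0.0) for term, r in raw.items()}
```

A term whose covered cases all belong to one class has zero entropy and therefore receives the largest unnormalised value, $\log_{2}k$; a term whose cases are spread evenly over the classes receives zero.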

3.1 Pheromone updating

After an ant discovers a rule and the rule is pruned, the pheromone level of each term in the rule is incremented in proportion to the rule's quality. The rule's quality is computed empirically on the set of remaining uncovered training examples using the ROC. ROC analysis is commonly applied in visualizing model performance, in decision analysis, and in model combination. As the performance index of a classifier, the AUC of a ROC can be calculated as follows [13, 14].

$\displaystyle Q=\textit{ROC}=\frac{1+TP-FP}{2}$

where $TP$ and $FP$ denote the true positive rate and the false positive rate of the rule. Both are derived from the counts of true positives, true negatives ($TN$), false positives, and false negatives ($FN$), which are often displayed in a confusion matrix. After the quality of the rule is computed, pheromone is added to each $\textit{term}_{ij}$ in the rule as follows:

$\displaystyle\tau_{ij}(t+1)=\tau_{ij}(t)+\tau_{ij}(t)\times Q$

where $\tau_{ij}(t)$ is the amount of pheromone on $\textit{term}_{ij}$ at time $t$. The rule construction step of the Ant-Miner proceeds as follows.

Ant ${}_{i}$ keeps adding one term at a time to its current partial rule until one of the following two stopping criteria is met.

1. Adding any term $V_{ij}$ would make the rule cover fewer cases than the threshold Max_allowed_uncovered_cases.
2. All features $R_{i}$ have already been used by the ant, so there are no more features to add to the rule antecedent. To avoid invalid rule construction, each feature can occur only once in each rule.

Pheromone reduction of unused terms is performed as in [3].
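The rule quality and pheromone update of Section 3.1 can be sketched as follows. The sketch assumes that $TP$ and $FP$ in the quality formula are rates computed from confusion-matrix counts, and it lowers the pheromone of unused terms by renormalising all values to sum to one, which is the evaporation scheme of Ant-Miner [3]; whether exactly this normalisation was used in the experiments is an assumption.

```python
def rule_quality(tp, fp, tn, fn):
    """Q = (1 + TPR - FPR) / 2, the AUC of a single-point ROC curve."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return (1.0 + tpr - fpr) / 2.0


def update_pheromone(tau, rule_terms, quality):
    """Apply tau_ij(t+1) = tau_ij(t) + tau_ij(t) * Q to the terms of the rule.

    All values are then renormalised to sum to 1, which reduces the
    pheromone of the terms that were not used (assumed evaporation step).
    """
    updated = {
        term: (value + value * quality if term in rule_terms else value)
        for term, value in tau.items()
    }
    total = sum(updated.values())
    return {term: value / total for term, value in updated.items()}
```

For example, a rule with 40 true positives, 10 false negatives, 10 false positives, and 90 true negatives has a true positive rate of 0.8 and a false positive rate of 0.1, giving $Q=0.85$.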

Table 1
Characteristics of the software fault data set D’

Source	Not faulty modules	Faulty modules	Total no. of modules in software	Features	% faulty	Language
CM1	449	49	498	21	9.84	C
KC1	1783	326	2109	21	15.46	C++
KC3	415	43	458	39	9.39	Java
PC2	5566	23	5589	36	0.41	C
PC3	1403	160	1563	37	10.24	C
PC4	1280	178	1458	37	12.21	C
MC2	109	52	161	39	32.3	C++
MW1	372	31	403	37	7.69	C

In the experiments, we used the following parameters for rule generation: number of ants = 3000, minimum cases per rule = 10, and maximum uncovered cases = 10.

To predict whether a module is faulty, the discovered rules are applied in the order in which they were discovered. A new case is assigned the class predicted by the consequent of the first rule that covers it. It is possible that no rule in the list covers the new case; in that situation, the case is assigned the majority class by a default rule.
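A minimal sketch of this prediction step is shown below. The two rules are R1 and R7 from the PC1 rule list reported in Section 6; the module metric values and the choice of "Not Faulty" as the default (majority) class are hypothetical and serve only to illustrate the mechanism.

```python
import operator

# Relational operators that can appear in a rule term.
OPS = {"<=": operator.le, ">": operator.gt, "==": operator.eq}


def classify(module, rule_list, default_class):
    """Return the class of the first rule whose terms all hold, else the default."""
    for terms, predicted in rule_list:
        if all(OPS[op](module[feature], value) for feature, op, value in terms):
            return predicted
    return default_class


# R1 and R7 from the PC1 rule list in Section 6, kept in discovery order.
RULES = [
    ([("PERCENT_COMMENTS", "<=", 18.9), ("LOC_BLANK", "<=", 7.5)], "Not Faulty"),
    ([("BRANCH_COUNT", "<=", 59.0), ("DESIGN_COMPLEXITY", ">", 1.5)], "Faulty"),
]

# Hypothetical module: R1 does not fire (25.0 > 18.9), R7 does.
module = {
    "PERCENT_COMMENTS": 25.0,
    "LOC_BLANK": 3,
    "BRANCH_COUNT": 12,
    "DESIGN_COMPLEXITY": 4,
}
prediction = classify(module, RULES, default_class="Not Faulty")
```

Because the list is ordered, an earlier rule always takes precedence over a later one when both cover the same module.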

4. Software projects used in the study

To evaluate the effectiveness of our approach, we used software fault datasets that originate from the public PROMISE data repository [15]. To investigate whether the type of dataset affects the conclusions drawn from the NASA datasets, we also used five projects of the AEEEM dataset [16]. This dataset was developed in a different setting from the NASA datasets and supports defect prediction at the class level. Features in the AEEEM dataset include change metrics, source code metrics, the entropy of source code metrics, and the churn of source code metrics. The only metric common to the NASA and AEEEM datasets is lines of code. Table 1 shows the statistical information of the software projects used in the first and second experiments. Table 4 shows the statistical information of the AEEEM projects used in experiment 3.

5. Experiment design

To evaluate the performance of the ant colony based miner and to compare it with other benchmark machine learning algorithms, we set up the experimental study as follows.

Table 2
AUC comparison of learning schemes on the D’ dataset

Data set	ACO	NB	J48	Random forest
CM1	0.84	0.70	0.58	0.70
PC1	0.91	0.79	0.67	0.80
KC1	0.85	0.79	0.69	0.79
KC3	0.78	0.68	0.60	0.72
MC2	0.68	0.73	0.52	0.71
MW1	0.88	0.74	0.52	0.69
PC2	0.98	0.87	0.51	0.70
PC3	0.87	0.76	0.64	0.81
PC4	0.88	0.84	0.77	0.91
Avg	0.852	0.766	0.611	0.758

We conducted three experiments to show the effectiveness of the proposed ant-based system. The first uses the data sets shown in Table 1. In it, we performed the proposed ACO based classification and compared the results with those of the benchmark algorithms J48, Naïve Bayes, and Random Forest. Figure 2 shows the proposed model and the implementation steps as a block diagram. The result of experiment 1 is shown in Table 2. Feature selection has been widely used in machine learning tasks to build a model with a small number of features, which can improve classification accuracy. In recent years, a large number of feature selection approaches have been proposed. In experiment 2, we investigated the effect of feature selection by selecting the top five features and passing them to the machine learning algorithms.

Figure 2.

Implementation of the proposed software fault prediction model.

To study the effect of the features, we employed the mRMR (minimum redundancy maximum relevance) feature selection technique [17]. We first extracted the top five features from each of the N sources {s1, s2, …, sn} using mRMR. This method selects features without using any classification algorithm. Given the whole set of features $X$, the maximal relevance criterion selects the subset $S$ of $m$ features that has the largest mean mutual information between the individual features $x_{i}$ and the class $c$:

$\displaystyle\max D(S,c),\quad D=\frac{1}{m}\sum_{x_{i}\in S}I(x_{i};c)$ (4)

The minimal redundancy criterion selects the subset $S$ of $m$ features that has the smallest mean mutual information over all pairs of features:

$\displaystyle\min R(S),\quad R=\frac{1}{m^{2}}\sum_{x_{i},x_{j}\in S}I(x_{i};x_{j})$ (5)

Thus, the complete mRMR feature selection criterion is:

$\displaystyle\text{mRMR}=\max_{S}\left[\frac{1}{m}\sum_{x_{i}\in S}I(x_{i};c)-\frac{1}{m^{2}}\sum_{x_{i},x_{j}\in S}I(x_{i};x_{j})\right]$ (6)

where $S$ is the set of selected features, $m$ is the number of features in $S$, $c$ is the class, and $x_{i}$ and $x_{j}$ are individual features.

The first term of Eq. (6) is the mean mutual information between each feature $x_{i}$ and the class, and the second term is the mean mutual information over all pairs of features $x_{i}$ and $x_{j}$.
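The relevance-minus-redundancy criterion can be illustrated with a short Python sketch. Searching all subsets is expensive, so the sketch uses the common greedy, incremental form of mRMR, in which one feature is added at a time; this search strategy, the assumption of discrete (already discretized) feature values, and the function names are choices made for the example and are not taken from the experiments.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equally long sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, target, m=5):
    """Greedy mRMR: repeatedly add the feature with the largest
    relevance I(x; c) minus mean redundancy with the features chosen so far.

    columns maps feature name -> list of values; target is the class list.
    """
    relevance = {f: mutual_information(col, target) for f, col in columns.items()}
    selected = []
    while len(selected) < min(m, len(columns)):

        def score(feature):
            if not selected:
                return relevance[feature]
            redundancy = sum(
                mutual_information(columns[feature], columns[s]) for s in selected
            ) / len(selected)
            return relevance[feature] - redundancy

        best = max((f for f in columns if f not in selected), key=score)
        selected.append(best)
    return selected
```

With `m=5` the function returns five feature names, matching the "top five features" setting used in experiment 2; a duplicated feature is penalised by its redundancy term and is therefore picked late.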

The top five features are selected from the D’ dataset, and a comparative study is performed to investigate the effect of feature selection on classification. Table 3 shows the result of the comparative study with the top five selected features. The goal of our approach is to build a high-quality training set from the original dataset that contains the most important features, with the objective of improving the performance of the classification models used for SFP.

Finally, in the third experiment, we applied the proposed approach to the AEEEM data set, which consists of change metrics, source code metrics, the entropy of source code metrics, and the churn of source code metrics.

The implementation of the proposed model is illustrated in Fig. 2, where 10-fold cross-validation is used to evaluate the performance of the algorithms.

Given a training software fault dataset $S=(X,Y)$, let $X$ denote the instances along with their features and $Y$ denote the corresponding faulty and not faulty labels. We randomly divide the members of the dataset into $n$ subsets for an $n$-fold cross-validation.

$\displaystyle X=\bigcup_{i=1}^{n}X_{i}$

such that $X_{i}\cap X_{j}=\emptyset\ \forall i\neq j$. We then use one of the subsets, $X_{\textit{test}}=X_{k}$, as the testing data and the union of the remaining subsets as the training data, $X_{TR}=\bigcup_{i=1,\,i\neq k}^{n}X_{i}$. Now we divide $X_{TR}$ into $c$ subsets $X_{TR,i}$, $i=1\ldots c$, such that $X_{TR}=\bigcup_{i=1}^{c}X_{TR,i}$ and $X_{TR,i}\cap X_{TR,j}=\emptyset\ \forall i\neq j$. Then we perform ACO to learn from the training data. After learning has been performed, testing takes place on the $X_{\textit{test}}$ subset.
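The partition into disjoint folds can be sketched in Python as below. The function names and the fixed seed are illustrative; the sketch performs a plain random split and does not stratify by class, which is an assumption, since the text does not say whether stratification was used.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Randomly partition range(n_samples) into n_folds disjoint subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every n_folds-th index gives folds whose sizes differ by at most 1.
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once as X_test."""
    folds = k_fold_indices(n_samples, n_folds, seed)
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

For CM1, for instance, `train_test_splits(498, 10)` yields ten splits in which every module appears in exactly one test fold.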

The detailed process, which also covers the benchmark learning algorithms, is described with pseudo code in the following procedure (Fig. 3).

Figure 3.

Software fault prediction model evaluation.

To obtain a fairer and more informative measure than the misclassification rate, the area under the receiver operating characteristic curve (AUC) is used in this study. Receiver Operating Characteristic (ROC) curves compare classification performance by plotting the TP rate on the y axis against the FP rate on the x axis across all possible decision thresholds. A typical ROC curve has a concave shape with (0, 0) as the beginning and (1, 1) as the endpoint. The ideal point on the ROC curve is (0, 1), where no positive example is classified incorrectly and every negative example is classified as negative. Each model yields its own value for the area under the curve. AUC gives a complete ordering of model performance and is independent of the decision criterion selected and of the prior probabilities. The AUC comparison can establish a dominance relationship between classifiers [18]. The bigger the area under the curve, the better the model. As opposed to other measures, the area under the ROC curve does not depend on the imbalance of the training set [19].
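For illustration, AUC can be computed without plotting the curve by using its rank interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a direct, quadratic-time implementation of that identity; the function name and the score representation are assumptions, and it is not the evaluation code used in the experiments.

```python
def auc(scores, labels, positive=1):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because the value depends only on the ordering of scores within positive/negative pairs, it is unaffected by the ratio of faulty to not faulty modules in the test set.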

All results of the comparison were obtained on an Intel Core i7 with a clock rate of 3.4 GHz and 4 GB of main memory. Figure 3 shows the pseudo code for the implementation and evaluation of the software fault prediction model.

6. Result analysis

In this section, we present the experimental results in terms of the area under the receiver operating characteristic (ROC) curve. Table 2 records the results for nine projects. Ant-Miner outperforms Naïve Bayes on eight of the nine data sets, J48 on all nine, and Random Forest on seven of the nine. For CM1, KC1, KC3, MW1, PC1, PC2, and PC3, the ant-based classification algorithm has the best AUC of all the algorithms compared. From Table 2, it can also be observed that ACO has the highest average AUC on the D’ dataset.
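The win counts quoted above can be rechecked directly from Table 2. The following sketch copies the AUC values from that table and counts, for each benchmark, the data sets on which ACO has the strictly higher AUC; the variable and function names are illustrative.

```python
# AUC values copied from Table 2: (ACO, NB, J48, Random forest).
TABLE_2 = {
    "CM1": (0.84, 0.70, 0.58, 0.70),
    "PC1": (0.91, 0.79, 0.67, 0.80),
    "KC1": (0.85, 0.79, 0.69, 0.79),
    "KC3": (0.78, 0.68, 0.60, 0.72),
    "MC2": (0.68, 0.73, 0.52, 0.71),
    "MW1": (0.88, 0.74, 0.52, 0.69),
    "PC2": (0.98, 0.87, 0.51, 0.70),
    "PC3": (0.87, 0.76, 0.64, 0.81),
    "PC4": (0.88, 0.84, 0.77, 0.91),
}
COLUMNS = {"ACO": 0, "NB": 1, "J48": 2, "Random forest": 3}


def aco_wins(benchmark):
    """Number of data sets on which ACO has a strictly higher AUC."""
    col = COLUMNS[benchmark]
    return sum(1 for row in TABLE_2.values() if row[0] > row[col])


def average_auc(learner):
    """Mean AUC of one learner over the nine data sets."""
    col = COLUMNS[learner]
    return sum(row[col] for row in TABLE_2.values()) / len(TABLE_2)


for name in ("NB", "J48", "Random forest"):
    print(name, aco_wins(name))  # prints 8, 9 and 7 respectively
```

The two data sets on which ACO does not have the highest AUC are MC2 (behind Naïve Bayes and Random Forest) and PC4 (behind Random Forest).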

Let us now look at the type of rules extracted by the ant-based system in a typical run for the PC1 data set.

R1: IF PERCENT_COMMENTS $\leqslant$ 18.9 AND LOC_BLANK $\leqslant$ 7.5 THEN Not Faulty
R2: IF NUM_OPERATORS $\leqslant$ 226.5 AND NORMALIZED_CYLOMATIC_COMPLEXITY $>$ 0.075 AND PERCENT_COMMENTS $>$ 3.795 AND DESIGN_DENSITY $>$ 0.26 THEN Not Faulty
R3: IF LOC_CODE_AND_COMMENT $\leqslant$ 8.0 AND LOC_BLANK $>$ 9.5 AND LOC_COMMENTS $\leqslant$ 38.5 THEN Not Faulty
R4: IF LOC_TOTAL $>$ 63.5 AND ESSENTIAL_COMPLEXITY $\leqslant$ 12.5 AND HALSTEAD_EFFORT $\leqslant$ 214748.3647 THEN Faulty
R5: IF CONDITION_COUNT $>$ 13.0 AND ESSENTIAL_COMPLEXITY $\leqslant$ 18.5 THEN Not Faulty
R6: IF CYCLOMATIC_COMPLEXITY $\leqslant$ 2.5 AND NORMALIZED_CYLOMATIC_COMPLEXITY $>$ 0.05 THEN Not Faulty
R7: IF BRANCH_COUNT $\leqslant$ 59.0 AND DESIGN_COMPLEXITY $>$ 1.5 THEN Faulty

The framework provides a set of intuitive and comprehensible rules for the user and can provide insights into the process.

Figure 4.
Comparative analysis using radar chart.

From the comparative analysis, it can be seen that ACO outperforms the other benchmark algorithms and also provides comprehensible rules. Figure 4 shows a radar chart that gives a better view of the overall performance of the proposed model.

The results of experiment 2, with the five best features selected by the mRMR (minimum redundancy maximum relevance) feature selection technique, are shown in Table 3.

Table 3
AUC comparison of learning schemes on the D’ dataset with the five best features

Data set	ACO	NB	J48	Random forest
CM1	0.85	0.73	0.52	0.75
PC1	0.92	0.81	0.61	0.83
KC1	0.85	0.77	0.75	0.76
KC3	0.78	0.7	0.64	0.75
MC2	0.7	0.76	0.62	0.75
MW1	0.88	0.77	0.51	0.69
PC2	0.98	0.78	0.58	0.84
PC3	0.87	0.74	0.67	0.84
PC4	0.88	0.83	0.85	0.9
Avg	0.86	0.77	0.64	0.79

It can be seen that Ant-Miner outperforms Naïve Bayes in eight of the nine cases. Comparing columns 2 and 4, the ACO rules perform better than the J48 algorithm in 9 out of 9 cases. Comparing columns 2 and 5, the values in Table 3 show the ACO rules performing better than the Random Forest algorithm in 7 out of 9 cases (MC2 and PC4 are the exceptions). From Table 3, it can be observed that ACO has the highest average AUC on the D’ dataset. Comparing the averages of Tables 2 and 3, the average AUC of ACO improves slightly with the five selected features (0.852 to 0.86), and the averages of the three benchmark algorithms are also at least as high (Naïve Bayes 0.766 to 0.77, J48 0.611 to 0.64, Random Forest 0.758 to 0.79).

To show the effectiveness of the proposed algorithm, experiment 3 is performed on the AEEEM dataset. The properties of the AEEEM dataset are shown in Table 4. The results of experiment 3, on data sets whose metric set differs from that of the NASA projects, are shown in Table 5.

Table 4
Projects in the AEEEM dataset

Project	Number of defective instances	Number of defect-free instances	% of defective instances
EQ	129	195	39.81%
JDT	206	791	20.66%
LC	64	627	9.26%
ML	245	1617	13.16%
PDE	209	1288	13.96%

Table 5
AUC comparison of learning schemes on the AEEEM dataset

Data set	ACO	NB	J48	Random forest
EQ	0.75	0.69	0.58	0.67
JDT	0.90	0.79	0.67	0.81
LC	0.74	0.67	0.59	0.68
ML	0.72	0.68	0.60	0.72
PDE	0.66	0.74	0.60	0.69
Avg	0.82	0.737	0.596	0.737

It can be seen from the values in Table 5 that Ant-Miner outperforms Naïve Bayes on four of the five projects (PDE is the exception). Comparing columns 2 and 4, the ACO rules perform better than the J48 algorithm in 5 out of 5 cases. Comparing columns 2 and 5, the ACO rules perform better than the Random Forest algorithm in 3 out of 5 cases. From Table 5, it can be observed that ACO has the highest average AUC on the AEEEM dataset. This indicates that the ACO based learner performs well not only on the nine NASA data sets but also on the AEEEM dataset.
7. Conclusion and future work

In this paper, software fault data are investigated using an ACO based classification algorithm in which rule quality is measured with ROC. Lessmann et al. [10] argued that fault prediction techniques should not be assessed by predictive performance alone. The other factors to be considered are computational complexity, ease of use, and, more importantly, comprehensibility of the predictor. The proposed system not only performs well in terms of AUC but also satisfies these other properties, especially comprehensibility. The effectiveness of the method is demonstrated using data sets from the PROMISE repository and the AEEEM datasets. To generate the best rule base, we used AUC for rule quality updating. The proposed method selects the best software metrics and generates rules from these important metrics to achieve better performance. The results of our experiments show that the proposed ACO rule-based classification approach achieved better average performance than C4.5 (J48), the random forest learner, and the Naive Bayes classifier. Also, our model is a better choice than opaque models for gaining insight into the factors that drive software faults. The approach provides intuitive rules that can be used as a software quality system, giving system designers and developers a set of rules to follow in design and development. In future work, we plan to modify the algorithm so that it generates more interpretable rule sets for the categorization of faulty software modules.

Acknowledgments

This research is partially supported by the Chhattisgarh Council of Science and Technology (CGCOST) under Grant 8068/CCOST. The author would like to acknowledge the support of the funding organizations.

References

1. Erturk and Sezer E.A., A comparison of some soft computing methods for software fault prediction, Expert Syst. Appl. 42(4) (2015), 1872–1879.
2. Jin and Jin S.-W., Prediction approach of software fault-proneness based on hybrid artificial neural network and quantum particle swarm optimization, Appl. Soft Comput. 35 (2015), 717–725.
3. Parpinelli R.S., Lopes H.S. and Freitas A.A., Data mining with an ant colony optimization algorithm, IEEE Trans. Evol. Comput. 6(4) (2002), 321–332.
4. Catal, Sevim and Diri, Practical development of an Eclipse-based software fault prediction tool using Naive Bayes algorithm, Expert Syst. Appl. 38(3) (2011), 2347–2353.
5. Czibula, Marian and Czibula I.G., Software defect prediction using relational association rule mining, Inf. Sci. (Ny). 264 (2014), 260–278.
6. Abaei, Selamat and Fujita, An empirical study based on semi-supervised hybrid self-organizing map for software fault prediction, Knowledge-Based Syst. 74 (2015), 28–39.
7. Shao, Liu, Wang and Li, A novel software defect prediction based on atomic class-association rule mining, Expert Syst. Appl. 114 (2017), 237–254.
8. Martens, De Backer, Haesen, Vanthienen, Snoeck and Baesens, Classification with ant colony optimization, IEEE Trans. Evol. Comput. 11(5) (2007), 651–665.
9. Otero F.E.B., Freitas A.A. and Johnson C.G., cAnt-miner: An ant colony classification algorithm to cope with continuous attributes, Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 5217 LNCS, 2008, 48–59.
10. Lessmann, Baesens, Mues and Pietsch, Benchmarking classification models for software defect prediction: A proposed framework and novel findings, IEEE Trans. Softw. Eng. 34(4) (2008), 485–496.
11. Singh, Pal N.R., Verma and Vyas O.P., Fuzzy rule-based approach for software fault prediction, IEEE Trans. Syst. Man, Cybern. Syst. 47(5) (2017), 1–12.
12. Singh and Verma, An Investigation of the Effect of Discretization on Defect Prediction Using Static Measures, in: 2009 Int. Conf. Adv. Comput. Control. Telecommun. Technol., 2009, pp. 837–839.
13. Park and Pedrycz, The design of polynomial function-based neural network predictors for detection of software defects, 229 (2013), 40–57.
14. Hong, Chen and Harris C.J., A Kernel-Based Two-Class Classifier for Imbalanced Data Sets, 18(1) (2007), 28–41.
15. Sayyad Shirabad and Menzies T.J., The PROMISE Repository of Software Engineering Databases, School of Information Technology and Engineering, University of Ottawa, Canada, Available: http://promise.site.uottawa.ca/SERepository, 2007.
16. D'Ambros M., Lanza and Robbes, Evaluating defect prediction approaches: a benchmark and an extensive comparison, 2012.
17. Ding and Peng, Minimum redundancy feature selection from microarray gene expression data, 3(2) (2003), 523–528.
18. Lee S.S., Noisy replication in skewed binary classification, Comput. Stat. Data Anal. 34 (2000), 165–191.
19. Kolcz J.A. and Chowdhury, Data duplication: an imbalance problem? Work. Learn. from Imbalanced Data Sets ICML, 2003.
