Improved software cost estimation models: A new perspective based on evolution in Dynamic Environment

Abstract

Software cost estimation is the process of predicting the most realistic amount of effort necessary for the development of a software product. It is a difficult task because many factors affect the estimation process. Many cost estimation models have been developed in the literature over more than a decade to improve the accuracy of software project cost estimates. However, these models are found to be inefficient at estimating the exact cost of software development because of the uncertainties and lack of accuracy associated with them. In this paper, the models of Alla F. Sheta, which are modified versions of Boehm’s well-known COCOMO model, are taken for optimization. The parameters of the Sheta models are tuned by the proposed method to estimate and minimize the consequences of the different factors that affect the overall software development cost. The experimental work has been carried out in the MATLAB environment, and the results are analyzed on the basis of Magnitude of Relative Error (MRE), Prediction (PRED) at 0.25, Value Accounted For (VAF) and Mean Magnitude of Relative Error (MMRE). The estimation accuracy of the proposed work is tested on the NASA software project dataset. The proposed method shows good estimation capabilities compared with other state-of-the-art cost estimation models.

Keywords

Software cost estimation; EAMD; COCOMO model; NASA dataset; natural phenomena

1 Introduction

During software development, the objective of a software cost estimation model is to accurately estimate the cost, time, effort and staff expertise needed for the project at an early stage of the software development life cycle. Accurate estimation is very important because underestimation or overestimation may negatively affect the overall development process. Underestimating the cost may affect the quality and performance of the software product, while overestimation may cause misuse of funds.

Cost estimation techniques are categorized into algorithmic and non-algorithmic techniques. The Constructive Cost Model (COCOMO) [1, 2] and software life cycle management [3] come under algorithmic techniques.

The COCOMO model is a mathematical relationship among software development time and effort as a function of program size and maintenance effort [4, 5].

Estimation of program size is expressed in kilo lines of code (KLOC). A basic regression approach is applied with parameters in the COCOMO model. These parameters are obtained from the statistical analysis of historical project datasets and current project features.

The non-algorithmic techniques include expert judgment, price-to-win and soft computing techniques. Fuzzy logic, neural networks and evolutionary computation come under soft computing techniques. Fuzzy logic and neural networks are widely used to build effective cost estimation models [6, 7]. Evolutionary computation techniques are also applicable to software cost estimation, and different models based on these techniques have been introduced with inherent novelty [8].

Several cost estimation techniques have been developed in recent years, yet variation in software requirements makes estimation more challenging for software development. Hence, the need for precise, valid and reliable cost estimation is an ongoing challenge in software engineering.

In general, the objective of a software cost estimation technique is to obtain a clear estimate of the cost needed for software development. The basic input for software cost estimation is the size of the software to be developed, which is then converted into the effort needed for its development.

The effort is measured in person-months (PM), which can easily be converted into software cost. In the literature, several cost estimation models are available that show the relationship between size and effort.

Project managers track the progress of the project and ensure better utilization of the available resources. In this scenario, effort estimation has prime importance among all available cost drivers, and it is affected by the Developed Line of Code (DLOC). DLOC includes all formal statements and program instructions.

In this paper, the Environmental Adaption Method for Dynamic Environment (EAMD) [10] is used to optimize the parameters of the Sheta models [9] by providing a generalized optimal value for each parameter. These parameters are optimized in such a way that accurate cost estimation of different software projects can be achieved.

In our work, a new architectural framework is proposed for cost estimation. This architecture applies the Magnitude of Relative Error (MRE) and the Mean Magnitude of Relative Error (MMRE) with EAMD to assign a quality value to each parameter of the Sheta models. This improves the overall performance of the existing COCOMO model, makes it more reliable, and reduces the negative effect of noise in the measurement of software cost.

The rest of the paper is organized as follows. Section 2 reviews the related work on software cost estimation. Background details are provided in Section 3. Section 4 explains the work that has been done to improve the estimation accuracy for different types of projects. Section 5 presents the result analysis, and the paper is concluded in Section 6.

2 Related work

Many models have been developed over more than a decade to estimate the cost of software as requirements change. Several assessment techniques have been applied to approximate factors such as effort in man-months, time and the expertise needed for software to become functional. A lot of research has been done, and many techniques have already been explored to solve estimation problems. This section discusses some of the most relevant and useful work on cost/effort estimation.

Pedrycz et al. [11] used fuzzy sets to develop models for software cost estimation. They introduced granular models of cost estimation and also proposed an augmentation of the well-known class of COCOMO cost estimation models.

A software cost estimation model using Genetic Programming (GP) was proposed by Sheta et al. in 2010 [12]. This model accounts for the effect of both the developed lines of code and the methodology used throughout development. They applied this approach to estimate the effort for some NASA software projects and obtained good results. A comparison of the proposed GP-based model with well-known models in the literature shows its superiority over them.

A hybrid approach combining Ridge Regression (RR) with a Genetic Algorithm (GA) was proposed by Papatheocharous et al. [13] in 2010. They applied their hybrid cost model to the ISBSG dataset of software project samples, and the output shows that the accuracy of the results may be improved by eliminating repeated attributes.

Sheta [9] established two new models for estimating the parameters of the COCOMO model using a genetic algorithm. He also provided an improved version of the very popular COCOMO model to examine the effect of the adopted software development methodology on effort computation, and he estimated the effort necessary for the development of the software projects.

Borte et al. [14] characterized software effort estimation as a collective accomplishment. Their paper also examines how software effort estimation is performed through complex sense-making actions rather than by relying on assumed information.

In 2010, Basha et al. [15] published their Empirical Software Effort Estimation (ESEE) model. Their study of ESEE concludes that no single technique is best for all situations, so a careful comparison of the results generated by all techniques is required to produce valid estimates.

Attarzadeh et al. [16] introduced a fuzzy logic based effort estimation model. Their approach applies fuzzy logic to software effort estimation, which helps to shorten the lengthy estimation process generally needed in conventional estimation techniques.

Kumar et al. [17] applied fuzzy estimation theory and neural networks to software engineering project management and control. For this they used the Manpower Build-up Index (MBI) estimation model. The selection process in the MBI estimation model is based on 64 different Fuzzy Associative Memory (FAM) rules. Fuzzy estimation theory is used to model three fuzzy parameters, namely inverse application complexity (IAC), task concurrency (TC) and schedule pressure (SP), in estimating the MBI. They also show how fuzzy FAM works in software project management and in the estimation of the MBI.

In 2008, Idri et al. [18] proposed software cost estimation models using a radial basis function neural network. They addressed imprecision and uncertainty as the main issues in their work. They noted that estimating software development effort remains a complex problem that continues to attract considerable research attention. Improving the accuracy of the effort estimation models available to project managers would facilitate more effective control of time and budgets during software development.

3 Background details

3.1 Constructive Cost Estimation Model (COCOMO)

The COCOMO model is a well-known and widely used cost and schedule estimation model proposed by Dr. Barry W. Boehm in 1981 [19, 20]. This model is based on the analysis of 63 software projects from different domains during the 1970s and early 1980s. In COCOMO, effort is expressed in man-months (MM) or person-months, and software projects are classified into three categories, namely organic, semi-detached and embedded, based on the complexity of the project. A common approach to estimating the software cost (effort) with the COCOMO model is given in Equation 1:

$Effort = a * {(size)}^{b}$ (1)

In Equation 1, a and b are constants whose values are determined by regression analysis applied to historical and current software projects. The values of a and b vary for different types of projects (organic, semi-detached and embedded). These parameters are optimized through different algorithmic and non-algorithmic techniques. The COCOMO model is easy to understand, unlike other models such as SLIM [21] and SEER-SEM [20]. However, the COCOMO model has some limitations, as listed below:

The attributes used to predict software development effort, and the relationships among them, are time dependent and differ across software development environments [22].

Accurate software size estimation is a problem: size in terms of source lines of code (SLOC), number of user screens, interfaces, complexity and so on is required by existing models at a very early stage of the development process, when uncertainty is highest [23].

The model is unable to handle data specified categorically by a range of values and, most importantly, it lacks logical reasoning capabilities and the ability to draw conclusions or make judgments based on recently available data [22, 23].

Due to these limitations, many researchers have explored non-algorithmic techniques to build efficient and valid cost estimation models.

3.2 Estimation of COCOMO model parameters by Sheta models

Sheta proposed evolutionary models using a Genetic Algorithm (GA) for estimating software effort [9]. He applied the GA to estimate the parameters of the COCOMO effort estimation model. The performance of these models has been tested on a dataset of 18 software projects taken from NASA [27]. He provided a new estimate for the parameters used in the basic COCOMO model given in Equation 1. The parameters are estimated in such a way that a generalized computation of the development effort for all projects can be obtained. The details of the changes to the COCOMO model are discussed in Section 4.3.

3.3 Environmental Adaption Method for Dynamic Environment (EAMD)

EAMD is a population-based, nature-inspired, randomized algorithm that works on real-valued parameters [10, 24]. It is an improved version of the Environmental Adaption Method (EAM) [25, 26]. In EAMD, unlike in EAM, the environment of the search space is highly dynamic. The environment in EAM is bounded by the average fitness of the population, whereas EAMD uses an “Environmental Window” that is fully dynamic and directly governed by environmental changes; it represents the strength of the environment that supports life. Over a few generations, as the environment becomes tougher due to environmental constraints, the window size decreases gradually, which makes the environment dynamic.

The environmental window represents an environment in which its inhabitants must survive. The nature of the environment depends on the size of the window: if the window is large, species can survive easily, and as the window size decreases gradually, the environment becomes tougher for its species. To survive, individuals change their phenotypic structure in response to environmental changes and gain better fitness over time. Individuals that are not able to adapt to the changes no longer survive and are eliminated from the environment.

EAMD has two operators, adaption and selection. Adaption is applied first to the set of solutions; it improves their fitness and forms a new intermediate set of improved solutions. Adaption improves the fitness of solutions within the range provided by the environmental window (adaption window). Selection is applied to the union of the initial and the intermediate improved solutions. This merged set is sorted, and the best ‘n’ solutions are selected for the next generation. The process continues over successive generations until either the optimal solution is found or the maximum number of fitness evaluations is reached. The working model of EAMD is shown in Fig. 1.

Fig.1

Working model of EAMD [24].
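The selection operator described above (merge, sort, keep the best n) can be sketched in a few lines of Python. This is only a minimal illustration, assuming that lower fitness is better; the names `select_best` and `fitness` are hypothetical, and the adaption operator is deliberately not reproduced because its update rule is defined in [10, 24] rather than in this section.

```python
def select_best(initial, adapted, fitness, n):
    """Merge both solution sets and keep the n solutions with the lowest fitness."""
    merged = list(initial) + list(adapted)
    # sorted() is stable, so solutions with equal fitness keep their merge order.
    return sorted(merged, key=fitness)[:n]


# Toy usage: one-dimensional solutions, fitness is the distance from zero.
survivors = select_best([[3.0], [1.0]], [[2.0], [0.5]], lambda s: abs(s[0]), n=2)
print(survivors)  # [[0.5], [1.0]]
```

Keeping the initial solutions in the merged pool means that a good solution is never lost when adaption produces a worse one.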

3.4 Software cost/effort estimation

Software cost/effort estimation is a challenging task for the software development process and for software project management. Many models have been suggested for calculating the total effort, e.g. COCOMO I, COCOMO II and the Sheta models. These models use adjustable parameters whose values may significantly affect the accuracy of the effort estimation. Software cost includes many factors such as labour cost (effort), hardware cost, project category, language, storage constraints and software tools, but effort is the most significant and dominant of these. Effort is the amount of labour (persons) required to complete a software project.

3.4.1 Objective

The objectives of a software project estimation technique are as follows:

To estimate the effort required to complete a project successfully.

To calculate the total cost of a project.

3.4.2 Characteristics of a good software cost estimation technique

The characteristics necessary for a good software cost estimation technique are as follows:

It requires the support and acceptance of the project manager, the development team and the stakeholders.

It should rely on a good software cost model.

It should be based on relevant historical projects that help to build a cost-effective estimation technique.

It should be well defined so that the necessary action can be taken to minimize the risks that arise during cost estimation.

3.4.3 Challenges

The most significant challenges that software cost/effort estimation techniques face in providing a valid and authentic estimate for a software project are as follows:

Validity and uncertainty of data.

Limited time to prepare the estimate.

Proper use of resources.

Ensuring that the data is authentic, valid and unambiguous.

Completing the estimation on time, because the time available for estimation is very limited and any delay may affect the overall development process.

3.4.4 Solving approach

An accurate estimate of the cost/effort of software is highly desirable, but no statistical model is available that can accurately predict the cost/effort of software development, owing to the uncertainties and imprecision associated with software projects. Sheta has noted in his work that estimating software cost/effort is a kind of optimization problem, and he used a genetic algorithm to estimate it.

Many researchers have explored nature-inspired optimization techniques such as evolutionary computation, swarm intelligence and differential evolution to achieve better estimation models. The exploration and exploitation capabilities of these techniques help to improve the accuracy of cost/effort estimation, which motivates researchers to improve the performance of existing models. EAMD is one such optimization technique; it generates solutions randomly and achieves a high level of accuracy.

4 Proposed work

The parameters (a, b, c, d) of the models proposed by Sheta over the basic COCOMO model are optimized by EAMD to obtain an accurate estimate of the effort. The experiments have been carried out on the dataset of 18 NASA software projects [27]. Three attributes of the dataset are considered in the experiment: Kilo Developed Line of Code (KDLOC), Methodology (M) and the Measured Effort.

4.1 Fitness function

The fitness function is an evaluation criterion used to measure the performance of the proposed algorithm. In other words, it is an objective function used to identify the optimal solution among the feasible solutions obtained for the problem. The fitness function is vital in optimization problems for deciding the validity and effectiveness of the proposed algorithm.

In the proposed work, the Mean Magnitude of Relative Error (MMRE) [9, 32] is taken as the fitness function to evaluate the performance of the cost estimation models. EAMD is used to obtain the optimal values of the parameters a, b, c and d of the models proposed by Sheta, i.e. the values for which MRE and MMRE are minimized compared with the existing models. We have also measured the value accounted for (VAF) [9] and the prediction at level L (PRED (L)) [33] to check the accuracy of the proposed work.

4.2 Evaluation technique

Relative Error (RE) is a ratio measure that expresses the prediction error per unit of measured effort. It thereby takes project size into account, since larger projects tend to have larger absolute errors. Measures such as MRE [9, 33], MMRE and PRED [33] are based on the relative error (RE) [9, 33]. The relative error and the measures derived from it are calculated as follows:

$RE = \frac{\text{measured effort} - \text{estimated effort}}{\text{measured effort}}$ (2)

Magnitude of Relative Error (MRE) is computed by taking the absolute value of the relative error i.e.

${MRE}_{i} = \left| \frac{{\text{measured effort}}_{i} - {\text{estimated effort}}_{i}}{{\text{measured effort}}_{i}} \right|$ (3)

Mean magnitude of relative error (MMRE) is the average of MRE over n observations and can be calculated as follows:

$MMRE = \frac{1}{n} \sum_{i = 1}^{n} {MRE}_{i}$ (4)

The lower the value of MMRE, the better the quality of the cost estimation technique.
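Equations 3 and 4 translate directly into code. The following Python sketch is illustrative only; the function names `mre` and `mmre` are hypothetical, and the sample values are the measured effort of project 1 (Table 1) and the estimate reported for the proposed method (Table 5).

```python
def mre(measured, estimated):
    """Magnitude of relative error for one project (Equation 3)."""
    return abs(measured - estimated) / measured


def mmre(measured, estimated):
    """Mean MRE over paired lists of measured and estimated efforts (Equation 4)."""
    errors = [mre(m, e) for m, e in zip(measured, estimated)]
    return sum(errors) / len(errors)


# Project 1: measured effort 115.8, estimated effort 136.3328.
print(round(mre(115.8, 136.3328), 4))  # 0.1773
```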

Prediction at some level (PRED) is used to measure the performance of an estimation technique. PRED at level L is defined as follows:

$PRED (L) = k / n$ (5)

Here k is the number of projects whose MRE is less than or equal to L, and n is the total number of projects. The value of L is 0.25, which is the standard level used for PRED. The higher the value of PRED, the better the quality of the estimation method. VAF [9] is another evaluation measure used to verify the estimation technique. In VAF, the measured and estimated values are used to assess the accuracy of the estimation technique. The VAF is calculated as follows:

$VAF = \left[ 1 - \frac{var (\text{measured effort} - \text{estimated effort})}{var (\text{measured effort})} \right] \times 100$ (6)

Variance (var) in the above equation is calculated as:

$var (x) = \frac{1}{n} \sum_{i = 1}^{n} {(x_{i})}^{2} - {(\frac{1}{n} \sum_{i = 1}^{n} x_{i})}^{2}$ (7)

In Equation 7, ‘x’ stands for the quantity whose variance is required in Equation 6 (either the measured effort or the difference between the measured and estimated effort), and ‘n’ denotes the total number of values of x.
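A short Python sketch of Equations 5 to 7 may make the definitions concrete. It assumes that k in Equation 5 counts the projects with MRE less than or equal to L, which is the usual reading of PRED; the function names and the toy data are hypothetical.

```python
def variance(values):
    """Variance as in Equation 7: mean of the squares minus the squared mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(v * v for v in values) / n - mean * mean


def pred(measured, estimated, level=0.25):
    """Fraction k/n of projects whose MRE does not exceed `level` (Equation 5)."""
    hits = sum(abs(m - e) / m <= level for m, e in zip(measured, estimated))
    return hits / len(measured)


def vaf(measured, estimated):
    """Value accounted for, in percent (Equation 6)."""
    residuals = [m - e for m, e in zip(measured, estimated)]
    return (1.0 - variance(residuals) / variance(measured)) * 100.0


# Toy data: the four MRE values are 0.1, 0.2, 0.3 and 1.0.
measured = [100.0, 100.0, 200.0, 400.0]
estimated = [110.0, 120.0, 260.0, 800.0]
print(pred(measured, estimated))  # 0.5
```

A perfect estimate leaves zero residual variance, so `vaf` returns 100 in that case.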

4.3 Existing models for optimization

In the late 1970s, Boehm proposed the Constructive Cost Model (COCOMO). This model is classified and characterized by the type of project to be handled. It includes three categories of project, i.e. organic, semi-detached and embedded [4, 31].

In recent years, many researchers have worked to modify and tune Boehm’s model to gain more accuracy in the cost estimation of software projects. Sheta is one of them; he used a Genetic Algorithm to develop a generalized form of Boehm’s model to compute the effort needed for all types of projects.

Sheta proposed new estimated values for the parameters and also added new parameters to Boehm’s basic model of cost estimation (see Equation 1). The cost estimation models proposed and tuned by Sheta are as follows:

Model 1: In this model, only the values of parameters a and b are optimized by Sheta.

$EE = a {(DLOC)}^{b}$ (8)

Model 2: This model includes one new parameter c and a new term, the methodology (M), taken from the NASA dataset.

$EE = a {(DLOC)}^{b} + c (M)$ (9)

In the second model, the methodology (M) is considered as an element that contributes to the computation of the software development effort. This model is based on a linear model structure. The prediction quality is improved by adding the effect of the methodology (M) to the basic COCOMO model.

In the above model, a, b and c are the three parameters optimized by Genetic Algorithms (GAs) to provide an accurate estimate of the development effort required for the software projects. In this model, M is linearly related to the effort. The model is further improved by adding a bias term, as is common in regression models, to stabilize it. This also helps to minimize the effect of noise in the effort measurements. The previous model is re-estimated by the new model given below:

Model 3: This model adds one bias parameter d to Equation 9.

$EE = a {(DLOC)}^{b} + c (M) + d$ (10)

In the above equations, EE and DLOC stand for the estimated effort and the developed lines of code (size), respectively.
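The three models differ only in which terms are present, so they can be expressed as one function. The sketch below is illustrative; `estimate_effort` is a hypothetical name and the parameter values in the usage lines are arbitrary, not fitted values. Note that the NASA dataset used later lists size in KDLOC.

```python
def estimate_effort(size, methodology=0.0, a=1.0, b=1.0, c=0.0, d=0.0):
    """Estimated effort EE = a * size**b + c * M + d.

    Leaving c and d at zero gives Model 1 (Equation 8), leaving only d at
    zero gives Model 2 (Equation 9), and setting all four parameters gives
    Model 3 (Equation 10).
    """
    return a * size ** b + c * methodology + d


# Arbitrary illustrative parameter values.
print(estimate_effort(4.0, a=3.0, b=0.5))                                   # 6.0
print(estimate_effort(4.0, methodology=20.0, a=3.0, b=0.5, c=0.5))          # 16.0
print(estimate_effort(4.0, methodology=20.0, a=3.0, b=0.5, c=0.5, d=1.0))   # 17.0
```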

4.4 Software cost estimation using EAMD

The objective of the proposed work is to find generalized optimal values of all parameters and to provide the highest accuracy in estimating the effort required for all types of software projects. The experiments have been conducted on the dataset shown in Table 1. This table shows the measured effort together with the corresponding lines of code and methodology for 18 software projects. Table 2 shows the terms used in the proposed work, which have already been used by Sheta [9].

Table 1
NASA Data for 18 software projects [27]

Project No.	KDLOC	Methodology (M)	Measured Effort
1.	90.2000	30.0000	115.8000
2.	46.2000	20.0000	96.0000
3.	46.5000	19.0000	79.0000
4.	54.5000	20.0000	90.8000
5.	31.1000	35.0000	39.6000
6.	67.5000	29.0000	98.4000
7.	12.8000	26.0000	18.9000
8.	10.5000	34.0000	10.3000
9.	21.5000	31.0000	28.5000
10.	3.1000	26.0000	7.0000
11.	4.2000	19.0000	9.0000
12.	7.8000	31.0000	7.3000
13.	2.1000	28.0000	5.0000
14.	5.0000	29.0000	8.4000
15.	78.6000	35.0000	98.7000
16.	9.7000	27.0000	15.6000
17.	12.5000	27.0000	23.9000
18.	100.8000	34.0000	138.3000

Table 2

Terms used for software cost estimation

Notation	Value
Population size (PS)	18
Maximum Generation	100
Dimension (D)	4
Parameters	Range (search domain)
Search Domain for a	0:10
Search Domain for b	0.3:2
Search Domain for c	–0.5:0.5
Search Domain for d	0:20

EAMD is applied to the models proposed by Sheta and Boehm. The parameters of these models are tuned by EAMD to obtain the optimal result. The parameter tuning is driven by the value of MMRE. In every generation the parameter values change, and the updated values are stored temporarily in memory. At the end of the final generation, we obtain the minimum value of MMRE and the optimized values of the four parameters. The working model of the proposed work is shown in Fig. 2.

Fig.2

(a) Prototype model of cost estimation. (b) Detailed view of the internal working of cost estimation model.
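The population described by Table 2 (PS = 18 individuals, D = 4 parameters) can be initialized as in the following Python sketch. It is a minimal illustration rather than the MATLAB code used in the experiments; `SEARCH_DOMAIN` and `init_population` are hypothetical names, and the bounds are the search domains [0, 10], [0.3, 2], [-0.5, 0.5] and [0, 20] stated in Step 1 of Section 4.4.2.

```python
import random

# Search domains for a, b, c and d (Step 1 of Section 4.4.2).
SEARCH_DOMAIN = [(0.0, 10.0), (0.3, 2.0), (-0.5, 0.5), (0.0, 20.0)]


def init_population(ps=18, domain=SEARCH_DOMAIN, seed=None):
    """Return ps random parameter vectors [a, b, c, d], one per individual."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for low, high in domain] for _ in range(ps)]


population = init_population(seed=1)
print(len(population), len(population[0]))  # 18 4
```

Passing a seed makes a run repeatable, which is convenient when comparing the three models on the same initial population.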

4.4.1 Working model

The working model of the proposed work is divided into three steps, which are as follows:

(1) Initialization of Generation:

A population of size PS * D is initialized randomly, where PS is the population size and D (dimension) is the number of parameters. The four parameters (a, b, c and d) are initialized within the search domains specified in Table 2. Each group of four parameter values forms one individual, and the further process is applied as shown in Fig. 2.

(2) Intermediate Generations:

1. From all 18 projects, select one project randomly.

2. Calculate the MRE of this project using all sets of parameters (a, b, c, d).

3. Select the minimum of the MRE values calculated in step 2.

4. Find the parameter set (a, b, c, d) corresponding to the minimum MRE.

5. Apply the parameter set (a, b, c, d) obtained in step 4 to all 18 projects.

6. Find all 18 MRE values from step 5.

7. Compute the average of the 18 MRE values found in step 6 and assign it as MMRE.

8. The same process continues for subsequent generations.

(3) n^th Generation: The steps stated above are repeated in each generation until the specified number of generations has been reached. Finally, we obtain the optimized value of MMRE and the optimum values of the corresponding four parameters.
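Steps 1 to 7 of an intermediate generation can be sketched as follows. This is a hedged Python illustration of the procedure as listed, using Model 3 (Equation 10) for the estimate; all function names are hypothetical, and each project is represented as a (size, methodology, measured effort) tuple.

```python
import random


def estimate(params, size, methodology):
    """Model 3 estimate (Equation 10) for one project."""
    a, b, c, d = params
    return a * size ** b + c * methodology + d


def mre(measured, estimated):
    return abs(measured - estimated) / measured


def intermediate_generation(population, projects, rng=random):
    """projects: list of (size, methodology, measured_effort) tuples."""
    # Step 1: pick one project at random.
    size, methodology, effort = rng.choice(projects)
    # Steps 2-4: parameter set with the smallest MRE on that project.
    best = min(population, key=lambda p: mre(effort, estimate(p, size, methodology)))
    # Steps 5-7: apply that set to every project and average the MRE values.
    errors = [mre(e, estimate(best, s, m)) for s, m, e in projects]
    return best, sum(errors) / len(errors)


# Toy data in which the second parameter set fits both projects exactly.
toy_projects = [(1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
toy_population = [[1.0, 1.0, 0.0, 0.0], [2.0, 1.0, 0.0, 0.0]]
print(intermediate_generation(toy_population, toy_projects))
# ([2.0, 1.0, 0.0, 0.0], 0.0)
```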

4.4.2 Proposed algorithm

The proposed algorithm generates global optimal values of a, b, c and d together with the minimum MMRE, so as to minimize the difference between the actual (measured) effort and the estimated effort, and it is applicable to all types of software projects. The steps involved in the proposed algorithm are as follows:

Step 1. Create an initial random population Pop of the parameters a, b, c and d within the specified search domains [0 10], [0.3 2], [–0.5 0.5] and [0 20], respectively. Here, the population is a matrix of size PS * D, where D is the number of dimensions, which is equal to the number of parameters, and PS is the population size.

Step 2. While (number of generations < MaxGen) do

Note. Repeat the following steps from 3 to 10 until the specified number of generations (MaxGen) has been reached. Here, MaxGen is the maximum number of generations taken for the algorithm.

Step 3. For each row vector of parameter values in the matrix do

For i = 1, 2, ..., PS

// Do the following operations

Calculate EE_i // Equations 8–10

Calculate MRE_i // Equation 3

End for

Step 4. Calculate MMRE_i // Equation 4

Step 5. Store and replace minimum MMRE and corresponding four values of parameters.

Step 6. Apply adaption operator of EAMD algorithm to optimize the value of a, b, c and d.

Step 7. Merge the initial population with the optimized values of a, b, c and d, and sort the merged population in ascending order of MMRE.

Step 8. Apply the selection operator to select the N best solutions and generate the new population.

Step 9. Number of generations ← number of generations + 1.

Step 10. End while

Step 11. Select the minimum MMRE and the corresponding values of the four parameters at which this MMRE was obtained.

Step 12. Print the final values of a, b, c, d and MMRE according to Step 11.

In the proposed algorithm, the estimated efforts are calculated for the different projects and used together with the measured (actual) efforts to calculate the magnitude of relative error (MRE). The values of the four parameters are optimized in every generation to minimize MMRE. Initially, a population of the four parameters is generated. Depending on the model type, each row vector of parameters is applied to calculate the MRE of all 18 projects; the MMRE is calculated from these values, and the minimum MMRE is stored in memory along with its parameters.

In each successive generation, every row vector is optimized by EAMD and the whole process is repeated. After each generation, the current MMRE is compared with that of the previous generation, and the stored result is updated with the minimum MMRE and the corresponding parameters. This continues until the final generation has been reached. Finally, the reduced MMRE and the related optimized parameters are stored as the optimal result.
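The overall loop of Steps 1 to 12 can be sketched in Python as shown below. This is a rough illustration under stated assumptions, not the authors' MATLAB implementation: the fitness uses Model 3 (Equation 10), and the `adapt` function is an ASSUMED stand-in (a uniform perturbation inside a window that shrinks with the generation count) because the actual EAMD adaption rule is defined in [10, 24] and not in this paper. All names are hypothetical.

```python
import random

SEARCH_DOMAIN = [(0.0, 10.0), (0.3, 2.0), (-0.5, 0.5), (0.0, 20.0)]


def mmre(params, projects):
    """Fitness: MMRE of Model 3 over (size, methodology, measured) tuples."""
    a, b, c, d = params
    errors = [abs(e - (a * s ** b + c * m + d)) / e for s, m, e in projects]
    return sum(errors) / len(errors)


def adapt(params, shrink, rng):
    """ASSUMED stand-in for the EAMD adaption operator (not the rule of [10, 24]):
    perturb each parameter inside a shrinking window and clip it to its domain."""
    adapted = []
    for value, (low, high) in zip(params, SEARCH_DOMAIN):
        step = rng.uniform(-1.0, 1.0) * 0.1 * (high - low) * shrink
        adapted.append(min(high, max(low, value + step)))
    return adapted


def optimize(projects, ps=18, max_gen=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of PS individuals with D = 4 parameters.
    pop = [[rng.uniform(lo, hi) for lo, hi in SEARCH_DOMAIN] for _ in range(ps)]
    for gen in range(max_gen):                       # Steps 2, 9 and 10
        shrink = 1.0 - gen / max_gen                 # window narrows over time
        adapted = [adapt(p, shrink, rng) for p in pop]            # Step 6
        # Steps 3, 4 and 7: evaluate MMRE and sort the merged population.
        merged = sorted(pop + adapted, key=lambda p: mmre(p, projects))
        pop = merged[:ps]                            # Step 8: keep the best PS
    # Steps 5, 11 and 12: the best individual survives because selection is elitist.
    best = min(pop, key=lambda p: mmre(p, projects))
    return best, mmre(best, projects)


# Toy projects (size, methodology, measured effort); not the NASA data.
toy = [(10.0, 20.0, 15.0), (20.0, 30.0, 30.0), (40.0, 25.0, 60.0)]
best_params, best_mmre = optimize(toy, max_gen=20)
```

Because the old population is merged with the adapted one before selection, the best MMRE can never get worse from one generation to the next, which mirrors the "store and replace minimum MMRE" behaviour of Step 5.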

5 Result analysis

This section presents the initial population of the four parameters and the results of applying EAMD to optimize these parameters and improve the estimation accuracy of the models. The estimated effort and MRE obtained by the proposed algorithm compare favourably with the existing techniques for all three models proposed by Sheta. The estimated effort is close to the measured effort, and the MRE is low compared with the other methods for all models.

We have calculated MMRE, PRED and VAF, which are shown in the tables below. The results obtained by the proposed work are superior to those of the other methods. For every model, a distinct optimal value is obtained for the related parameters. Model 2 gives a slightly better estimate than model 1, and model 3 in turn gives a slightly better estimate than model 2. The parameters have been tuned to provide better estimates than other estimation techniques.

All models proposed by Sheta have been optimized by the proposed algorithm, with their parameters tuned using EAMD. The proposed optimized models are as follows:

$\text{Model 1}: EE = 1.5889 {(DLOC)}^{0.9889}$ (11)

$\text{Model 2}: EE = 0.59378 {(DLOC)}^{1.1911} + 0.11902 (M)$ (12)

$\text{Model 3}: EE = 0.4872 {(DLOC)}^{1.2312} + 0.09885 (M) + 1.2247$ (13)

Here EE is the estimated effort, DLOC stands for developed lines of code and M is the methodology added to the basic COCOMO model to improve the prediction accuracy of the model. Table 3 shows the random initial population used for all models in the experimental work, generated within the upper and lower limits of the search domain.

Table 3

Initial random population of parameters within the specified limit

Name of the Parameters
Sr. No.	a	b	c	d
1	8.1472	1.6468	0.1948	2.3800
2	9.0579	1.9311	–0.1829	9.9673
3	1.2699	1.4148	0.4502	19.1949
4	9.1338	0.3607	–0.4656	6.8077
5	6.3236	1.7435	–0.0613	11.7054
6	0.9754	1.8878	–0.1184	4.4762
7	2.7850	1.4538	0.2655	15.0253
8	5.4688	1.5882	0.2952	5.1019
9	9.5751	1.5633	–0.3131	10.1191
10	9.6489	0.9668	–0.0102	13.9815
11	1.5761	1.4143	–0.0544	17.8181
12	9.7059	0.5910	0.1463	19.1858
13	9.5717	1.5003	0.2094	10.9443
14	4.8538	0.3541	0.2547	2.7725
15	8.0028	0.7708	–0.2240	2.9859
16	1.4189	0.3785	0.1797	5.1502
17	4.2176	0.4651	0.1551	16.8143
18	9.1574	1.6999	–0.3374	5.0856

5.1 Result analysis for model 1

Table 4 shows the final optimized values of the parameters used in calculating the estimated effort for model 1. The estimated effort obtained by the proposed method, shown in Table 5, is compared with the results of Sheta and Sharma et al. [32]. Table 6 shows the magnitude of relative error of all methods. The MRE obtained by the proposed method is low compared with the other methods. For every project we obtain the required optimum value owing to the better convergence rate.

Table 4
Optimized value of parameters by proposed method for model 1

Parameters	Optimized value by proposed method
a	1.5889
b	0.9889

Table 5

Comparison of EE for model 1

Project No.	Measured Effort (ME)	Estimated Effort (Sheta model 1)	Estimated Effort (Sharma et al.)	Estimated Effort (Proposed)
1.	115.8000	131.9154	141.0497	136.3328
2.	96.0000	80.8827	69.6811	70.3495
3.	79.0000	81.2663	70.158	70.80123
4.	90.8000	91.2677	82.9363	82.83599
5.	39.6000	60.5603	45.9145	47.56498
6.	98.4000	106.7196	103.9129	102.3517
7.	18.9000	31.6447	18.01261	19.77045
8.	10.3000	27.3785	14.6188	16.25364
9.	28.5000	46.2352	31.115	33.01756
10.	7.0000	11.2212	4.0408	4.864118
11.	9.0000	14.0108	5.5652	6.567919
12.	7.3000	22.0305	10.6867	12.11404
13.	5.0000	8.4406	2.6803	3.309324
14.	8.4000	15.9157	6.6879	7.803834
15.	98.7000	119.285	121.9998	118.9816
16.	15.6000	25.8372	13.4473	15.02848
17.	23.9000	31.1008	17.5679	19.31216
18.	138.3000	143.0788	158.574	152.1664

Table 6

Comparison of MRE for model 1

Project No.	MRE (Sheta model 1)	MRE (Sharma et al.)	MRE (Proposed)
1.	0.1392	0.2180	0.1773
2.	0.1575	0.2742	0.2671
3.	0.0287	0.1119	0.1038
4.	0.0052	0.0866	0.0877
5.	0.5293	0.1595	0.2011
6.	0.0845	0.056	0.0402
7.	0.6743	0.0469	0.0461
8.	1.6581	0.4193	0.5780
9.	0.6223	0.0918	0.1585
10.	0.6030	0.4227	0.3051
11.	0.5568	0.3816	0.2702
12.	2.0179	0.4639	0.6595
13.	0.6881	0.4639	0.3381
14.	0.8947	0.2038	0.0709
15.	0.2086	0.2361	0.2055
16.	0.6562	0.1379	0.0366
17.	0.3013	0.2649	0.1919
18.	0.0346	0.1466	0.1003
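The entries of Table 6 can be checked with a short sketch, assuming the usual definition of MRE as the absolute difference between measured and estimated effort divided by the measured effort. The function name `mre` is an assumption; the three sample rows are projects 1, 3 and 7 from Table 5 (measured effort and proposed estimate).

```python
def mre(measured: float, estimated: float) -> float:
    """Magnitude of relative error for a single project."""
    return abs(measured - estimated) / measured


# (project no., measured effort, proposed estimate) copied from Table 5.
sample_rows = [
    (1, 115.8, 136.3328),
    (3, 79.0, 70.80123),
    (7, 18.9, 19.77045),
]

for project, measured, estimated in sample_rows:
    print(project, round(mre(measured, estimated), 4))
# prints:
# 1 0.1773
# 3 0.1038
# 7 0.0461
```

For these three projects the computed values round to the figures given in the Proposed column of Table 6.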

The MMRE and PRED results of the three methods are listed in Table 7; the values obtained by the proposed method are better than those of the other two methods. The proposed method also gives a higher VAF than the others, as shown in Table 8.

Table 7

Comparison of MMRE and PRED for model 1

Measurement Techniques	Sheta Model 1	Sharma et al.	Proposed
MMRE (%)	23.79	23.25	21.32
PRED (25%)	38.89	61.11	66.67
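The Proposed column of Table 7 can be reproduced from the per-project MRE values of Table 6. The sketch below assumes that MMRE is the mean of the MRE values and that PRED(0.25) is the share of projects whose MRE does not exceed 0.25; the function names are illustrative.

```python
def mmre(mre_values):
    """Mean magnitude of relative error, as a percentage."""
    return 100.0 * sum(mre_values) / len(mre_values)


def pred(mre_values, level=0.25):
    """Percentage of projects whose MRE is at most `level`."""
    hits = sum(1 for value in mre_values if value <= level)
    return 100.0 * hits / len(mre_values)


# MRE (Proposed) column of Table 6, projects 1 to 18.
proposed_mre = [
    0.1773, 0.2671, 0.1038, 0.0877, 0.2011, 0.0402,
    0.0461, 0.5780, 0.1585, 0.3051, 0.2702, 0.6595,
    0.3381, 0.0709, 0.2055, 0.0366, 0.1919, 0.1003,
]

print(round(mmre(proposed_mre), 2))  # 21.32
print(round(pred(proposed_mre), 2))  # 66.67
```

Twelve of the eighteen projects fall within the 0.25 threshold, which gives the 66.67% reported for the proposed method.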

Table 8

Comparison of VAF for model 1

Method/Approach	Input of method	Output of method	VAF (%)
Proposed	KDLOC	Effort	97.40
Model 1 (Sheta)	KDLOC	Effort	96.31
Sharma et al.	KDLOC	Effort	96.63
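For reference, VAF can be computed as in the following sketch. It assumes the commonly used definition, VAF = (1 − var(measured − estimated) / var(measured)) × 100, which is not restated in this section; the function name `vaf` and the toy data are illustrative only.

```python
from statistics import pvariance


def vaf(measured, estimated):
    """Variance accounted for, as a percentage (assumed definition)."""
    errors = [m - e for m, e in zip(measured, estimated)]
    return (1.0 - pvariance(errors) / pvariance(measured)) * 100.0


# Toy data, not taken from the NASA dataset.
measured = [1.0, 2.0, 3.0]
print(vaf(measured, [1.0, 2.0, 3.0]))            # perfect fit
print(round(vaf(measured, [1.0, 2.0, 4.0]), 2))  # one estimate off by 1
```

Because the measure is built on the variance of the errors, a constant offset in every estimate leaves VAF unchanged, which is one reason it is reported alongside MMRE and PRED rather than on its own.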

5.2 Result analysis for model 2

Table 9 shows the final optimized values of the parameters for model 2. Table 10 lists the effort estimated by the different techniques; the proposed method gives better results than those of Sheta and Sharma et al. The MRE obtained by the different methods is shown in Table 11, which shows the superiority of the proposed method over the other methods. The project-wise estimates of the proposed method are very close to the measured effort. In Table 12, the MMRE and PRED results obtained by the proposed method are superior to those of the Sheta and Sharma et al. methods. The VAF obtained by the proposed technique is compared with that of the other models in Table 13.

Table 9
Optimized value of parameters by proposed method for model 2

Parameters	Optimized value by proposed method
a	0.59378
b	1.1911
c	0.11902

Table 10

Comparison of EE for model 2

Project No.	Measured Effort (ME)	Estimated Effort (Sheta model 2)	Estimated Effort (Sharma et al.)	Estimated Effort (Proposed)
1.	115.8000	124.8585	123.2686	130.1818
2.	96.0000	74.8467	56.6512	59.4468
3.	79.0000	75.4852	56.9502	59.7695
4.	90.8000	85.4349	68.3607	71.8584
5.	39.6000	50.5815	38.1364	39.7823
6.	98.4000	99.0504	88.4129	93.0931
7.	18.9000	24.1480	14.9756	15.4660
8.	10.3000	18.0105	13.4343	13.8182
9.	28.5000	37.2724	25.6393	26.6348
10.	7.0000	4.5849	5.2962	5.3795
11.	9.0000	8.9384	5.432	5.5423
12.	7.3000	13.5926	10.2889	10.5476
13.	5.0000	1.5100	4.7104	4.7695
14.	8.4000	8.2544	7.3423	7.4896
15.	98.7000	110.5249	105.8789	111.6298
16.	15.6000	18.2559	11.7656	12.1049
17.	23.9000	23.3690	14.7645	15.2404
18.	138.3000	135.4825	140.545	148.5731

Table 11

Comparison of MRE for model 2

Project No.	MRE (Sheta model 2)	MRE (Sharma et al.)	MRE (Proposed)
1.	0.0782	0.0645	0.1242
2.	0.2203	0.4099	0.3807
3.	0.0445	0.2791	0.2434
4.	0.0591	0.2471	0.2086
5.	0.2773	0.0369	0.0046
6.	0.0066	0.1015	0.0539
7.	0.2777	0.2076	0.1816
8.	0.7486	0.3043	0.3415
9.	0.3078	0.1004	0.0653
10.	0.3450	0.2434	0.2315
11.	0.0068	0.3964	0.3842
12.	0.8620	0.4094	0.4448
13.	0.6980	0.0579	0.0461
14.	0.0173	0.1259	0.1084
15.	0.1198	0.0727	0.1310
16.	0.1703	0.2458	0.2240
17.	0.0222	0.3822	0.3623
18.	0.0204	0.0162	0.0743

Table 12

Comparison of MMRE and PRED for model 2

Measurement Techniques	Sheta Model 2	Sharma et al.	Proposed
MMRE (%)	23.79	20.56	20.06
PRED (25%)	61.11	66.67	72.22

Table 13

Comparison of VAF for model 2

Method/Approach	Input of method	Output of method	VAF (%)
Proposed	KDLOC and ME	Effort	97.57
Model 2 (Sheta)	KDLOC and ME	Effort	96.85
Sharma et al.	KDLOC and ME	Effort	96.26

5.3 Result analysis for model 3

Table 14 shows the optimized values of all four parameters of model 3. In Table 15, the effort estimated by the proposed method is compared with the estimates of Sheta, Sharma et al. and Brajesh et al. [33]. The estimated effort provided by the proposed method is very close to the measured (actual) effort, and it is a better estimate than those of the other methods.

Table 14
Optimized value of parameters by proposed method for model 3

Parameters	Optimized value by proposed method
a	0.4872
b	1.2312
c	0.09885
d	1.2247

Table 15

Comparison of EE for model 3

Project No.	Measured Effort (ME)	Estimated Effort (Sheta model 3)	Estimated Effort (Sharma et al.)	Estimated Effort (Brajesh et al.)	Estimated Effort (Proposed)
1.	115.8000	134.0202	135.1201	113.25	131.2047
2.	96.0000	84.1616	60.5491	68.13	58.9462
3.	79.0000	85.0112	60.9362	68.89	59.2932
4.	90.8000	94.9828	73.5736	78.01	71.5167
5.	39.6000	56.6580	39.397	42.98	38.9363
6.	98.4000	107.2609	95.5657	89.24	92.9854
7.	18.9000	32.6461	15.1244	19.96	15.2902
8.	10.3000	25.0755	13.1467	12.92	13.5978
9.	28.5000	44.3086	26.135	31.28	26.0385
10.	7.0000	14.4563	5.2912	1.93	5.8179
11.	9.0000	19.9759	5.7102	7.13	6.0337
12.	7.3000	21.5763	10.062	9.36	10.5456
13.	5.0000	11.2703	4.6484	1.22	5.2530
14.	8.4000	17.0887	7.1826	4.79	7.7189
15.	98.7000	118.0378	114.9918	98.98	111.9007
16.	15.6000	26.8312	11.7552	14.34	12.0700
17.	23.9000	31.6864	14.8567	19.07	15.0588
18.	138.3000	144.4587	154.6962	122.55	150.2168

Table 16 shows that the proposed method also provides better MRE than the other existing methods. Table 17 presents the MMRE and PRED obtained by all methods during the experimental work; the results show the superiority of the proposed method. In all kinds of measurement, the proposed method gives better results. The VAF of all techniques is shown in Table 18, where EAMD gives a better result than the other techniques.

Table 16

Comparison of MRE for model 3

Project No.	MRE (Sheta model 3)	MRE (Sharma et al.)	MRE (Brajesh et al.)	MRE (Proposed)
1.	0.1573	0.1668	0.0220	0.1331
2.	0.1233	0.3693	0.2903	0.3859
3.	0.0761	0.2287	0.1279	0.2495
4.	0.0461	0.1897	0.1409	0.2124
5.	0.4308	0.0051	0.0854	0.0168
6.	0.0901	0.0288	0.0931	0.0550
7.	0.7273	0.1998	0.0561	0.1909
8.	1.4345	0.2764	0.2544	0.3202
9.	0.5547	0.0829	0.0975	0.0864
10.	1.0652	0.2441	0.7243	0.1689
11.	1.2195	0.3655	0.2078	0.3296
12.	1.9557	0.3784	0.2822	0.4446
13.	1.2541	0.0703	0.7560	0.0506
14.	1.0344	0.1449	0.4298	0.0811
15.	0.1959	0.1651	0.0028	0.1337
16.	0.7199	0.2465	0.0808	0.2263
17.	0.3258	0.3784	0.2021	0.3699
18.	0.0445	0.1186	0.1139	0.0862

Table 17

Comparison of MMRE and PRED for model 3

Measurement Techniques	Sheta Model 3	Sharma et al.	Brajesh et al.	Proposed Method
MMRE (%)	63.64	20.33	22.00	19.67
PRED (25%)	38.89	72.22	66.67	72.22

Table 18

Comparison of VAF for model 3

Method/Approach	Input of method	Output of method	VAF (%)
Proposed	KDLOC and ME	Effort	97.92
Model 3 (Sheta)	KDLOC and ME	Effort	97.57
Sharma et al.	KDLOC and ME	Effort	96.19
Brajesh et al.	KDLOC and ME	Effort	97.91

5.4 Comparison with Boehm’s models

We have also compared our proposed algorithm with Boehm’s COCOMO model for the three classes of software project that it defines: Organic, Semidetached and Embedded. The comparison with the basic COCOMO model is discussed below. Table 19 shows the optimized values of the parameters a and b, which are used to obtain the estimated effort, MRE, MMRE and PRED. These results are shown in Tables 20–22. The proposed algorithm performs better than Boehm’s models.

Table 19
Optimized value of the parameters for the Boehm’s model and proposed model

Parameters	Organic	Semidetached	Embedded	Proposed Method
a	3.2	3.0	2.8	1.3889
b	1.05	1.15	1.2	0.9889
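The effect of the four parameter sets in Table 19 can be seen by evaluating the basic form EE = a (DLOC)^b for a single project size. In the sketch below the (a, b) pairs are copied from Table 19; the size of 90.2 KDLOC is an assumed input, chosen because it gives values that agree with the first row of Table 20 to within rounding. The names `PARAMS` and `basic_effort` are illustrative.

```python
# (a, b) pairs copied from Table 19.
PARAMS = {
    "Organic": (3.2, 1.05),
    "Semidetached": (3.0, 1.15),
    "Embedded": (2.8, 1.2),
    "Proposed": (1.3889, 0.9889),
}


def basic_effort(kdloc: float, a: float, b: float) -> float:
    """Basic COCOMO form: effort = a * size^b."""
    return a * kdloc ** b


kdloc = 90.2  # assumed project size, for illustration only
efforts = {name: basic_effort(kdloc, a, b) for name, (a, b) in PARAMS.items()}
for name, effort in efforts.items():
    print(f"{name}: {effort:.2f}")
```

With exponents above 1, the three Boehm classes grow faster than linearly with size, whereas the proposed exponent of 0.9889 keeps the estimate close to proportional to size; this is the main source of the gap seen in Table 20.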

Table 20

Comparison of EE for the Boehm’s model and proposed model

Organic	Semidetached	Embedded	Proposed Method
361.5071	531.6291	621.4495	119.172
179.0705	246.2972	278.4379	66.1751
180.2916	248.1374	280.609	66.6005
212.9935	297.836	339.4954	77.9335
118.1812	156.2412	173.1691	44.7249
266.6361	380.9084	438.8565	96.3149
46.5286	56.2876	59.6774	18.5735
37.7919	44.8218	47.0528	15.2666
80.2066	102.194	111.1947	31.0346
10.4974	11.0201	10.8841	4.56315
14.4398	15.6264	15.6696	6.6134
27.6598	31.8442	32.9361	11.375
6.974	7.0416	6.8206	3.10335
17.3408	19.0958	19.3162	7.32447
312.8554	453.7924	526.8234	111.981
34.7745	40.9175	42.7843	14.1147
45.3843	54.7732	58.0029	18.1425
406.2408	604.0889	710.0855	143.249

Table 21

Comparison of MRE for the Boehm’s model and proposed model

Project No.	MRE (Organic)	MRE (Semidetached)	MRE (Embedded)	MRE (Proposed)
1.	2.1218	3.5909	4.3666	0.0291
2.	0.8653	1.5656	1.9004	0.3107
3.	1.2822	2.1410	2.5520	0.1569
4.	1.3457	2.2801	2.7389	0.1417
5.	1.9844	2.9455	3.3730	0.1294
6.	1.7097	2.8710	3.4599	0.0212
7.	1.4618	1.9782	2.1575	0.0173
8.	2.6691	3.3516	3.5682	0.4822
9.	1.8143	2.5858	2.9016	0.0889
10.	0.4996	0.5743	0.5549	0.3481
11.	0.6044	0.7363	0.7411	0.3152
12.	2.7890	3.3622	3.5118	0.5582
13.	0.3948	0.4083	0.3641	0.3793
14.	1.0644	1.2733	1.2995	0.1280
15.	2.1698	3.5977	4.3376	0.1346
16.	1.2291	1.6229	1.7426	0.0952
17.	0.8989	1.2918	1.4269	0.2409
18.	1.9374	3.3680	4.1344	0.0358

Table 22

Comparison of MMRE and PRED for the Boehm’s model and proposed model

Measurement Techniques	Organic	Semidetached	Embedded	Proposed Method
MMRE (%)	149.12	219.12	250.73	20.07
PRED (25%)	0	0	0	72.22

6 Conclusion and future work

The proposed work estimates software cost with the intent of minimizing the difference between the estimated (predicted) and measured (actual) cost/effort of different projects. For this work, the parameters a, b, c and d of the Alaa F. Sheta models are tuned by EAMD. To measure the closeness between the predicted and actual results, MMRE is taken as the fitness function. The results show that the values obtained by the proposed method are more accurate than those of other models such as Sheta, Sharma et al. and Brajesh et al. The proposed algorithm reduces MMRE by 10.38% relative to Sheta and 8.30% relative to Sharma et al. for model 1; by 15.68% relative to Sheta and 2.42% relative to Sharma et al. for model 2; and by 69.09% relative to Sheta, 3.25% relative to Sharma et al. and 10.59% relative to Brajesh et al. for model 3.
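These percentage reductions can be recomputed from the MMRE rows of Tables 7, 12 and 17, assuming that each reduction is taken relative to the competing method's MMRE. The sketch below is illustrative; the names `pct_reduction` and `MMRE` are assumptions, and because the inputs are the rounded table values the results may differ from the quoted figures in the last digit.

```python
def pct_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# MMRE (%) values copied from Tables 7, 12 and 17.
MMRE = {
    "model 1": {"Proposed": 21.32, "Sheta": 23.79, "Sharma et al.": 23.25},
    "model 2": {"Proposed": 20.06, "Sheta": 23.79, "Sharma et al.": 20.56},
    "model 3": {"Proposed": 19.67, "Sheta": 63.64, "Sharma et al.": 20.33,
                "Brajesh et al.": 22.00},
}

for model, values in MMRE.items():
    proposed = values["Proposed"]
    for method, baseline in values.items():
        if method != "Proposed":
            print(f"{model}, vs {method}: {pct_reduction(baseline, proposed):.2f}%")
```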

The minimum MMRE obtained by the proposed method (using EAMD) is 19.67%, which is lower than that of the other techniques. The best value of the percentage of prediction (PRED (L)) at L = 25% is 72.22 and the best VAF is 97.92%, which are better than those of the other techniques. The proposed method also gives better results than the basic COCOMO model for Organic, Semidetached and Embedded software projects, as shown in Tables 19, 20, 21 and 22. The proposed algorithm is therefore very effective in predicting the overall cost of a project, with minimum MMRE and a high percentage of PRED and VAF.

In future, we will introduce new optimization techniques to improve the performance of different cost estimation models and to provide better estimates with a minimal error rate.

References

1. Boehm B.W., Software engineering economics, Vol. 197, Englewood Cliffs (NJ): Prentice-Hall, 1981.
2. Dolado J.J., On the problem of the software cost function, Information and Software Technology 43(1) (2001), 61–72.
3. Estell R.G., Software life cycle management, ACM SIGCSIM Installation Management Review 5(2) (1976), 2–15.
4. http://en.wikipedia.org/wiki/COCOMO
5. Kemerer C.F., An empirical validation of software cost estimation models, Communications of the ACM 30(5) (1987), 416–429.
6. Shepperd and Schofield, Estimating software project effort using analogies, Software Engineering, IEEE Transactions on 23(11) (1997), 736–743.
7. Idri, Abran and Kjiri, COCOMO cost model using fuzzy logic, 7th International Conference on Fuzzy Theory & Techniques 27 (2000).
8. Burgess C.J. and Lefley, Can genetic programming improve software effort estimation? A comparative evaluation, Information and Software Technology 43(14) (2001), 863–873.
9. Sheta A.F., Estimation of the COCOMO model parameters using genetic algorithms for NASA software projects, Journal of Computer Science 2(2) (2006), 118.
10. Tripathi, Environmental adaption method for dynamic environment, Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on, IEEE, 2014.
11. Pedrycz, Peters J.F. and Ramanna, A fuzzy set approach to cost estimation of software projects, Electrical and Computer Engineering, 1999 IEEE Canadian Conference on, Vol. 2, IEEE, 1999.
12. Alaa F.S. and Al-Afeef, A GP effort estimation model utilizing line of code and methodology for NASA software projects, Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on, IEEE, 2010.
13. Papatheocharous, Papadopoulos and Andreou A.S., Software effort estimation with ridge regression and evolutionary attribute selection, arXiv preprint arXiv:1012.5754 (2010).
14. Børte and Nerland, Software effort estimation as collective accomplishment, Scandinavian Journal of Information Systems 22(2) (2010), 65–98.
15. Basha and Ponnurangam, Analysis of empirical software effort estimation models, arXiv preprint arXiv:1004.1239 (2010).
16. Attarzadeh and Ow S.H., Software development effort estimation based on a new fuzzy logic model, International Journal of Computer Theory and Engineering 1(4) (2009), 1793–8201.
17. Kumar, Ananda Krishna and Satsangi P.S., Fuzzy systems and neural networks in software engineering project management, Applied Intelligence 4(1) (1994), 31–52.
18. Idri, Software cost estimation models using radial basis function neural networks, Software Process and Product Measurement, Springer Berlin Heidelberg, 2008, 21–31.
19. Boehm B.W., A software development environment for improving productivity, Computer (1984).
20. Boehm, Abts and Chulani, Software development cost estimation approaches—A survey, Annals of Software Engineering 10(1–4) (2000), 177–205.
21. Putnam L.H. and Myers, Measures for excellence: Reliable software on time, within budget, Prentice Hall Professional Technical Reference (1991).
22. Hamid H.-A., Malhotra and Quirk, Estimating software productivity and cost for NASA projects, Journal of Parametrics 11(1) (1991), 59–71.
23. Bednar J.A., Robertson, Estimating Size and Effort, University of Edinburgh, Old College South Bridge, Edinburgh EH8 9YL.
24. Tripathi, An Environmental Adaption Method with real parameter encoding for dynamic environment, Journal of Intelligent & Fuzzy Systems, Preprint, 1–13.
25. Mishra K.K., Tiwari and Misra A.K., A bio inspired algorithm for solving optimization problems, Computer and Communication Technology (ICCCT), 2011 2nd International Conference on, IEEE, 2011.
26. Mishra K.K., Tiwari and Misra A.K., Improved environmental adaption method and its application in test case generation, Journal of Intelligent and Fuzzy Systems 27(5) (2014), 2305–2317.
27. Bailey J.W. and Basili V.R., A meta-model for software development resource expenditures, Proceedings of the 5th international conference on Software engineering, IEEE Press, 1981.
28. Mansour and Salame, Data generation for path testing, Software Quality Journal (2004).
29. Y.-F., Xie and Goh T.N., A study of project selection and feature weighting for analogy based software cost estimation, Journal of Systems and Software 82(2) (2009), 241–252.
30. Foss, A simulation study of the model evaluation criterion MMRE, Software Engineering, IEEE Transactions on 29(11) (2003), 985–995.
31. Kemerer C.F., An empirical validation of software cost estimation models, Communications of the ACM 30(5) (1987), 416–429.
32. Sharma, Sinhal and Verma, Software assessment parameter optimization using genetic algorithm, International Journal of Computer Applications 72(7) (2013), 8–13.
33. Singh B.K., Effect of variations in measurement process for software development efforts, Multimedia, Computer Graphics and Broadcasting (MulGraB), 2014 6th International Conference on, IEEE, 2014.
