Abstract
Considering object-oriented software metrics (cohesion, coupling and complexity) and their significance in characterizing software quality, particularly software component reusability, we have considered six important CK metrics. The predominant reason for using these measurements is their individual relationship with the design aspect and with fault-proneness or aging-proneness. The key objective of this paper is to generate employment openings for thousands of people who have different skill sets and, furthermore, to let RozGaar service providers offer hassle-free services to customers with the help of machine learning techniques. With the rapid growth of modernization and automation in the current century, manual labor is reduced, which gives rise to mass unemployment. If we need technicians, workers, plumbers or drivers who work for daily wages, it is quite difficult to find one in our locality without contact references or knowledge of the quality of the work they provide. This paper helps to fill the gap between customers and service providers. We aim to introduce this work as an ocean of opportunities for all, where people can get jobs on a daily basis and earn money for their skills. The application is a dual-platform application that runs on Android devices and on the Internet as a website, and it is intended to provide services for daily work. To achieve this goal, we used a software prediction model with algorithms such as decision tree, Rough Set, and Logistic Regression to predict software reusability.
Keywords
Introduction
In the 21
Literature review
With the rapid growth of apps, revenue increased to over 23 billion dollars by the end of 2016. A large number of developers worked with either Android or iOS [1]. As mobile applications grow in popularity, their complexity increases simultaneously [2, 3, 4]. Many developers believe that Android is the most important platform for handling different tasks smoothly. According to the literature, nearly 85% of users use the Android apps platform [2]. It is difficult to understand the source code of Android apps [2]. Yuan Tian et al. [4] examined 28 factors along eight dimensions to recognize how highly rated apps differ from low-rated apps. They also identified the most influential factors by applying the random forest classifier to identify the highly rated apps. Similarly, Padhy et al. [5] explained how mobile apps are developed using object-oriented programming languages such as C# and Java. They also demonstrated how to estimate software metrics from apps and how to automate that estimation. Their prime focus was to estimate metrics using CK (Chidamber and Kemerer).
The term mobile apps derives from mobile applications, which are able to run on smartphones and tablets [6]. Many developers reuse existing code to develop new Android apps. The concept of inheritance underlies reusability. The software metrics are estimated from Android app code. The metrics WMC (Weighted Methods per Class), DIT (Depth of Inheritance Tree), CBO (Coupling Between Objects), NOC (Number of Children), RFC (Response for a Class), and LCOM (Lack of Cohesion in Methods) are called CK-Metrics [7]. Developers use the concept of software quality to improve the functionality of apps and to attract customers. The following properties might be used: reliability, efficiency, maintainability, reusability, and quality. At every stage of the software development life cycle (SDLC), the developers must think about functionality. Metrics are meticulously checked and verified at each stage of the SDLC process. Some metrics may be used during coding, whereas others may be used during the development stage. Some typical metrics are used only when the project is completed, but in some situations metrics may be used at earlier stages [7]. Numerous studies have researched the metrics and developed prototype models that are able to assess quality attributes such as maintainability [8], reusability [9], usability of the component [10], and reliability of the component-based architecture [11]. The benefits of intensified reuse are lower project costs, less staff, less time, and consequently improved reliability of the code [12]. Once a reusable component has been developed, it can be reused in a new product. The newly developed product is then expected to have a lower fault density [13].
Software qualities such as high cohesion, low coupling, modularity, and low complexity are the most important factors influencing software reusability prediction [14]. Most apps are written in the Java programming language with the help of the Android Software Development Kit. Padhy et al. [20] described the properties of software metrics and how to estimate the metrics from software code. They presented an estimation technique for C++ and C# code. CK-Metrics are widely used in the field of software engineering, not only for prediction but also for cost estimation. With the help of OOM (Object Oriented Metrics), reusability assets can be measured efficiently. Padhy et al. [21] and Tian et al. [22] presented papers about rating apps. They focused on how highly rated apps differ from low-rated apps.
Considering the significance of the Chidamber and Kemerer object-oriented software metrics (i.e. CK-Metrics), in our research we intended to exploit the associated features to characterize the reuse-proneness of a software component in WoS (web of service) software. Excessive component reuse might give rise to complexity and lack of cohesion, which eventually lead to aging-proneness or fault-proneness. In this case, assessing these features (i.e. complexity, cohesion and coupling) might play a vital role in identifying the reuse-proneness of a software component. Considering OOP-based software metrics (cohesion, coupling and complexity) and their significance in characterizing software quality, particularly software component reusability, we have selected the key components of the CK metrics (i.e. WMC, CBO, DIT, LCOM, NOC, and RFC). The predominant reason for using these software metrics is their individual relationship with the design aspect and with fault-proneness or aging-proneness. For example, an increase in the lines of code (LOC) usually increases the complexity and execution time. On the other hand, coupling features, which state how strongly a class is connected to the others, might cause adverse effects in the case of improper coupling between (or among) classes or functions. In any software function, response instructions often play a decisive role in assuring a reliable function. In this relation, the Response for a Class (RFC) metric must reflect the responses of the allied classes or functions. A function that fails to respond to a request shows aging behaviour. Thus, assessing the RFC of software can be efficient in assuring the responsiveness of the classes. Cohesion between or amongst classes signifies the uniformity of the artefacts in software. Under such circumstances, assessing software metrics such as Lack of Cohesion in Methods (LCOM) can be significant in characterizing the aging-proneness of a software component.
Similarly, the Depth of Inheritance Tree (DIT), which is the maximum length from the root to the node in the tree, signifies the reliability and coherence of the function. Therefore, the assessment of the selected software metrics (i.e. WMC, CBO, DIT, LCOM, NOC, and RFC) can assist in characterizing a class or software component for its reuse-proneness.
For the above-mentioned reasons, we have considered these features for reuse-proneness estimation or reusability estimation. Khan and Mahmood [23] described the complexity of the project and the method of calculation of shift and value shift in their work. ArunKumar and Dillibabu [24] developed “a model which enhances the software quality without increasing cost, effort and time”. Padhy et al. [25] discussed software reusability metrics and their proposed model, algorithms and optimization techniques.
Considering the suitability of CK metrics for software quality assessment (fault-proneness, maintainability, reliability, reusability, scalability, etc.), we have applied CK metrics in this research. For the sake of simplicity, we have applied the Chidamber and Kemerer Java Metrics (CKJM) tool to extract the software features or metric values. Since there are 100 software projects in our model (all developed in Java with OOP concepts), these projects are given as input to the CKJM tool, and the respective 22 CK metrics are obtained, out of which six metrics (i.e. WMC, CBO, DIT, LCOM, NOC, and RFC) were selected as input for further processing. We have followed the instructions proposed in major online resources such as
The key objective behind the RozGaar mobile application and website is to generate employment opportunities for people who have different skill sets and to provide hassle-free services to customers.
RozGaar will serve customers to a greater extent by creating job opportunities as well as by letting them avail of any daily services provided by RozGaar in their locality. Our main aim is to focus on the rural population and to act as a link between them and the customers who need helping hands to get any task done without any hassle. Besides these, certain features can be added to RozGaar, such as organizing coaching classes through video tutorials to help the registered members enhance their skills for a better livelihood and better future scope. We implemented evaluation and performance ratings so that we could analyze the overall efficiency of the RozGaar members who are responsible for providing services to the customers.
With the help of the mobile app, job seekers can view the jobs for upcoming days, from which both parties can benefit. Job prediction is a difficult task nowadays, but our mobile app can forecast the upcoming jobs by using state-of-the-art machine learning algorithms.
Research goals
The problem statements pointed out in this paper are addressed below.
This paper applies ML (machine learning) algorithms that are regularly used for mobile technology. We have simulated the algorithms so that the mobile app is able to predict jobs for job seekers. Different types of agents exist, but they have not been properly defined so far. According to Wooldridge [129], an agent can be defined as “a computer system that is situated in some environment, and that is capable of autonomous action in this environment in order to meet its design objectives.” Three types of machine learning techniques are used: supervised (1), unsupervised (2) and reinforcement learning (3). In category 1, the learner receives example inputs along with their results and is supposed to learn the mapping from the inputs to the correct outputs. In category 2, the expected output is not known, and the learner has to find structure in the input on its own. In category 3, the learner is required to achieve a certain goal in a dynamic environment. In short, there is a clear objective, but because the environment is dynamic, the results are assessed through feedback and the correct conclusions are not stated explicitly. Reinforcement learning is well suited to studying the actions of a user. Compared to the other two ML approaches, it is the most suitable, and it is thus the one we have chosen in this paper. The algorithm below provides the basic information such as the state, time and reward. The states are denoted by ‘S’, the time interval by ‘T’ and the reward by ‘R’; quality and value are denoted by ‘Q’ and ‘V’.
The three platforms mentioned before each have their own advantages. For this investigation it is essential that at least some of the features of the phone can be checked, that apps can be started from another app, and that the object-oriented language Java is supported. Numerous platforms are available for these kinds of applications. Here we have taken Android as the platform, for which Eclipse is the most likely choice of development environment, because it provides the tools for and strongly supports Android development. Android fully supports the object-oriented paradigm. The platforms we have used for the mobile application are Windows Phone, Android, and iOS, because these systems are easy to set up.
The competence and usefulness of a forecasting model depend on the quality of the software measurement data. Figure 1 shows the proposed work. The flow diagram predicts jobs from the app by using evolutionary algorithms. First, the classes and the UML diagram have to be identified from the mobile app (RozGaar), which is a challenging task. Once the class information is identified, the metrics are obtained. The source code measurements are then validated with the help of the proposed model (outlined below). The selected metrics are fed into the ANN model.
Effort measurement from Android Apps by using ML techniques.
The above flowchart represents the effort measurement from the Android application (RozGaar) by using different ML (machine learning) techniques and taking the dataset into consideration. The dataset is the input to the model.
A statistical investigation of the dataset has been carried out. Whether the dataset follows a normal distribution is checked based on the values of skewness and kurtosis. If the data are normally distributed, the scaling step can be applied directly so that all inputs are on an equal scale.
Software prediction from the decade (2009–2018)
If the data are not normally distributed, a logarithmic transformation is applied to the dataset to bring it closer to a normal distribution. In addition, histograms are used to check the distribution of the data before and after the transformation.
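The normality check and the logarithmic transformation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `DistributionCheck`, the use of population moments, and the choice of `log(1 + x)` (so that zero-valued metrics stay defined) are assumptions.

```java
import java.util.Arrays;

// Hypothetical helper: moment-based skewness and kurtosis plus a log transform.
final class DistributionCheck {

    // Skewness as the third standardized moment (population form); 0 for symmetric data.
    static double skewness(double[] x) {
        double mean = Arrays.stream(x).average().orElse(Double.NaN);
        double m2 = 0.0, m3 = 0.0;
        for (double v : x) {
            double d = v - mean;
            m2 += d * d;
            m3 += d * d * d;
        }
        m2 /= x.length;
        m3 /= x.length;
        return m3 / Math.pow(m2, 1.5);
    }

    // Excess kurtosis: fourth standardized moment minus 3 (0 for a normal distribution).
    static double excessKurtosis(double[] x) {
        double mean = Arrays.stream(x).average().orElse(Double.NaN);
        double m2 = 0.0, m4 = 0.0;
        for (double v : x) {
            double d = v - mean;
            m2 += d * d;
            m4 += d * d * d * d;
        }
        m2 /= x.length;
        m4 /= x.length;
        return m4 / (m2 * m2) - 3.0;
    }

    // Assumed transform: log(1 + v), valid for non-negative metric values.
    static double[] logTransform(double[] x) {
        return Arrays.stream(x).map(Math::log1p).toArray();
    }
}
```

Comparing `skewness` before and after `logTransform` gives a numeric counterpart to the histogram check: a right-skewed metric should show a smaller skewness after the transformation.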
In this step we have scaled the values of the input vectors independently so that they are contained within the range [0, 1]. We can represent the scaling of the data set as below:
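One common way to bring each input into [0, 1] is min-max scaling, x' = (x − min) / (max − min). The sketch below assumes that this is the intended scaling; the class name `MinMaxScaler` and the handling of constant columns are illustrative choices.

```java
// Hypothetical helper: min-max scaling of one input vector into [0, 1].
final class MinMaxScaler {

    static double[] scale(double[] x) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            // A constant column carries no information; map it to 0 (assumption).
            out[i] = (range == 0.0) ? 0.0 : (x[i] - min) / range;
        }
        return out;
    }
}
```

Each metric column (for example WMC or CBO) would be scaled separately, so that metrics with large raw values do not dominate the others.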
Problem solving approach in RozGaar.
Framework of RozGaar App.
Where
The complete dataset is divided into two different sets, called the training set and the test set. The training set is used for model estimation, whereas the test set is used only for evaluating the predicted effort of the final model. A ten-fold cross-validation process is used to predict the job.
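A ten-fold split can be sketched as below. The class name `KFold`, the fixed seed, and the round-robin assignment of shuffled rows to folds are assumptions made for illustration; the source does not state how the folds were drawn.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: partition row indices 0..n-1 into k disjoint folds.
final class KFold {

    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        // Shuffle once so that each fold is a random sample of the rows.
        Collections.shuffle(idx, new Random(seed));
        List<List<Integer>> out = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            out.add(new ArrayList<>());
        }
        // Deal the shuffled rows round-robin; fold sizes differ by at most one.
        for (int i = 0; i < n; i++) {
            out.get(i % k).add(idx.get(i));
        }
        return out;
    }
}
```

In each of the k rounds, one fold serves as the test set and the remaining k − 1 folds form the training set; the reported score is the average over the rounds.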
In this step the effort value is forecast by using different machine learning algorithms such as MARS, CART, decision tree induction, NB (Naïve Bayes classification), KNN, etc.
In this step the performance of the different ML (machine learning) algorithms is measured. The following measures have been used to carry out the task: Root Mean Square Error (RMSE), Prediction Accuracy, the Mean Magnitude of Error Relative to the estimate (MMER), the Mean Magnitude of Relative Error (MMRE), and the Mean Absolute Error (MAE). These are the main parameters for performance measurement.
The next task is to work out the proposed model and flow chart. The key contributions are highlighted in the section below.
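The error measures named above can be computed as in the following sketch, which uses their usual definitions: MMRE divides each absolute error by the actual value, while MMER divides it by the estimate. The class name `ErrorMetrics` is illustrative, and the code assumes non-zero actual and predicted values.

```java
// Hypothetical helper: standard error measures for effort prediction.
final class ErrorMetrics {

    // Root Mean Square Error.
    static double rmse(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double e = actual[i] - predicted[i];
            sum += e * e;
        }
        return Math.sqrt(sum / actual.length);
    }

    // Mean Absolute Error.
    static double mae(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // Mean Magnitude of Relative Error: error relative to the actual value.
    static double mmre(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]) / Math.abs(actual[i]);
        }
        return sum / actual.length;
    }

    // Mean Magnitude of Error Relative to the estimate: error relative to the prediction.
    static double mmer(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]) / Math.abs(predicted[i]);
        }
        return sum / actual.length;
    }
}
```

Lower values are better for all four measures; MMRE and MMER are unit-free, which makes them convenient for comparing projects of different size.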
Preference diagram.
Apart from the research goal and challenging tasks, the following problems are also discussed throughout this paper: How to recognize the class and UML diagram; How to identify the reusability metrics; How to predict the software reusability, and how to estimate the reusability prediction through mobile apps.
During the systematic survey of prediction techniques, different methods are examined and relationships are derived between the OOM (Object Oriented Metrics) and fault proneness as pointed out in the tabular form. From Table 1 it can be observed that various methods such as logistic regression, decision tree analysis and Naïve Bayes classifier, are commonly used by researchers.
Proposed framework of RozGaar mobile application (App)
The framework below is used in the mobile app. The entire framework consists of 3 steps, which are described in Fig. 3.
Flow diagram of main class (preferences checking)
Figure 4 depicts the state checking of the application using data stored in preferences. “Preferences” gets the constant PREFS, which stores the values for LOG_PREF and IS_LOGGED_IN; these are responsible for the initialization of tables and for checking whether a user is logged in. If the LOG_PREF value is false, the application calls Initialization() to initialize the tables; this happens only once, the first time the application runs on a device. Similarly, IS_LOGGED_IN verifies whether the user is logged in. If so, the application opens the City_selection page. If not, it opens Client_Login_portal.
Algorithm for preference checking and initializing tables
[Check login preferences]
(1) Let DB, LOG_STATUS, INITIAL_STATUS, USER, PASS, PREF
(2) Set DB := OpenorCreateDb(“RozGaar”)
(3) Set LOG_STATUS := false
(4) Set INITIAL_STATUS := false
(5) Set PREF := getPreferences()
(6) If PREF.getBoolean(LOG_PREF) = false, then
6.1 Initialization(DB, INITIAL_STATUS, PREF)
[End of IF Step 6]
(7) If PREF.getBoolean(IS_LOGGED_IN) = false, then
7.1 NextActivity(Client_Login_portal.class)
Else
7.2 NextActivity(Src.class) [City_selection class]
[End of IF Step 7]
[Initialization function]
Initialization (DB, INITIAL_STATUS, PREF)
(1) Set DB := OpenorCreateDb(“RozGaar”)
(2) Set DB := createTable(“users”)
(3) Set DB := initializeTable(“users”)
(4) Set DB := createTable(“emps”)
(5) Set DB := initializeTable(“emps”)
(6) Set INITIAL_STATUS := true
(7) Set PREF := edit(LOG_PREF, INITIAL_STATUS)
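The preference check can be sketched in plain Java as follows. A `Map<String, Boolean>` stands in for the Android preference store so that the example runs outside a device; the class name `PreferenceCheck` and the method names are hypothetical, and the database calls are reduced to a comment.

```java
import java.util.Map;

// Hypothetical stand-in for the preference check; a Map replaces the preference store.
final class PreferenceCheck {
    static final String LOG_PREF = "LOG_PREF";
    static final String IS_LOGGED_IN = "IS_LOGGED_IN";

    // First-run setup: the "users" and "emps" tables would be created and filled here.
    static void initialization(Map<String, Boolean> prefs) {
        prefs.put(LOG_PREF, true); // remember that initialization has been done
    }

    // Returns the name of the screen to open next.
    static String nextScreen(Map<String, Boolean> prefs) {
        if (!prefs.getOrDefault(LOG_PREF, false)) {
            initialization(prefs);
        }
        return prefs.getOrDefault(IS_LOGGED_IN, false)
                ? "City_selection"
                : "Client_Login_portal";
    }
}
```

On a fresh install the map is empty, so the tables are initialized once and the login portal is shown; on later runs only the login flag decides which screen opens.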
Pseudo code customer grievance
Generally, a mobile app is unable to provide information about the product, and it is thus unknown whether the user is satisfied or not. For this reason we have developed pseudo code for customer grievances, as satisfying the client is the main goal of the developer. This automated tool enables the user to review the product. The tool reviews suggestions and grievances. Here we take the utmost care of customers' requests and responses.
(1) Set the grievance list to empty.
(2) Collect everyone's complaints and suggestions into the grievance list.
(3) Read all the grievances, with their different types.
(4) Repeat for each suggestion:
4.1 Inspect all the text in the suggestion.
4.2 If the suggestion is of exactly the same type as an existing one, then
Mark the suggestion with that specific type.
Else
Insert it as a new suggestion and include it.
[End of IF Step 4.2]
[End of loop Step 4]
(5) Repeat the steps with each fresh grievance.
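The grievance handling can be sketched as grouping each suggestion under its type and creating a new group when the type has not been seen before. The class name `GrievanceBoard`, the explicit `type` argument, and the case-insensitive matching are assumptions; the source does not say how the type of a suggestion is determined.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: collect grievances and suggestions grouped by type.
final class GrievanceBoard {
    private final Map<String, List<String>> byType = new LinkedHashMap<>();

    // Files the text under an existing type, or starts a new type if none matches.
    void submit(String type, String text) {
        String key = type.trim().toLowerCase();
        byType.computeIfAbsent(key, k -> new ArrayList<>()).add(text);
    }

    int typeCount() {
        return byType.size();
    }

    List<String> ofType(String type) {
        return byType.getOrDefault(type.trim().toLowerCase(), new ArrayList<>());
    }
}
```

Grouping by type lets the developer see at a glance which kind of complaint recurs most often.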
The reusability level can be estimated by using the OO CK metric DIT. This metric measures the depth of the inheritance tree in the class diagram; it is sometimes also called the nesting level of the hierarchy. The metric can be estimated from the class diagram of any object-oriented program. The maximum length is obtained from the class itself up to its root ancestor in the class hierarchy. A new formula is derived to measure the reusability of a class. As noted, inheritance is the basis of reusability. Through inheritance, the number of overridden methods and object references in a class indicates its reusability. Hence, inheritance is a direct indicator of reuse in the class itself.
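The depth computation can be sketched as a walk from a class up to its root ancestor. The class name `InheritanceDepth` and the parent-map representation are illustrative. Note the counting assumption: this sketch gives a class without a declared superclass a depth of 0, whereas tools that also count `java.lang.Object` as the root report values one higher.

```java
import java.util.Map;

// Hypothetical helper: Depth of Inheritance Tree from a child -> parent map.
final class InheritanceDepth {

    // parentOf maps a class name to its direct superclass; root classes have no entry.
    static int dit(String cls, Map<String, String> parentOf) {
        int depth = 0;
        String current = cls;
        while (parentOf.containsKey(current)) {
            current = parentOf.get(current);
            depth++;
            // Guard against a malformed (cyclic) hierarchy.
            if (depth > parentOf.size()) {
                throw new IllegalArgumentException("cycle in class hierarchy");
            }
        }
        return depth;
    }
}
```

For a hierarchy in which `exp` extends `test`, `dit("exp", ...)` is 1 and `dit("test", ...)` is 0 under this convention.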
By using the least square regression analysis, we need to calculate the
Where
import java.util.regex.*;
import java.util.*;
import java.awt.event.*;

class test {
    /*
      array:     \\w*\\s*(\\s*\\[\\s*\\]\\s*)*\\s*\\w*\\s*(\\s*\\[\\s*\\]\\s*)*
      arguments: (\\w*\\s*(\\s*\\[\\s*\\]\\s*)*\\s*\\w*\\s*(\\s*\\[\\s*\\]\\s*)*,?)*
      cons:      "\\s*((public|private|protected)?\\s+)?\\w+\\s*\\({1}?\\s*(\\w*\\s*(\\[?\\s*\\]?)*{2}?\\s*,?\\s*)+\\s*\\){1}?\\s*\\{?\\s*}?"
      class:     ((public)?\\s+)?(class)\\s+\\w*\\s*((extends)\\s+\\w*\\s*)?(\\s+(implements)?\\s+(\\w*\\s*,?\\s*)*)?\\{?\\s*\\}?
      class name extraction:
        Pattern pattern = Pattern.compile("class (.*?) \\s*");
        Matcher matcher = pattern.matcher(want);
        if (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    */
    // [^\\(*\\)*])
    public static void main(String args[]) {
        String want = "";
        // RegEx under test: matches a constructor declaration.
        String cons = "\\s*((public|private|protected)?\\s+)?\\w+\\s*\\({1}?\\"
                + "s*(\\w*\\s*(\\[?\\s*\\]?)*{2}?\\s*,?\\s*)+\\s*\\){1}?\\s*\\{?\\s*}?";
        // One Scanner for the whole session; it reads test cases from standard input.
        Scanner sc = new Scanner(System.in);
        // Keep testing inputs until the user types "exit".
        while (!want.equals("exit")) {
            want = sc.nextLine();
            boolean b = Pattern.matches(cons, want);
            System.out.println(b);
        }
    }
}

// Sample subclass used as parser input (extends/implements case).
class exp extends test implements ActionListener {
    public void actionPerformed(ActionEvent ae) {
    }
}
Once we have measured reusability, we can measure the complexity level. Complexity can be measured by using a traditional or an object-oriented approach. Some of the complexity measurement techniques are Halstead measures and cyclomatic complexity. Halstead measures depend on the number of operators and operands used in the source code. Operands are the identifiers and constants, while operators include the keywords and operator symbols. The automated tool calculates the complexity against a standard level by taking some of these parameters. It counts the number of iterations present in the program, the total number of keywords, etc.
Complexity of the metrics and identifying the risk level
Reusability influencing
// single variable or array declaration:
//   \\w*\\s*(\\s*\\[\\s*\\]\\s*)*\\s*\\w*\\s*(\\s*\\[\\s*\\]\\s*)*
// multi-variable declaration or arguments:
//   (\\w*\\s*(\\s*\\[\\s*\\]\\s*)*\\s*\\w*\\s*(\\s*\\[\\s*\\]\\s*)*,?)*
// constructor:
//   p*\\w*\\s*\\w+\\s*\\({1}?\\s*(\\w+\\s*(\\s*\\[\\s*\\]\\s*)*\\s*\\w+\\s*(\\s*\\[\\s*\\]\\s*)*,?)*\\s*\\){1}?\\s*\\{?
/*
  1. methods counter
  2. abstract method counter
  3. constructor counter
  4. class variable counter
  5. Object variables counter
  6. Inherited Attributes
*/
import java.io.*;
import java.util.*;
import java.util.regex.*;

public class counter {
    int methods = 0;
    int abs_methods = 0;
    int cons = 0;
    int v_counter = 0;
    int ov_counter = 0;

    counter() throws Exception {
        FileReader fr = new FileReader("C:\\Users\\SURAJ\\Videos\\dfr\\FileScanner.java");
        BufferedReader br = new BufferedReader(fr);
        String s;
        // Read the source file line by line; the counting logic goes inside the loop.
        while ((s = br.readLine()) != null) {
        }
    }

    public static void main(String args[]) throws Exception {
        boolean b = Pattern.matches("\\w b", "adafvb");
        System.out.println(b);
    }
}
In the literature survey we found some important aspects of the metrics. The reusability influences are listed in Table 3.
In this section the parser extracts the metrics from the mobile application (RozGaar). Each time the parser scans the app, the object-oriented metrics (i.e. software metrics) are estimated. The parser is a search machine that determines the software metrics, from which we can predict the software reusability by using machine learning techniques. The parser sample in this section shows how it can scan the software code. The snippet is developed in Java and analyzes arrays, classes, methods and constructors.
In this case, the code tests whether a token is a constructor. To test another construct, the expression string has to be changed to the corresponding RegEx (some of these expressions are listed in the comments of the code):
The expression variable (‘cons’ in the listing) contains the RegEx to be matched against the tokens. The ‘want’ variable contains the test case entered by the user to check whether the RegEx is correct.
The loop continues until the user types exit. The loop is created to test all the possible combinations and cases of the input.
The counter code is developed in Java using regular expressions. Its objective is to identify single and multiple variable declarations as well as constructors and methods.
Proposed software reuse code from mobile App (SRCM)
SRCM is a technique through which we can identify code reusability. How the code is reused in the mobile application is another challenge. We can estimate the percentage of code that is exclusive to a particular app and the class to which the code belongs. The proposed estimation technique is known as PCQR (the percentage of class name uniquely reused).
From the proposed equation it is clear that, when PCQR is high, the number of reused class signatures is high as well.
From the literature survey, it is clear that almost all mobile applications reuse software, code, architecture, and/or functionality. The survey pointed out that approximately 86.56% of the class signatures match those of older versions of mobile apps.
Not only the code is reused, but also the framework and the architecture, so that new apps can be developed within the stipulated time; this is referred to as FAR (Framework and Architecture Reused). It has been observed that new apps have a similar framework and architecture, in which a list of classes and methods are the same and exhibit the same functionality. The following numerical models are proposed for this task:
X and Y are the two classes where the signatures are similar. If the set functionality of class X is similar to the class Y then;
s(X) is set of signatures in app X.
A high reuse figure means that most class prototypes are reused in the new apps.
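A signature-based reuse figure can be sketched as the share of class signatures of a new app that also occur in an older app. This is only an assumed definition in the spirit of s(X); the source's own equations are not reproduced here, and the class name `SignatureReuse` and the string representation of a signature are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: percentage of class signatures shared with an older app.
final class SignatureReuse {

    // newApp and oldApp play the role of the signature sets s(X) and s(Y).
    static double reusedPercent(Set<String> newApp, Set<String> oldApp) {
        if (newApp.isEmpty()) {
            return 0.0; // nothing to reuse in an empty app (assumption)
        }
        Set<String> shared = new HashSet<>(newApp);
        shared.retainAll(oldApp); // set intersection
        return 100.0 * shared.size() / newApp.size();
    }
}
```

A value close to 100 would indicate that nearly every class signature of the new app already existed in the older one.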
Client login portal.
Welcome screen.
Mail or call us.
Service provider.
Selection area.
Easy hire.
The figures mentioned below are snapshots of the mobile app. The user has to log in to the app by providing the user ID and password (Fig. 5). Once the user has logged in to the app, a welcome screen appears (Fig. 6).
Figures 7 and 8 show the screens with which job seekers can call or mail the employers, so that they can opt for different types of services from the app.
Proposed software reuse proneness prediction.
In this section different prediction or classification algorithms, including the decision tree (DT) algorithm, Logistic Regression (LR), Logarithmic Regression (LRR), Naïve Bayes (NB), Pearson regression (PR), Support Vector Machine (SVM), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), and Adaptive Genetic Algorithm (AGA) based ANN, are discussed for reusability prediction (Fig. 11).
Adaptive genetic algorithm (AGA)
Genetic Algorithm (GA) is an adaptive search method for finding optimal or near-optimal solutions, premised on the evolutionary ideas of natural selection. The fundamental concept of GA is to simulate the processes in natural systems that are required for evolution, specifically those that follow Charles Darwin's principle of the survival of the fittest. In terms of procedural flow, GA first generates the initial population randomly, where the population refers to a set of solutions. Each solution is a chromosome in the form of a binary string in which all the parameters are encoded. After generating the population, GA estimates the fitness function of each individual chromosome. According to the retrieved fitness values, offspring are produced using the genetic operators crossover and mutation. Applying these genetic operators, the generations of the population are repeated iteratively until the stopping criteria are satisfied and an optimal solution is achieved. As illustrated in Fig. 3, the proposed ANN model comprises
Here, the individual weight, which is considered as gene in the chromosomes of the A-GA, is a real number. Considering the gene length or the number of digits is
Pawlak [15] introduced rough set analysis (RSA) as a generic approximation technique for a conventional set. Generally, preliminary information is required before preprocessing a data set, but in rough set analysis no supplementary information about the data is needed. The intention of this analysis is to find hidden patterns in the data set: it allows sets of decision rules to be generated automatically from the data, and it is suitable for simultaneous (parallel/distributed) processing.
The sequential implementation steps of the rough set analysis method are as follows:
During this stage, the features extracted from the CK metrics for every class are acquired.
In this stage, the prepared data is discretized by means of the K-means clustering algorithm.
In this phase, the lower and upper approximations are obtained. The lower approximation is the union of all the sets (Phase 2) that are contained in X.
Mathematically,
Here, the upper approximation corresponds to the union of every set that has a non-empty intersection with X.
Mathematically,
A factor indicating the accuracy of
Here, the cardinality of a set represents the total number of objects present in the lower or upper approximation of.
During this phase, every possible set is selected in such a way that its (individual) accuracy equals the accuracy of the common set.
During this phase, the retrieved set with the smallest possible cardinality is chosen as the reduced set and is further used for the classification processes.
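The lower and upper approximations and the resulting accuracy can be sketched as follows, using the usual rough set definitions: the lower approximation collects the equivalence classes fully contained in X, the upper approximation collects those that intersect X, and the accuracy is the ratio of their cardinalities. The class name `RoughSetApprox` and the use of integer object identifiers are illustrative assumptions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: rough set approximations of a target set X.
final class RoughSetApprox {

    // Union of all equivalence classes that are entirely contained in x.
    static Set<Integer> lower(List<Set<Integer>> classes, Set<Integer> x) {
        Set<Integer> out = new HashSet<>();
        for (Set<Integer> c : classes) {
            if (x.containsAll(c)) {
                out.addAll(c);
            }
        }
        return out;
    }

    // Union of all equivalence classes that share at least one object with x.
    static Set<Integer> upper(List<Set<Integer>> classes, Set<Integer> x) {
        Set<Integer> out = new HashSet<>();
        for (Set<Integer> c : classes) {
            for (Integer e : c) {
                if (x.contains(e)) {
                    out.addAll(c);
                    break;
                }
            }
        }
        return out;
    }

    // Accuracy of approximation: |lower| / |upper|, taken as 1 when the upper set is empty.
    static double accuracy(List<Set<Integer>> classes, Set<Integer> x) {
        int up = upper(classes, x).size();
        return up == 0 ? 1.0 : (double) lower(classes, x).size() / up;
    }
}
```

With equivalence classes {1, 2}, {3, 4}, {5} and X = {1, 2, 3}, the lower approximation is {1, 2}, the upper approximation is {1, 2, 3, 4}, and the accuracy is 0.5.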
Decision tree based classification has been suggested for a long time [16, 17], and various enhancements have been incorporated into this supervised learning technique. In the current decade, machine learning techniques such as the decision tree are dominant in research. The decision tree accepts both categorical and continuous input and output variables. Two variants are available, C4.5 and C5.0, which are related to association rules and are widely recognized for mining and classification. In this paper, the C4.5 decision tree algorithm [18] has been applied; it uses recursive partitioning of the metrics data to classify classes as REUSABLE and NON-REUSABLE.
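C4.5 chooses each partition by the gain ratio, i.e. the information gain of a split divided by the entropy of the split sizes. The sketch below evaluates one binary threshold split on a single metric; the class name `GainRatio`, the two-class encoding (reusable or not), and the sample thresholds are illustrative assumptions, not the configuration used in the paper.

```java
// Hypothetical helper: gain ratio of a threshold split for a two-class problem.
final class GainRatio {

    static double log2(double v) {
        return Math.log(v) / Math.log(2.0);
    }

    // Binary entropy of a group with the given counts; 0 for a pure or empty group.
    static double entropy(int a, int b) {
        int n = a + b;
        if (a == 0 || b == 0) {
            return 0.0;
        }
        double p = (double) a / n;
        double q = (double) b / n;
        return -(p * log2(p) + q * log2(q));
    }

    // values: one metric value per class; reusable: the label of each class.
    static double gainRatio(double[] values, boolean[] reusable, double threshold) {
        int leftPos = 0, leftNeg = 0, rightPos = 0, rightNeg = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] <= threshold) {
                if (reusable[i]) leftPos++; else leftNeg++;
            } else {
                if (reusable[i]) rightPos++; else rightNeg++;
            }
        }
        int left = leftPos + leftNeg;
        int right = rightPos + rightNeg;
        if (left == 0 || right == 0) {
            return 0.0; // the threshold does not split the data
        }
        double n = values.length;
        double before = entropy(leftPos + rightPos, leftNeg + rightNeg);
        double after = (left / n) * entropy(leftPos, leftNeg)
                + (right / n) * entropy(rightPos, rightNeg);
        double splitInfo = entropy(left, right); // entropy of the partition sizes
        return (before - after) / splitInfo;
    }
}
```

The tree builder would evaluate candidate thresholds for every metric, pick the split with the highest gain ratio, and then recurse on the two resulting partitions.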
Logistic regression
Logistic regression is a type of regression analysis technique, typically applied to predict the outcome of a dependent variable on the basis of one or more independent variables [19]. Typically, the dependent variable can take only two values; therefore, the software components or classes are split into two groups with respect to reusability, where one group contains the non-reusable components and the other contains the components that are reused at least once. As stated, in this paper the LR technique has been used to build the prediction model that assesses the reuse-proneness of the classes in the web of service software. Here, the selected CK metrics have been used in combination. Mathematically, LR can be represented by the following equation:
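In its standard form, logistic regression maps a linear combination of the inputs to a probability, p = 1 / (1 + e^−(b0 + b1·x1 + … + bn·xn)). The sketch below assumes this standard form, with the CK metrics as the inputs x; the class name `LogisticModel`, the coefficient values a caller would pass, and the cutoff are hypothetical and are not taken from the paper.

```java
// Hypothetical helper: standard logistic model over a vector of CK metric values.
final class LogisticModel {

    // Probability that a class is reusable, given an intercept and one coefficient per metric.
    static double probability(double intercept, double[] coef, double[] x) {
        if (coef.length != x.length) {
            throw new IllegalArgumentException("one coefficient per metric is required");
        }
        double z = intercept;
        for (int i = 0; i < x.length; i++) {
            z += coef[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Turns the probability into a two-class decision using an assumed cutoff.
    static boolean isReusable(double p, double cutoff) {
        return p >= cutoff;
    }
}
```

The coefficients would be estimated from the training data; a cutoff of 0.5 is a common default, but it can be tuned to trade precision against recall.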
Where
Logistic regression performance
Performance measurement of ML algorithms in RozGaar.
This paper presents the prototype model developed in Java for test and validation purposes. Apart from this, the app was developed in two versions: one is a mobile app and the other is a website. The minimum requirements are as follows: we have used 24 MB of RAM and an Intel Q8400 2.66 GHz processor. Other parameters are the size of the population, the crossover rate, the mutation rate, and the chromosome length, all of which can be set through the system.
In this research work, all algorithms were implemented with the MATLAB 2015a software tool. Since the proposed work intends to assess reusability for web of service software developed with the object-oriented software design paradigm, the WSImport tool was first applied to convert the web of service software projects into Java files. Once the projects were converted into Java files, the metrics of the respective classes were obtained using the CKJM tool. CKJM estimates the object-oriented CK metric values, in particular for the WMC, DIT, NOC, CBO, RFC, and LCOM metrics. Once the CK metric values were retrieved, linear univariate regression was applied to estimate the reusability threshold values for the data. This was followed by feature extraction, RSA-based feature reduction and cluster validation, and then by reusability prediction using different prediction techniques. The prediction outcomes were obtained in terms of a confusion matrix (see Table 4). From the confusion matrix, the performance of the respective classifiers was obtained in terms of reusability prediction accuracy and F-Measure.
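Accuracy and F-Measure follow from the four cells of a two-class confusion matrix, as in the sketch below. The class name `ConfusionMetrics` is illustrative, and the counts used when calling it are placeholders rather than values from Table 4.

```java
// Hypothetical helper: performance measures from a two-class confusion matrix.
final class ConfusionMetrics {

    // Share of all predictions that are correct.
    static double accuracy(int tp, int fp, int fn, int tn) {
        return (double) (tp + tn) / (tp + fp + fn + tn);
    }

    // Share of predicted-reusable classes that really are reusable.
    static double precision(int tp, int fp) {
        return (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    // Share of truly reusable classes that were found.
    static double recall(int tp, int fn) {
        return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    // F-Measure: harmonic mean of precision and recall.
    static double fMeasure(int tp, int fp, int fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return (p + r) == 0.0 ? 0.0 : 2.0 * p * r / (p + r);
    }
}
```

Because the F-Measure ignores true negatives, it can differ noticeably from accuracy when the REUSABLE and NON-REUSABLE classes are unbalanced.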
Results
We have derived the confusion matrix from our mobile app (RozGaar). From Table 4 it can be concluded that the decision tree (DT) gives the highest performance, with an accuracy of 93.41% and an F-Measure of 95.45%. In Table 3, the accuracy is 87% and the F-Measure is 91.85%, which is the second highest performance. Naïve Bayes gives the lowest performance, with an accuracy of 81.32% and an F-Measure of 87.94%. Figure 12 indicates that DT induction provides better results (accuracy 93.41%, F-Measure 95.45%) than the other ML (machine learning) algorithms. In order to forecast the job scenarios from the app (RozGaar) accurately, this paper used ML algorithms. Algorithms such as DT induction and AGA (Adaptive Genetic Algorithm) are used to optimize the prediction. AGA is used to select the model parameters of the SVM (Support Vector Machine) to obtain a better prediction performance. We have proposed a model that predicts job opportunities from the mobile app. We have tested the model on a sample of 54 web-based application projects (most of them Android apps). Finally, the decision tree algorithm provides the highest accuracy (Table 4).
Conclusion and future scope
The system's activities can be divided into two major parts, clients and service providers, although administrative services are also present to maintain the RozGaar application. Each has its own role to perform, and the system responds accordingly. Our mobile app achieves a prediction accuracy of 93.41% and an F-Measure of 95.45%. In the near future, researchers can use several machine learning algorithms to predict more jobs. Researchers can use suitable algorithms such as SVM (Support Vector Machine), Polynomial Regression, and GA-ANN, which may be more accurate. Furthermore, the AGA-SVR model gives a better forecasting performance than other models such as SVM, rough set analysis, and linear regression. Consequently, AGA (Adaptive Genetic Algorithm) and DT (decision tree) induction could be considered efficient alternative methods for forecasting jobs.
