Abstract
Credit scoring plays a vital role for financial institutions in estimating the risk associated with an applicant for a credit product. The risk is estimated from the applicant’s credentials and directly affects the viability of the issuing institution. However, a credit scoring dataset may contain a large number of irrelevant features, which can lead to poorer classification performance and higher model complexity. Removing redundant and irrelevant features may overcome this problem. In this work, we emphasize the role of feature selection in enhancing the predictive performance of credit scoring models. For feature selection, the Binary BAT optimization technique is utilized with a novel fitness function. The proposed approach is then combined with “Radial Basis Function Neural Network (RBFN)”, “Support Vector Machine (SVM)” and “Random Forest (RF)” for classification. The proposed approach is validated on four benchmark credit scoring datasets obtained from the UCI repository. Further, a comprehensive analysis of the experimental results compares the classification performance obtained with features selected by various approaches and with other state-of-the-art approaches for credit scoring.
Introduction
Credit risk is the principal concern for financial institutions, and effective credit risk assessment is central to their survival and growth. As stated by Thomas et al. [25], “Credit scoring is a set of decision models and their underlying techniques that aid credit lenders in the granting of credit” [17]. In other words, credit scoring is a way to determine the risk associated with credit products and applicants [19, 27]. Further, based on the evaluated credit risk, an applicant can be categorized into the “good credit or legitimate”, “bad credit or suspicious”, or “moderate credit” class [26, 29]. The discriminative ability of a credit scoring model is important for financial institutions, and a slight improvement in predictive precision could result in a noteworthy increase in revenue or reduction in potential losses [6, 28]. Financial institutions carry out credit scoring at various stages: application scoring evaluates the legitimacy or suspiciousness of a new applicant based on social, financial and other status; behavioural scoring is applied to active consumers to analyse their behavioural patterns in support of “dynamic portfolio management processes”; collection scoring separates consumers into groups so that appropriate attention can be given to each group; and fraud detection ranks consumers according to the relative likelihood that a consumer may be dishonest [13, 22].
In the literature, credit risk evaluation has been viewed as a classification problem, and machine learning has been found to be a reliable way to explore hidden patterns in applicants’ details. In this context, a range of Machine Learning (ML) techniques such as “Artificial Neural Network (ANN)” and “Support Vector Machine (SVM)” have been utilized to model risk evaluation systems and to improve credit risk prediction. Li et al. [14] proposed a credit assessment model using SVM with an optimized hyperplane that maximizes the margin of separation between two classes in order to recognize potential candidates for consumer loans. An approach based on “Least Squares Support Vector Machine (LS-SVM)” with a Bayesian evidence framework has been used to categorize the reliability of potential corporate clients [31]. Zhou et al. [37] applied a weighted SVM with Genetic Algorithm (GA) based parameter optimization and t-test based feature weighting to build a credit scoring model. West [34] presented a comparative analysis of various neural networks, parametric models and non-parametric models for classification and evaluated their performance in terms of classification accuracy.
A credit scoring dataset describes the financial, social, personal and other status of a credit applicant. It therefore has a large number of features, some of which may be irrelevant. With high-dimensional and heterogeneous features, credit scoring models will be unstable and will have high computational complexity [33]. Hence, selecting significant features (or eliminating extraneous features) is a way to decrease the computational complexity and to obtain better accuracy [18, 33]. Feature selection is an approach for choosing the input features that are most relevant to a particular outcome [6]. The major intention behind feature selection is to improve prediction performance while making the model faster and more cost-effective, because with a large number of features noise may increase, which can affect classification accuracy, and training the classifier also takes more time. Feature selection methods can be used to isolate and eliminate irrelevant and redundant attributes that reduce the predictive capability of a model. Suppose that there are N features in total in the dataset; then, for the selection of M features, there are N! / (M! (N − M)!) possible combinations. Checking all possible combinations to find the best feature subset is an exhaustive search and is not feasible in practical time. Thus, the meta-heuristic Binary BAT Algorithm (BBA) has been utilized to find an approximately optimal subset of the features. Moreover, a new objective function is proposed for this combinatorial optimization problem.
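To give a sense of scale for the count above, the following minimal sketch evaluates the number of M-feature subsets with Python's standard `math.comb`. The value N = 20 and the helper names are illustrative assumptions, not figures or code taken from this work.

```python
import math

def subset_count(n_features: int, m_selected: int) -> int:
    """Number of ways to choose m_selected features out of n_features."""
    return math.comb(n_features, m_selected)

def all_subsets_count(n_features: int) -> int:
    """Number of non-empty feature subsets of any size, i.e. 2^N - 1."""
    return sum(math.comb(n_features, m) for m in range(1, n_features + 1))

N = 20  # illustrative assumption, not a dataset property
print(subset_count(N, 10))    # subsets containing exactly 10 features
print(all_subsets_count(N))   # every non-empty subset
```

Since each candidate subset requires training and testing a classifier, the count grows too quickly for exhaustive evaluation even at moderate N.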
The remainder of the article is structured as follows: Section 2 describes the BAT optimization approach, Section 3 presents the proposed approach for feature selection, and Section 4 presents a comparative analysis with state-of-the-art approaches for credit scoring, followed by the concluding remarks.
Binary BAT algorithm
Bats (microbats) have a special ability called echolocation. They emit a loud, short pulse of sound and wait for a short time; when the echo returns, they can calculate the distance to the object. Using this ability of bats, Yang (2010) [36] proposed a new meta-heuristic optimization technique and named it the BAT Algorithm (BA). In this algorithm, a group of bats searches for food or prey using their echolocation ability. Based on the echolocation characteristics of bats, Yang (2010) [36] proposed some idealized rules, which are presented below.
Equations 1-3 can, to some extent, guarantee the exploitation capability of BA. To enhance this capability further, a random walk procedure has been added, as in Equation 4.
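The equations referenced in this section are not reproduced in this copy of the text. For orientation only, the block below restates the bat algorithm update rules as they are commonly written in the literature following Yang [36]. The mapping to Equations 1-6 is an assumption inferred from the surrounding text, and the symbols β, ε, α and γ are the customary names rather than ones confirmed by this article; L_i, R_i and G_best follow the notation used later for loudness, pulse emission rate and the global best.

```latex
% Assumed Equations 1-3: frequency, velocity and position updates
\begin{align}
F_i &= F_{\min} + (F_{\max} - F_{\min})\,\beta, \qquad \beta \sim U(0,1) \\
V_i(t+1) &= V_i(t) + \bigl(X_i(t) - G_{best}\bigr)\,F_i \\
X_i(t+1) &= X_i(t) + V_i(t+1)
\end{align}
% Assumed Equation 4: random walk around a selected best solution,
% scaled by the average loudness of all bats at step t
\begin{equation}
X_{new} = X_{old} + \varepsilon\,\bar{L}(t), \qquad \varepsilon \in [-1, 1]
\end{equation}
% Assumed Equations 5 and 6: loudness decay and pulse emission rate growth
\begin{align}
L_i(t+1) &= \alpha\,L_i(t), \qquad 0 < \alpha < 1 \\
R_i(t+1) &= R_i(0)\,\bigl[1 - \exp(-\gamma t)\bigr], \qquad \gamma > 0
\end{align}
```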
BA works on a continuous search space, whereas feature selection is an optimization problem over a binary search space. In the continuous case, the new positions of the bats are obtained from Equation 3 by adding the velocity to the previous position. In a discrete or binary space, however, each position must be either 1 or 0, so position updating in binary spaces differs from that in continuous spaces. The binary version of the BAT algorithm (BBA) is similar to BA but differs in the transfer function and the position updating rule [20, 30]. Mirjalili et al. (2014) [20] revised the transfer function to map the continuous search space to a discrete search space, as given in Equation 9.
Further, the positions of the bats are updated as in Equation 8 by considering the updated velocity as in Equation 7.
The limitation of Equations 7 and 8 is that a hard threshold converts the position values into either 1 or 0, so the positions of the bats do not change when their velocities increase. To solve this problem, the authors of [20] proposed a V-shaped transfer function and a position updating rule, given in Equations 9 and 10.
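The difference between the two update styles can be sketched as follows. This is a minimal illustration, not the article's Equations 7-10, which are not reproduced in this copy: the sigmoid rule uses the conventional binary-swarm form (bit set to 1 when a random draw falls below the transfer value), and the V-shaped function uses the arctangent form commonly attributed to Mirjalili et al. [20]. The function names are hypothetical, and the random draws are passed in explicitly so the example is repeatable.

```python
import numpy as np

def s_shaped(v):
    """Sigmoid transfer function: maps a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(v, dtype=float)))

def v_shaped(v):
    """V-shaped transfer function |(2/pi) * arctan((pi/2) * v)|, in [0, 1)."""
    v = np.asarray(v, dtype=float)
    return np.abs((2.0 / np.pi) * np.arctan((np.pi / 2.0) * v))

def update_s_shaped(velocity, rand):
    """Set a bit to 1 where rand < S(v), else 0; the old position is ignored."""
    return (np.asarray(rand) < s_shaped(velocity)).astype(int)

def update_v_shaped(position, velocity, rand):
    """Flip a bit where rand < V(v); otherwise keep the current bit."""
    position = np.asarray(position, dtype=int)
    flip = np.asarray(rand) < v_shaped(velocity)
    return np.where(flip, 1 - position, position)

pos = np.array([1, 0, 1, 0])
vel = np.array([0.0, 0.0, 4.0, -4.0])
rand = np.array([0.3, 0.7, 0.5, 0.5])  # fixed draws, for repeatability only
print(update_s_shaped(vel, rand))
print(update_v_shaped(pos, vel, rand))
```

With the V-shaped rule, a velocity of large magnitude in either direction makes a flip likely, while a velocity near zero leaves the bit unchanged; the sigmoid rule instead assigns a bit value regardless of the current position.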
The proposed work emphasizes improving classification accuracy while reducing the number of features in credit scoring datasets. This section presents the feature selection approach and the proposed fitness function.
Feature selection
In this work, a new feature selection algorithm using BBA with a novel fitness function is proposed. For feature subset selection, the main motive is to select a small set of valuable features that improves classification performance. The proposed architecture for feature selection utilizing BBA is shown in Fig. 1.

Architecture of the proposed model for feature selection.
For feature selection, the dataset with all features is divided into two parts, a training dataset and a testing dataset, denoted Tr and Ts respectively. In this algorithm, the population, positions, loudness and pulse emission rate of the bats are first initialized. The position of each bat is selected randomly as a vector of 0s and 1s whose length equals the total number of features in the dataset. A value of 1 indicates that the corresponding feature is present, and 0 indicates that it is absent. Further, new training and testing datasets, D1 and D2, are generated from Tr and Ts according to each bat’s position. The classifier is trained on D1 and tested on D2 to calculate the fitness value of each bat. Further, the loudness L_i and the rate of pulse emission R_i are updated as per Equations 5 and 6 respectively if a new solution has been accepted. Generally, the pulse emission rate increases and the loudness decreases after a bat catches its prey. The steps of the feature selection algorithm are as follows:
1. Initialize the population and positions of the bats. The size of each bat must equal the number of features in the dataset, with each entry chosen randomly as either 1 or 0, where 1 means that the feature at that position is present and 0 means that it is not.
2. Initialize the velocity, loudness and frequency of the bats.
3. Create the training and testing datasets from the original dataset.
4. For each bat, generate the training and testing datasets with the selected features (D1 and D2).
5. Calculate the fitness value for each bat and find the local best based on the fitness value.
6. Update the velocity, loudness and frequency of each bat.
7. Repeat the per-bat evaluation and update steps until the maximum number of iterations is reached or the fitness value reaches the threshold.
8. Find the global best, G_best; the position of G_best gives the selected features.
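The procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name `bba_select`, the parameter defaults, the acceptance test, the single-bit local search and the guard against empty feature sets are all choices made for this sketch. The `fitness` argument stands in for "train the classifier on D1, test on D2, and score the result"; a toy fitness is used here so that the block runs on its own.

```python
import numpy as np

def bba_select(fitness, n_features, n_bats=20, n_iter=50,
               f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = np.random.default_rng(seed)
    pos = rng.integers(0, 2, size=(n_bats, n_features))  # 1 = feature present
    pos[pos.sum(axis=1) == 0, 0] = 1                      # avoid empty subsets
    vel = np.zeros((n_bats, n_features))
    loud = np.ones(n_bats)             # loudness L_i
    rate0 = rng.random(n_bats)         # limiting pulse emission rate R_i(0)
    rate = np.zeros(n_bats)            # current pulse emission rate R_i
    fit = np.array([fitness(p) for p in pos])
    best, best_fit = pos[fit.argmax()].copy(), fit.max()

    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = vel[i] + (pos[i] - best) * freq
            # V-shaped transfer: probability of flipping each bit
            prob = np.abs((2 / np.pi) * np.arctan((np.pi / 2) * vel[i]))
            cand = np.where(rng.random(n_features) < prob, 1 - pos[i], pos[i])
            if rng.random() > rate[i]:
                # Simplified local search: copy one random bit from the best bat
                j = rng.integers(n_features)
                cand[j] = best[j]
            if cand.sum() == 0:
                cand[rng.integers(n_features)] = 1
            cand_fit = fitness(cand)
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha                                  # loudness falls
                rate[i] = rate0[i] * (1 - np.exp(-gamma * t))     # pulse rate rises
            if cand_fit > best_fit:
                best, best_fit = cand.copy(), cand_fit            # track G_best
    return best, best_fit

# Toy fitness (assumption): fraction of bits that agree with a hidden target mask
target = np.array([1, 0, 1, 1, 0, 0, 1, 0])
mask, score = bba_select(lambda m: float(np.mean(m == target)), n_features=8)
print(mask, score)
```

In an actual experiment, the lambda would be replaced by a function that builds D1 and D2 from the columns where the mask is 1, trains the chosen classifier and returns the fitness value.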
To form the fitness function, the sensitivity, the specificity and the cost of the selected feature set are considered. The main objective is to find the bat position that gives higher classification performance with fewer features. With these points in mind, a fitness function is designed that combines multiple criteria, namely Sensitivity (Sen), Specificity (Spe) and the number of features, as defined in Equation 12. Sensitivity is associated with a pre-defined weight W_a, which can be set to 1 when sensitivity is the most important criterion; in the same way, specificity and the cost of the selected features are associated with weights W_b and W_c respectively. The cost of the selected features is considered as the ratio of the total number of features, F_t, and the number of features selected as per the bat position, F_s. The main motive is to improve classification performance, which is a maximization problem, with fewer features, which is a minimization problem. Maximization and minimization cannot be combined directly in a single fitness function, so the number of features in the feature set is converted into a maximization term.
The probability of preserving a bat with a higher fitness value for the next generation is quite high. The weights given to sensitivity, specificity and the cost of the features can be adjusted as required for the optimized fitness value.
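As an illustration of how such a weighted fitness could be computed, the sketch below combines sensitivity, specificity and a feature term. The exact form of Equation 12 is not available in this copy, so the feature term `1 - F_s / F_t` is an assumption chosen only because it grows as fewer features are selected; the function names are hypothetical, and the default weights mirror the values reported in the experimental setup.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate; returns 0.0 when there are no positive samples."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn: int, fp: int) -> float:
    """True negative rate; returns 0.0 when there are no negative samples."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def fitness(tp, tn, fp, fn, n_selected, n_total,
            w_a=0.48, w_b=0.48, w_c=0.04):
    """Weighted sum of Sen, Spe and an ASSUMED feature term 1 - F_s / F_t."""
    sen = sensitivity(tp, fn)
    spe = specificity(tn, fp)
    feature_term = 1.0 - n_selected / n_total  # assumption, not Equation 12
    return w_a * sen + w_b * spe + w_c * feature_term

# Same confusion counts, different subset sizes (illustrative numbers only)
print(fitness(40, 45, 5, 10, n_selected=7, n_total=14))
print(fitness(40, 45, 5, 10, n_selected=10, n_total=14))
```

With equal classification results, the smaller subset receives the higher score, which is the behaviour the text asks of the combined objective.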
This section comprises three sub-sections: a description of the credit approval datasets and performance measures used in the experiments, a comparative analysis of the proposed approach against existing feature selection approaches, and a comparative analysis against prior work.
Credit approval datasets and performance measures
Four datasets are used in this experiment, namely “Australian Dataset (AUS)”, “German (categorical) Dataset (GCD)”, “German (numerical) Dataset (GND)” and “Japanese Dataset (JPD)”, acquired from the UCI Machine Learning Repository [1]. A comprehensive description of the datasets is given in Table 1. All of these are real-world benchmark credit scoring datasets and have a combination of categorical and numerical attributes.
Description of credit scoring datasets
There are various performance measures for evaluating classification. The most popular performance measure is accuracy. However, when a dataset is imbalanced towards a specific class, accuracy alone is not an adequate measure of model performance, because predicting only the majority-class samples well can still yield high accuracy. So, in this work, a second measure is also considered: the F1-score, which summarizes a test’s accuracy by combining its precision and recall. Accuracy and F1-score are calculated as per Equations 19 and 20 respectively.
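The point about imbalanced data can be checked with a small computation. The sketch below uses the standard textbook definitions of accuracy and F1-score from confusion-matrix counts; Equations 19 and 20 are not reproduced in this copy, so their agreement with these definitions is assumed, and the counts are invented for illustration.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative imbalanced case: 90 negatives, 10 positives,
# and a classifier that labels every sample as negative.
tp, tn, fp, fn = 0, 90, 0, 10
print(accuracy(tp, tn, fp, fn))
print(f1_score(tp, fp, fn))
```

Here the accuracy is high although no positive sample is detected, while the F1-score exposes the failure.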
The experimental results described in this section were obtained on an HP PC with a 3.60 GHz 8th-generation Intel Core i7 CPU, 16 GB RAM and the 64-bit Windows 10 operating system. The implementation was done in MATLAB R2012a. Pre-processing is an imperative phase in machine learning; here, “treatment of missing values”, “data transformation” and “data normalization”, followed by “data sampling”, are considered. After the pre-processing of datasets, AUS, GCD, GND and JPD datasets have 250, 653, 1000, 1000 and 601 samples respectively. Each pre-processed dataset is then separated into a training dataset (75%) and a test dataset (25%). Meanwhile, the proportion of the two classes (healthy or bankruptcy) in both the training and testing sets remains the same as in the original dataset. The proposed fitness function (Equation 12) requires three predefined weights, W_a, W_b and W_c. For this work, we have considered W_a = 0.48, W_b = 0.48 and W_c = 0.04. As BBA is a population-based approach, all the experiments are conducted with a population size of 50 and 100 iterations. In this work, we have applied the proposed approach with three classifiers, namely RBFN, RF and SVM. For RBFN, the spread parameter σ = 0.3 is used; for RF, we have experimented with 100 trees; and for SVM, the regularization parameter is C = 0.7.
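A 75%/25% split that keeps the class proportions of the original data can be sketched as follows. This is an illustrative NumPy version, not the MATLAB code used in the experiments; the function name, the seed and the synthetic labels are assumptions.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.75, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_parts, test_parts = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(train_fraction * len(idx)))
        train_parts.append(idx[:n_train])
        test_parts.append(idx[n_train:])
    return (np.sort(np.concatenate(train_parts)),
            np.sort(np.concatenate(test_parts)))

y = np.array([0] * 80 + [1] * 20)  # synthetic labels, for illustration only
train_idx, test_idx = stratified_split(y)
print(len(train_idx), len(test_idx))
print(y[train_idx].mean(), y[test_idx].mean())
```

Splitting within each class, rather than over the whole dataset at once, is what keeps the minority-class share the same in both parts.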
Figures 2a–2d present the convergence curves of the feature selection approach for each dataset and classifier. In these figures, the horizontal axis denotes the number of iterations and the vertical axis denotes the corresponding fitness value. BBA-RBFN, BBA-RF and BBA-SVM denote the convergence curves of BBA with the RBFN, RF and SVM classifiers respectively. As depicted in Fig. 2a, on the Australian dataset BBA converged within the 20th, 10th and 40th iterations with RBFN, RF [24] and SVM respectively, and BBA-RBFN obtained the highest fitness value. Similarly, for the German categorical, German numerical and Japanese datasets, depicted in Figs. 2b, 2c and 2d respectively, the proposed approach with RBFN (BBA-RBFN) achieves the highest fitness values.

Convergence curve of BBA based feature selection with respective classifiers on (a) AUS, (b) GCD (c) GND (d) JPD datasets.
As BBA is a population-based optimization approach, it sometimes converges well and sometimes does not. So, in order to show the stability of the proposed approach, the procedure is repeated 10 times with different training and testing sets. In each repetition, the dataset with the optimized feature set is partitioned by 10-fold cross-validation (10-FCV), and the mean of the 10-FCV results is considered for comparative analysis. The mean 10-FCV results in terms of accuracy and F1-score are depicted in Figs. 3–6 for the respective datasets and classifiers. From Figs. 3–6, it is observed that RBFN has better classification accuracy and F1-score than SVM and RF in most cases.
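The evaluation protocol described above, repeated 10-fold cross-validation with averaged scores, can be outlined as below. This is a sketch under assumptions: the helper names are hypothetical, the folds are not stratified, and `score_fold` is a placeholder for training a classifier on the training indices and scoring it on the test indices.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def repeated_cv_mean(score_fold, n_samples, k=10, repeats=10):
    """Average score_fold(train_idx, test_idx) over repeated k-fold runs."""
    run_means = []
    for r in range(repeats):
        folds = k_fold_indices(n_samples, k, seed=r)  # new shuffle per repeat
        scores = []
        for i, test_idx in enumerate(folds):
            train_idx = np.concatenate(
                [f for j, f in enumerate(folds) if j != i])
            scores.append(score_fold(train_idx, test_idx))
        run_means.append(np.mean(scores))
    return float(np.mean(run_means))

# Placeholder scorer (assumption): reports the share of samples held out
print(repeated_cv_mean(lambda tr, te: len(te) / (len(tr) + len(te)), 100))
```

Averaging over repeats with different shuffles reduces the dependence of the reported score on any single partition of the data.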

Comparison graph of RBFN, SVM and RF on Australian dataset.

Comparison graph of RBFN, SVM and RF on German (categorical) dataset.

Comparison graph of RBFN, SVM and RF on German (numerical) dataset.

Comparison graph of RBFN, SVM and RF on Japanese dataset.
Further, for comparative analysis of the results, various feature selection approaches are considered, namely “Stepwise Regression (STEP) [35]”, “Classification & Regression Tree (CART) [3]”, “Multivariate Adaptive Regression Splines (MARS) [7]”, “Correlation Coefficient (CORR) [8]” and “Multi-Cluster Feature Selection (MCFS) [4]”. The mean of 10-FCV over 10 repetitions is utilized for comparative analysis, and the results are presented in Table 2 for each dataset. On the Australian dataset, as tabulated in Table 2, the proposed approach (PA) achieves 87.61% accuracy and an F1-score of 84.93%. PA with RBFN beats the best competing feature selection method, CART, with improvements of 0.96% in accuracy and 0.23% in F1-score. PA is also applied with two more classifiers, RF and SVM, and achieves better performance with both. Overall, PA improves the classification performance of RBFN, RF and SVM, and PA with RBFN has the best classification performance compared with the feature selection approaches STEP, CART, MARS, CORR and MCFS combined with RBFN, SVM and RF based classification. On the other datasets, the proposed approach likewise has the best classification performance compared with the other approaches, except on the Japanese dataset, where MCFS with RF and CORR with SVM have the best performance. Overall, it can be concluded that the proposed approach for feature selection gives better classification performance with RBFN, RF and SVM on the four credit scoring datasets.
Performances of RBFN, RF and SVM with various feature selection approach on various credit scoring datasets
This sub-section compares the outcomes obtained by the proposed method with outcomes reported in the literature on credit scoring datasets. These results are tabulated in Table 3 in terms of classification accuracy, together with the respective datasets, the approaches applied and their references. From the outcomes in Table 3, it is found that the proposed approach achieves the best performance on the Japanese, German (categorical) and German (numerical) datasets, and the 4th best performance on the Australian dataset. As this study focuses on combining feature selection with classification, it can be concluded overall that the proposed approach (PA-RBFN) has the best performance on most of the real-world credit scoring datasets.
Performance of various credit scoring models on credit scoring datasets
In this paper, a Binary BAT algorithm based feature selection method has been proposed with a novel fitness function. The fitness function is based on classification performance and the cost of the features selected (the bat position) by the Binary BAT algorithm. The proposed approach has been evaluated on credit scoring datasets with three classifiers, namely RBFN, RF and SVM. Further, the results are compared with various feature selection approaches such as STEP, CART, MARS, CORR and MCFS combined with RBFN, RF and SVM, and with various credit scoring models from the literature. From the experimental outcomes, it is found that the proposed feature selection method with RBFN performs better than the same method with SVM and RF in terms of accuracy, F1-score and convergence rate. The features selected by the proposed approach are more representative and improve the classification performance of RBFN, SVM and RF compared with existing feature selection approaches. Overall, it can be concluded that the proposed approach has the best performance on most of the real-world credit scoring datasets.
