Abstract
In today’s business world, identifying customers and analyzing their behavior is important for the banking industry. Customer Relationship Management (CRM) is the process of maintaining profitable customer relationships by delivering customer value and loyalty, and it helps to improve business relationships with customers. The goal of CRM is to maximize the lifetime value of a customer to an organization. Customer Lifetime Value (CLV) can rank and classify customers based on their lifetime value in order to identify valuable customers and retain them. Several models estimate CLV from customers’ past data, which helps organizations in their attempts to retain valuable customers. Banks must use appropriate data mining techniques to extract patterns and information from existing data to gain a competitive advantage; data mining techniques therefore play an important role in extracting hidden knowledge and information. The goal of this study is to review the data mining techniques used for analyzing bank customers, in order to help banks better identify their customers and design more efficient marketing strategies. The literature covered in this paper spans seventeen years (2001–2017), and the approaches are compared in terms of data sets, prediction accuracy, and so on. We also provide a list of data sets available to the scientific community for research in this field. Finally, open issues and future work on each of these items are presented.
Introduction
In today’s business environment, it is essential to identify and analyze customers’ needs to gain a competitive advantage. Managers should try to retain customers and focus on key customers in order to reduce their costs and gain profit. Today, customers share their negative experiences with one another using communication technologies, which can lead to the loss of current customers’ trust. Organizations can identify customers and meet their needs to increase customer loyalty. Therefore, retaining key customers is more beneficial for banks than attracting new customers [1]. In a continuously changing, competitive business environment, organizations have to analyze and understand customer needs and behavior. “Customer dynamics” is one of the most important issues to consider when analyzing customer behavior, because customer behavior is often complex and uncertain in today’s dynamic situation. Considering the dynamics of large organizations can lead to improvements [2], and organizations that understand their customers’ behavior can improve their marketing strategy using behavioral scoring models, which help to analyze customer behavior [3]. Obtaining customer satisfaction is a modern approach to quality control in organizations. To reinforce customer orientation, many organizations choose customer satisfaction as their main performance indicator, although it is almost impossible to achieve [6]. The loss of customers can reduce an enterprise’s profits. The cost of attracting a new customer is five times higher than that of retaining an existing one. Therefore, customer retention is a core Customer Relationship Management (CRM) issue. CRM creates a strong link between the organization and its customers, which ultimately increases customer loyalty and enhances return on investment. Organizations can strengthen customer relationship management and prevent the loss of key customers [4, 57], an improvement that leads to competitive advantages.
Companies recognize that CRM is a fundamental tool for building customer value, which in turn helps to increase enterprise value [5, 54]. Organizations try to increase Customer Lifetime Value (CLV) through CRM. There are many methods for evaluating CLV that organizations can use to gain profit. The spectrum of research goals is shown in Fig. 1.
Spectrum of research goals.
Timeline of the studies in the literature.
Fig. 2 shows a timeline of the studies in the literature. For this purpose, this paper reviews the studies on CRM and CLV. The rest of the paper is organized as follows: in Section 2, CRM, CLV, and customer segmentation are briefly introduced. In Section 3, the data sets used in recent studies are reviewed. Section 4.1 introduces data mining (DM) techniques applied to CRM, CLV, and customer segmentation, and in Section 4.2 the literature is categorized according to these techniques. We analyze these studies from different aspects and try to provide a comprehensive overview of the literature on this subject, which can be helpful for researchers interested in this field. Evaluation results of the studied papers are presented in Section 4.3. In Section 5, we turn to open issues in this context. Finally, the studies and their results are discussed comprehensively in Appendix II, where the information is collected in one table so that it can be easily compared.
CRM and e-CRM
Customer Relationship Management (CRM) is a strategy that allows a business to manage its customer relationships. CRM is known as a process for improving the relationship between organizations and customers, and as a practice for creating lasting connections between an organization and its customers. The goal of CRM is to forge closer and deeper relationships with customers. The most important factors that affect customer loyalty are satisfaction, trust, commitment, perceived value, perceived quality, intuitive image, empathy, and switching barriers. Through the management of relationships, CRM attempts to attract and retain customers, increase customer loyalty and satisfaction, and create value for customers. According to studies, the cost of attracting a new customer is about five times that of retaining an existing one. Therefore, enterprises have found that it is critical to develop long-term relationships with their customers to achieve profitability and customer satisfaction in the long run [7, 10, 37, 38, 50, 51, 52]. CRM includes four dimensions: customer identification, customer attraction, customer retention, and customer development. For example, customer satisfaction has many benefits for firms, such as enhancing firm reputation, attracting and retaining customers, and increasing customer loyalty [8, 13, 15]. Note that in [9], the authors surveyed the existing techniques in the field of CRM and explored some of the main challenges. However, that review had a general scope and did not focus on the banking industry. In fact, performance evaluation and CRM in the banking industry are more complicated and involve several factors [49].
The use of Internet for commerce presents an opportunity for businesses to use it as a platform for the delivery of CRM functions on the Web (e-CRM). E-CRM expands the traditional CRM techniques by integrating new electronic channels such as web, wireless, and voice technologies, combining them with e-business applications into an overall enterprise CRM strategy. E-CRM is the adaptation of CRM in an e-commerce environment and helps to create and maintain customer relationship using the net [11, 12].
CLV
Customer Lifetime Value (CLV) is part of analytical CRM. The goal of CRM is to maximize the lifetime value of a customer for an organization, so evaluating CLV is important for enterprises that want effective, marketing-oriented CRM. Studies have shown that the past behavior of a customer may not always predict their future behavior. Therefore, a metric is needed to predict the future profitability of a customer. CLV is a metric that represents the total net profit a company earns from a customer, and it is a concept in the CRM domain for evaluating a customer’s value. In addition, it is an important metric for determining how much money a company intends to spend to acquire new customers and how much repeat business a company can expect from certain consumers. CLV helps to identify the most valuable customer segments based on customer value over time and is thus considered an important subject. CLV can rank and classify customers based on their lifetime value in order to identify valuable customers and retain them, as well as to identify and compare market segments [7, 13, 17, 18, 58, 59]. There are several models that estimate CLV from customers’ past data, through which we can determine which customers are more profitable than others. This helps firms to retain their valuable customers [16]. CLV models are used to evaluate customer loyalty, recommend the right products to the right customers, and support individual marketing decisions for each customer. Some CLV models are based on past behavior and some on future customer revenue. In addition, a number of CLV models can project the future monetary value of customers [14].
In the following, a summary of some key CLV models is provided. These models are divided into three categories [17, 18]:
Models for calculation of CLV: This category includes models that are specifically formulated to calculate the CLV.
Models of customer base analysis: The analysis of past behavior of customers to predict their future behavior.
Normative models of CLV: These models have been developed to understand the issues concerning CLV.
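To make the first category concrete, the following is a minimal sketch of one common textbook formulation of CLV: the discounted sum of expected margins over a finite horizon. The formula, the function name `clv`, and its parameters (a constant margin per period, a constant retention rate, a discount rate, and the number of periods) are assumptions chosen for illustration; they are not taken from the reviewed papers, which use a variety of models.

```python
def clv(margin, retention, discount, periods):
    """Illustrative CLV: discounted sum of expected margins.

    CLV = sum over t = 1..periods of margin * retention**t / (1 + discount)**t

    Assumes the margin is earned at the end of each period and that the
    customer is still active in period t with probability retention**t.
    """
    return sum(
        margin * retention ** t / (1.0 + discount) ** t
        for t in range(1, periods + 1)
    )


# Hypothetical customer: margin 100 per period, 50% retention, no discounting.
print(clv(100.0, 0.5, 0.0, 2))  # 50 + 25 = 75.0
```

Other conventions (for example, starting the sum at t = 0 or using an infinite horizon) give different values, so the result should be read relative to the assumptions stated in the docstring.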
Customer segmentation
Customer segmentation is the process of dividing customers into distinct and meaningful groups of homogeneous customers on the basis of common attributes. Customer segmentation increases customer satisfaction and the company’s expected profit. It is important for companies that want to develop effective marketing strategies, because tailoring marketing strategies to each segment enhances the value of customers.
Customer segmentation can be done based on various criteria, including degree of customer loyalty, purchase frequency, purchase volume, demographics, etc. The banks can differentiate themselves from their competitors through customer segmentation. Nowadays, customer segmentation has become important in the banking industry, and much research has been done in this respect. On the one hand, customers are demanding quality services from their banks, so that the banks have to keep their customers by customer segmentation to enhance the quality of their services. On the other hand, identifying valuable customers is an important issue and segmentation is an appropriate method to recognize the customers. Cluster analysis is defined as “partitioning data into meaningful subgroups, when the number of subgroups and other information about their composition may be unknown”. There are many models for customer segmentation like SOM (Self-Organizing Map), fuzzy method, RFM (Recency, Frequency and Monetary), combination of some models (RFM, AHP, and K-means algorithm) and so forth [2, 19, 20, 21, 22, 59, 61].
Data sets
Every study on CRM in the banking industry relies on a data set, to which several DM techniques are applied to achieve the study’s goals. In the reviewed studies, data were collected from banks, and some data were taken from other industries. Table 1 summarizes these data sets.
Supervised techniques used in CRM, CLV and bank customer segmentation.
Data mining (DM) techniques
DM techniques can help a company to discover and extract useful information from their data warehouses, which cannot be discovered directly. This information can be used for the prediction of their customer behaviors. DM plays an important role for improving customer relationship management. Organizations use DM to improve their competitive advantage and add value to the customer. Therefore, organizations can discover customer needs to increase competitive advantages, respond to the expectations of customers and offer quality service. In recent decades, DM has attracted a lot of attention in data science for diagnosing heart diseases [23, 52], designing software architecture [40, 41, 42, 43], selecting design pattern [44, 62], traffic accident prediction [63], fraud detection [64], ATM (Automated Teller Machine) management [65] and so on, and its techniques have been applied on almost every subject to analyze data and get accurate and reliable results based on the defined goals. Many studies have employed DM to analyze customer data but some have attempted to discover new DM techniques [8, 5, 17]. The important issue is how we can gain better insight and knowledge through using DM techniques to improve CRM and increase CLV and thereby create more profit for customers and company [2]. DM is a technology for extracting information from customer data, which is useful to identify customer demand, effectively promote customer value, and predict their future behavior. DM can also be considered as a process and a technology that uses statistical algorithms to discover hidden patterns and information from the existing data to gain competitive advantage [9, 25, 26]. Organizations have started using DM technologies to gain customer loyalty and increase the contribution of customer value [46].
DM techniques have been used to predict customer behavior and can obtain previously unknown and potentially useful information (including knowledge rules and regularities) by searching through a database. DM is a stage in Knowledge Discovery in Databases (KDD). In today’s competitive business environment, organizations can improve their competitive advantage by using DM. Various applications are used in marketing, finance, and banking, and these applications help to collect and store large amounts of data. DM derives models from databases, discovers patterns and correlations in data, and can uncover useful customer behavior patterns in large data repositories. In a dynamic business environment, companies have to analyze and understand customer needs and behavior, and they need a deeper understanding of that behavior. Customer behavior is often uncertain and changes over time, so organizations can use a number of methods to predict changes in customer behavior and need to focus on mining changes in databases [2, 15, 24, 26]. Retaining current customers and acquiring new ones is an important matter in the banking industry. Therefore, a number of methods based on bank databases should be used for analyzing customer behavior. Analyzing bank customers is a difficult task because bank databases are multi-dimensional and comprise monthly account records and daily transaction records. In addition, there is a variety of methods for analyzing customer behavior, so selecting the appropriate method is important.
In this paper, we review some DM techniques for analysis of bank customer behavior. DM techniques are divided into several groups as follows:
Supervised
Unsupervised
Evolutionary learning
Other
In the following, we discuss these techniques.
Supervised
Several supervised techniques are used in knowledge extraction. Those that have been applied in the banking industry are shown in Fig. 3.
Decision tree is a supervised classification technique in DM. Decision tree is a predictive model that can be regarded as a tree. This technique represents sets of decisions. The decision tree is most often used for classification of a data set and can produce understandable rules. Specific decision tree techniques such as Classification and Regression Trees (CART), Chi Square Automatic Interaction Detection (CHAID), and Commercial version (C5.0) are highly efficient [10]. In [8], CHAID and C5.0 decision tree techniques are used for classification of customers and help the banking industry to make decisions. They provided a set of rules that can be applied to a new data set to predict which records will have a given outcome. CHAID analysis is an algorithm used for discovering the relationships between variables. CHAID analysis builds a model or tree to help determine how variables best merge to explain the outcome for a given dependent variable. In practice, CHAID is often used in direct marketing to understand how different groups of customers might respond to a campaign based on their characteristics. In [26], authors used CHAID through Applied Matrix for an initial segmentation modeling. Applied Matrix helps companies increase their competitive advantage, the lifetime value of customer, and decrease cost of customer acquisition. In this study, CHAID identified market segments that were formed by interactions among predictors of a chosen criterion variable like customer age. Another technique of decision tree is C5.0, which is a later version of C4.5 and can use continuous data, information theory, and learning method to build a decision tree. C5.0 creates decision tree from a data set by using Gain and Gain Ratio parameters. In [28], authors have used the C5.0 model to produce rules for predicting the level of loyalty based on demographic variables, on the obtained clusters from k-means and two-step algorithms. 
Moreover, in [4], C5.0 has been used as a classification technique. CART is a classification tree and a form of binary recursive partitioning. In [27], CART analysis has been used to build a regression tree, which partitions the customer base into a set of homogeneous sub-groups for clustering. This technique was applied to determine membership of a set of classes as a function of certain predictor variables.
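The Gain and Gain Ratio parameters mentioned for C5.0 can be illustrated with the textbook definitions used by C4.5-style trees. The sketch below is only an illustration of those definitions for one categorical attribute; the function names are hypothetical, and it does not reproduce the actual C5.0 implementation or the procedures of the cited studies.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_and_gain_ratio(values, labels):
    """Information gain and gain ratio of splitting `labels` by `values`."""
    n = len(labels)
    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical data: an attribute that separates the two classes perfectly.
print(gain_and_gain_ratio(["a", "a", "b", "b"], ["y", "y", "n", "n"]))
```

A tree learner of this family would evaluate every candidate attribute in this way and split on the one with the highest gain ratio.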
A neural network is an information-processing technique for capturing and representing complex data relationships. A neural network acquires knowledge through learning. Neural networks are organized as a set of layers of neurons, comprising input layers, hidden layers, and output layers, and the computations of the network are performed in the hidden layers. The layers are made up of a number of interconnected nodes. Neural networks have been used for creating predictive models, such as models of customer lifetime value. They have a wide range of uses and can be applied to both supervised and unsupervised DM and to estimation problems [25, 26]. In [29], neural networks have been used as the classification model to create a credit scoring model. The neural networks are built with the data of the existing customers. Then, all these existing customers are evaluated by the model in order to predict their credit status, which can be good or bad.
In [22], the K-Nearest Neighbor (KNN) technique is used to classify and identify the goods that customers favor most. KNN uses a database in which the data points are separated into several classes to predict the class of a new sample point. The recommendation systems in [22], designed as a combination of CF (Clustering features) and KNN, were used in business.
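The prediction step of KNN described above can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, majority voting, and a hypothetical function name `knn_predict`; the data are invented, and the sketch is not the system used in [22].

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the class of `query` by majority vote of its k nearest points."""
    # Indices of the training points, sorted by Euclidean distance to the query.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature customer data with two classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, (0.2, 0.1)))  # "A"
```

In practice the features would usually be scaled first, because Euclidean distance is sensitive to the units of each variable.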
In [25], the Naïve Bayes (NB) classifier is used, a simple probabilistic classifier based on Bayes’ rule. The authors used Bayes’ rule as the basis for designing learning algorithms. Bayesian classifiers predict the probability that a sample belongs to a particular class. This technique is accurate and fast to train with simple models and is thus used for large databases. According to [19], if the number of clusters is not known in advance, it can be determined by using the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC).
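How a Bayesian classifier turns counts into class probabilities can be sketched for categorical features. The code below is an illustrative Naïve Bayes with add-one (Laplace) smoothing; the function names, the smoothing choice, and the toy data are assumptions and do not describe the model built in [25].

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values_seen = defaultdict(set)         # feature index -> distinct values
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, y)][v] += 1
            values_seen[j].add(v)
    return class_counts, feature_counts, values_seen


def predict_proba_nb(model, row):
    """Posterior class probabilities for `row` under the independence assumption."""
    class_counts, feature_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for y, cy in class_counts.items():
        p = cy / total  # class prior
        for j, v in enumerate(row):
            # Add-one smoothing avoids zero probabilities for unseen values.
            p *= (feature_counts[(j, y)][v] + 1) / (cy + len(values_seen[j]))
        scores[y] = p
    norm = sum(scores.values())
    return {y: s / norm for y, s in scores.items()}


# Hypothetical customers described by (income level, has savings account).
rows = [("high", "yes"), ("high", "yes"), ("low", "no"), ("low", "no")]
labels = ["good", "good", "bad", "bad"]
model = train_nb(rows, labels)
print(predict_proba_nb(model, ("high", "yes")))
```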
Unsupervised
Unsupervised learning is a DM approach for extracting hidden structure from unlabeled data; the techniques are shown in Fig. 4. It does not require prior human knowledge of the classes and mainly uses clustering algorithms to group the data.
Unsupervised techniques used in CRM, CLV and bank customer segmentation.
In [2], the GSP (Generalized Sequential Pattern) algorithm is used to extract sequential patterns. It is a popular algorithm for sequence mining, which extracts the patterns that appear more frequently than a user-specified minimum support while maintaining the order in which their items occur. In [21], the Apriori algorithm is used to determine groups of customers by creating customer profiles and finding the relevant clustering rules.
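The notion of support for an ordered pattern can be sketched as follows. This covers only the support-counting step and makes simplifying assumptions: each event is a single item, and a pattern matches if its items occur in the same order, not necessarily consecutively. It is not a full GSP implementation (candidate generation and pruning are omitted), and the function names and data are hypothetical.

```python
def contains_in_order(sequence, pattern):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must come after the last.
    return all(item in it for item in pattern)


def support(sequences, pattern):
    """Fraction of sequences that contain `pattern` as an ordered subsequence."""
    return sum(contains_in_order(s, pattern) for s in sequences) / len(sequences)


# Hypothetical product-adoption sequences of three bank customers.
sequences = [
    ["loan", "card", "deposit"],
    ["card", "loan"],
    ["loan", "deposit"],
]
min_support = 0.5
pattern = ["loan", "deposit"]
print(support(sequences, pattern) >= min_support)  # True: 2 of 3 sequences match
```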
In [3], SOM (Self-Organizing Map) has been used. It is an unsupervised learning technique that relates multi-dimensional data. This technique is an artificial neural network algorithm that contributes to mapping high-dimensional data into a two-dimensionally represented space. SOM has been used in a wide range of applications, including financial data analysis, medical data analysis, time series prediction, and industrial control. It is built with customer’s data, which include variables from account and transaction data sets. In this model, all the customer’s data are used to build the behavioral scoring model in order to predict potential customer behavior. In SOM, it is difficult to predefine a network size without obtaining prior knowledge about the organization of the data to achieve acceptable results. Therefore, the use of a Growing Hierarchical Self-Organizing Map (GHSOM) has been suggested.
Chain model is another unsupervised technique. In [27], a model has been used with a combination of first-order Markov chain modeling and CART. This model is based on the analysis of homogeneous groups instead of individual customers. The chain model has been used in marketing, including customer lifetime valuation.
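The first-order Markov chain idea can be illustrated by estimating transition probabilities between customer states from observed state sequences. In this sketch the states stand for homogeneous customer groups (for example, groups produced by a tree such as CART); the function name, the state labels, and the data are assumptions for illustration and are not taken from [27].

```python
from collections import Counter, defaultdict


def transition_matrix(state_sequences):
    """Estimate first-order transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in state_sequences:
        # Count each observed move from one period's state to the next.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so that it sums to one.
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


# Hypothetical segment membership of two customers over three periods.
sequences = [["A", "A", "B"], ["A", "B", "B"]]
print(transition_matrix(sequences))
```

Multiplying such a matrix by the current distribution of customers over states gives the expected distribution in the next period, which is how chain models feed into lifetime valuation.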
Evolutionary techniques used in CRM, CLV and bank customer segmentation.
Clustering is a popular unsupervised learning technique. It is used for finding classes or groups of a data set with most similarities in the same cluster, while the dissimilar objects are in a different cluster. Clustering is a DM technique used to divide data into related groups without a foreknowledge of the group definitions. Clustering techniques are classified as hard clustering and fuzzy clustering. In fuzzy cluster or soft cluster, each point may belong to two or more clusters with different degrees of membership. The well-known hard clustering algorithm (K-means) and Fuzzy clustering algorithm are mostly based on Euclidean distance measure. FCM algorithm is a well-known Fuzzy Clustering method that permits one point to belong to two or more clusters based on fuzzy logic. In [7], FCM algorithm is used to cluster data into nine optimum clusters based on three values of recency, frequency, and monetary. In [1], fuzzy clustering was applied to collect and normalize data from 120 customers based on four different variables, namely length of the relationship, recency of trade, frequency of trade, and monetary value. K-means is an unsupervised DM technique and the most popular hard clustering technique that divides data into groups, and the objects in each cluster are very homogenous and dissimilar with other clusters. Each data point belongs to only one cluster. This technique requires previous knowledge about the number of clusters. It takes the input parameter
CF tree is a height-balanced tree that stores the clustering features for a hierarchical clustering. Each entry in CF tree represents a cluster of objects. In [19, 22], the authors have used CF tree. For example, in [22], CF is a well-known technique in recommendation systems, which can be categorized as neighborhood-based and model-based techniques. The CF is calculated based on the nearest distinctive neighbor for each cluster of customers.
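The clustering feature stored in each entry of a CF tree can be sketched as below, assuming the triple commonly used in BIRCH-style hierarchical clustering: the number of points, their per-dimension linear sum, and their sum of squares. The class name and methods are hypothetical, and the sketch shows only how entries summarize and merge clusters, not the tree construction itself.

```python
from dataclasses import dataclass


@dataclass
class ClusteringFeature:
    """Summary of a cluster: point count, linear sum, and sum of squares."""
    n: int
    linear_sum: list
    square_sum: float

    @classmethod
    def from_point(cls, point):
        # A single point is a cluster of size one.
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other):
        # Clustering features are additive, so merging needs no raw data.
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


# Two hypothetical customers described by two numeric attributes.
merged = ClusteringFeature.from_point((1, 2)).merge(ClusteringFeature.from_point((3, 4)))
print(merged.n, merged.centroid())  # 2 [2.0, 3.0]
```

Because the summary is additive, a cluster can absorb new points or sub-clusters in constant time per dimension, which is what makes the tree suitable for large customer databases.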
As shown in Fig. 5, another DM technique is evolutionary learning. This technique can be used in any classification-based prediction scenario. In addition, it helps to predict the value of a user-specified goal attribute based on the values of other attributes. For example, the banks have used this technique to predict credit. Genetic algorithm (GA) is a meta-heuristic algorithm used for data clustering. In [1], this algorithm was used for customer clustering.
Other
In addition to the above-mentioned techniques, some researches have attempted to suggest alternative techniques, which we have categorized as “Other” and are shown in Fig. 6.
Other techniques used in CRM, CLV and bank customer segmentation.
Delphi method is used in [32] for finding actual CRM definition and customer’s characteristics in the future. This method is based on collective intelligence for finding a common consensus. How a business can gain knowledge through using DM techniques to support intelligent decision making in customer dynamics management is an important subject.
The RFM model is a well-known method for customer dynamics segmentation. According to [15, 47, 48], RFM can be considered the most powerful behavior-based model for implementing CRM; it extracts customers’ past information by using specific criteria. RFM values are defined as follows:
RFM is a technique for customer behavioral analysis that can effectively investigate customer values. In [24], the RFM scoring model is used to transform the customer behavioral variables. Through RFM scoring, customers are segmented into various target markets in terms of customer value. RFM variables are important hidden variables in the database. In many studies, it has been shown that the higher
LRFM is the extended version of RFM model that is used to take the length of the relationship into account. Authors in [18] represented a model to calculate customer life time value (CLV) based on LRFM customer relationship model, which consists of four dimensions: relation length (
Percentage of different DM techniques used in CRM, CLV and bank customer segmentation.
Some studies have added a count item to the RFM model and implemented the RFMC model. The results revealed that the count item is not very useful and that the RFM model outperformed the RFMC model [15]. Therefore, CLV is calculated for each segment based on the weighted RFM method. In [56], an augmented RFM model called LRFMP is proposed to gain deeper and more reasonable insight into customers.
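A weighted RFM score of the kind referred to above can be sketched as follows. The sketch rests on several assumptions that are not taken from the reviewed studies: recency is measured in days since the last transaction (lower is better), each variable is min-max normalized across customers, and the weights are supplied by the analyst. The function names, weights, and data are hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_rfm(customers, w_r, w_f, w_m):
    """Weighted RFM score per customer.

    `customers` maps an id to (recency_days, frequency, monetary).
    Recency is inverted so that more recent customers score higher.
    """
    ids = list(customers)
    r = min_max([customers[c][0] for c in ids])
    f = min_max([customers[c][1] for c in ids])
    m = min_max([customers[c][2] for c in ids])
    return {
        c: w_r * (1.0 - r[i]) + w_f * f[i] + w_m * m[i]
        for i, c in enumerate(ids)
    }


# Hypothetical customers and hypothetical weights that sum to one.
customers = {"a": (10, 5, 1000.0), "b": (100, 1, 100.0)}
print(weighted_rfm(customers, w_r=0.2, w_f=0.3, w_m=0.5))
```

The resulting scores can then be used to rank customers or as input to a clustering step; how the weights are chosen is a modeling decision that this sketch leaves open.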
A weighted RFM integrates AHP and DM into customer segmentation. Some researchers proposed WRFM (Weighted RFM) instead of RFM. They dedicated weights to
In the previous section, we studied the DM techniques used by researchers for analyzing customer behavior in banks and other financial industries. We now evaluate these techniques based on the results reported in the references. Fig. 7 shows the usage percentage of the different techniques in this literature. As can be seen in Fig. 7, unsupervised learning techniques have the highest usage percentage.
Evaluation criteria
There are a number of criteria to evaluate DM techniques used in this study. We define these criteria and will discuss their obtained results. At first, we define TP, TN, FP, and FN according to the recommendation system [45].
Note that an increase in the number of recommendations causes a decrease in Precision and an increase in Recall.
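For reference, the standard definitions of the criteria built on TP, TN, FP, and FN can be sketched as follows. The function name and the example counts are hypothetical; the formulas are the usual textbook ones and are given as an aid to the reader rather than as a restatement of any particular study’s evaluation.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, Recall, Accuracy, and F-measure from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # F-measure is the harmonic mean of Precision and Recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts for a binary classifier evaluated on 100 customers.
print(classification_metrics(tp=40, tn=45, fp=10, fn=5))
```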
Results of Dunn index for clustering techniques [30].
Comparison of the Accuracy of different techniques.
If SSE is closer to 0, it indicates that the model has smaller random error component and is useful for prediction.
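As a reminder of what these error measures compute, the sketch below uses the standard definitions: SSE is the sum of squared differences between actual and predicted values, and the closely related mean squared error (MSE) divides SSE by the number of observations. The function names and sample values are hypothetical.

```python
def sse(actual, predicted):
    """Sum of squared errors between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def mse(actual, predicted):
    """Mean squared error: SSE divided by the number of observations."""
    return sse(actual, predicted) / len(actual)


# Hypothetical observed and predicted values.
print(sse([1, 2, 3], [1, 2, 5]))  # 4
```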
For example, in [1], a combined algorithm (fuzzy c-means clustering and a genetic algorithm) was proposed. To compare the efficiency of these algorithms, mean squared error (MSE) was used, and the results are shown in Table 2. According to this study, the combined clustering algorithm had a lower MSE; therefore, the authors suggested using this combination to obtain the most accurate clusters.
We studied the usage frequency of each criterion in the literature. The most frequently used criteria are Accuracy, Precision, Recall, and F-measure. Accuracy is the most widely used criterion in all classes of DM techniques, because it provides good insight for users. Fig. 9 shows the Accuracy results reported in the reviewed studies and allows different techniques and approaches to be compared in terms of Accuracy.
According to the results presented in [28], when the Two-step and K-means algorithms are used for customer segmentation to measure customer loyalty, the K-means algorithm performs poorly in recognizing the medium and very high loyalty levels, and the Two-step algorithm has a very low Accuracy in recognizing a low level of loyalty.
In [8, 25], some DM techniques were used to classify customers and help the organization in question to make decisions. In [8], the CHAID and C5.0 decision tree techniques have been applied. According to Fig. 9, the CHAID technique has more than 82% Accuracy, compared with about 74% for the C5.0 technique. As shown in Fig. 9, in [25], Neural Network and NB were used to predict customer behavior, and it was concluded that Neural Network is the best model for predictive performance, with an Accuracy rate of 88.63%.
According to our survey, numerous studies have been done on customer behavior so far. Some researchers believe that the importance of customer value in the financial services industry is seldom realized. In the banking industry, identifying customers and their needs is an important matter. Assessing the value of bank customers and determining their impact on the performance of banks requires identifying their key characteristics by using customer clustering. Through customer clustering, banks can identify their most profitable customers and design marketing strategies for each group of customers. In this review, we attempted to cover every aspect of this subject and reviewed almost all the studies conducted from 2001 to 2017, which are summarized in Appendix 2. We surveyed the data sets that have been used in recent studies. Moreover, we tried to review and summarize all DM techniques associated with the analysis of bank customers. We reported the experimental results presented in these studies and the criteria used in them. Some studies have used a combination of DM techniques to increase the Accuracy of their results. Moreover, there are many subjects in this domain that need further study, and we cite them in the following.
In today’s dynamic environment, identifying customers’ behavior is important for banking industry to gain a competitive advantage. The banks have to analyze their customers to identify their needs, which helps the banking industry to retain valuable customers and increase their CLV. Considering the importance of this subject, numerous studies have been done in this area in recent years. According to these studies, there are some issues that need more attention and we try to mention some of them in the following:
Some researchers have used a limited data set; therefore, researchers should prepare a general data set to achieve a more complete analysis and make better decisions in the future.
Because the Accuracy of customer clustering is an important matter, future studies need to use a combination of fuzzy clustering algorithms and evolutionary algorithms.
Numerous studies have been conducted on CLV, and their authors recommend that researchers work on more measures and consider them comprehensively, or compare various CLV models in a specific industry.
Some researchers used WRFM techniques and clustering algorithms based on customer value to identify loyal and profitable customers; therefore, we recommend that future studies set the weights of the variables automatically.
