Abstract
In marketing, problems such as the increase in customer data, the increase in the difficulty of data extraction and access, the lack of reliability and accuracy of data analysis, the slow efficiency of data processing, and the inability to effectively transform massive amounts of data into valuable information have become increasingly prominent. In order to study the effect of customer response, based on machine learning algorithms, this paper constructs a marketing customer response scoring model based on machine learning data analysis. In the context of supplier customer relationship management, this article analyzes the supplier’s precision marketing status and existing problems and uses its own development and management characteristics to improve marketing strategies. Moreover, this article uses a combination of database and statistical modeling and analysis to try to establish a customer response scoring model suitable for supplier precision marketing. In addition, this article conducts research and analysis with examples. From the research results, it can be seen that the performance of the model constructed in this article is good.
Introduction
With the rapid development and progress of the national economy and science and technology, especially the popularization of computers and the Internet, marketing methods have become more diversified, and customer needs have continued to develop in the direction of individualization and diversification. In this context, database marketing has gradually emerged. Compared with traditional mass marketing methods, database marketing has the advantages of accurate positioning, short response period, and easy measurement. Therefore, proper use of database marketing can lead to higher profits. In addition, database marketing provides unprecedented possibilities for tracking customers, understanding customers, and predicting customer needs due to its powerful data storage and mining functions. Therefore, it has opened up a series of new marketing fields that are difficult to reach by traditional marketing and has made rapid progress in the depth and breadth of marketing activities. The key to database marketing is data, whose quantity and quality are directly related to the success or failure of marketing projects. Using the method of combining database marketing and statistical models to establish a customer’s marketing response model can well solve the problem of low customer response rate in the current marketing activities of enterprises. Moreover, based on the customer database, customers can be accurately located through data processing and model construction, and the most likely customers to buy from many potential customers can be discovered [1].
In a practical sense, as market conditions continue to deteriorate and informatization gradually improves, suppliers have obtained massive amounts of electronic customer data. Understanding and using this customer data plays an important role in the future development of suppliers and their position in the banking industry [2]. However, customer data analysis is currently still at a relatively low level, and CRM has stalled at the stage of collecting and collating customer information. Moreover, the direction of customer data mining (DM) and analysis remains unclear, and data mining cannot yet be applied smoothly within the customer management system. As a direct result, the supplier’s customer relationship management cannot effectively support the supplier’s marketing strategy. With the further exploration of big data and the explosive growth of surface-level customer information, suppliers should focus their research on how to extract and systematically organize effective information, so as to give the marketing department more effective help. This requires customer relationship management to have enough analytical capability to produce a more accurate customer list, so that suppliers can implement customer relationship management measures more effectively [3].
Related work
Compared with traditional marketing methods, database marketing has obvious advantages. The literature [4] pointed out that using customer database for customer segmentation and strategic positioning can significantly improve marketing efficiency and reduce marketing costs. The literature [5] pointed out that some banks and financial service companies only rely on mass marketing strategies to recommend some new products or services to customers, such as through TV, radio, or advertising companies. However, many customers are not interested in this approach and do not respond to such promotions. The literature [6] pointed out that database marketing is to collect and store customer information and business characteristics for a long time, such as the customer’s purchase mode, purchase satisfaction level and so on. Therefore, the database can be used to achieve customer retention, customer activation, promotion of related products, etc. With the rapid development of database marketing, more and more companies want to expand their market through this marketing method. The literature [7] believed that customers’ requirements for service products are becoming stricter, which further stimulates the method of marketing through databases. Moreover, in a business survey, nearly 50% of the companies surveyed plan to increase investment in database development and management in the second year. The literature [8] put forward that more and more companies, especially in financial services, banking, insurance and other industries, use direct selling as their main strategy for communicating with customers. However, direct selling methods are facing severe challenges with the continuous increase in printing and mailing costs and the continuous decrease in response rate. The literature [9] pointed out that the response model is to predict the probability of a customer responding to a promotion. The response model mainly relies on historical purchase data for modeling. 
The use of the model can identify those customers who are more interested in promotions, so that costs can be greatly reduced without reducing the number of purchasers. The literature [10] pointed out that factors such as increasingly saturated markets, rising costs of telephone and direct mail, and increasing customer dissatisfaction with direct marketing methods have prompted businesses to change their current marketing strategies and to select the customers they target more accurately. The method used to establish the marketing response model is logistic regression. Lift tables combined with graphs make customer tracking easy and clearly show the efficiency gain of using the model relative to random selection. The literature [11] pointed out that using data mining technology to analyze and process a large amount of data can reveal hidden rules. In this way, companies can better understand customer characteristics and consumption habits, and can then deliver the best product and service information to appropriate consumers through appropriate channels. The literature [12] believed that data mining can improve targeting by selecting which customers to contact. Data mining is the process of analyzing large amounts of data to explore meaningful patterns and rules. Simply put, data mining refers to extracting or “mining” knowledge from a large amount of data. The literature [13] compared four classification prediction methods: multilayer neural network, Bayesian classifier, logistic regression, and classification tree. In a comparison of accuracy and sensitivity, the classification tree performed best.
The literature [14] introduced the characteristics of database marketing and its advantages over traditional marketing methods, namely that it can select target consumer groups to improve marketing effects. Moreover, because customer information in database marketing is collected by the enterprise itself, the marketing is concealed: it no longer relies on mass media for promotion and is difficult for competitors to discover. This solves the problem, found in the traditional marketing model, of other companies following suit. The literature [15] pointed out that China Resources Vanguard sends more targeted promotion information, customized for different customers, when transmitting marketing information. Moreover, the response rate of this personalized marketing is about 10 times that of ordinary marketing activities.
The literature [16] provided countermeasures to promote database marketing in China based on the current situation of domestic database marketing, namely to increase investment in this area, collect customer information extensively, and analyze customer data in a timely manner. The literature [17] noted that the response rate of precision marketing activities carried out through focused marketing at foreign commercial banks is much higher than that of mass marketing, and that the large number of domestic financial institutions and financial practitioners, together with the intensity of competition, illustrates the necessity of building a precision marketing management mechanism. The literature [18] pointed out that the needs of modern society are increasingly diversified and the needs of consumers increasingly differentiated, which requires the marketing strategy of enterprises to be more targeted to market segments. Moreover, the literature also summarized three ways of precision marketing, namely precision marketing based on the Internet, on databases, and on third-party channels. The literature [19] believed that the response model is the most frequently used predictive model in companies, which can make marketing activities more targeted, thereby reducing costs and increasing the rate of return. However, businesses tend to assume that every responding customer responded because of the marketing activity, whereas in fact most responding customers would respond whether or not the activity took place, so the commonly used methods cannot fully meet the business goals. The literature [20] split the model into cases with and without marketing activities, so that we can judge whether the marketing activities are valuable. The literature [21] applied the logistic regression method to credit card marketing and established a credit card marketing responsiveness model.
The final conclusion is that the success rate of using the model is 6.48 times that of random selection, a substantial improvement with good application effects. The literature [22] mentioned that companies need to locate target users more accurately. Therefore, marketing activities often involve classification and prediction problems. This requires us to establish a response model to predict the probability of a user’s response to marketing activities and to use the response probability to classify the user’s response, thereby reducing the company’s marketing costs and increasing the return on investment. The literature [23] used a logistic regression model to study broker risk detection and calculated the KS statistic, ROC curve, lift value, etc. The literature [24] introduced a variety of credit evaluation methods, including statistical methods such as discriminant analysis, logistic regression, and decision trees, as well as non-parametric and intelligent methods such as neural networks and expert systems. It also compared these methods. From the modeling point of view, statistical methods are based on mathematical statistical analysis and are generally linear models, while neural networks and similar methods are dynamic non-statistical models, which are somewhat harder to interpret but more stable.
Selection and principle of modeling method of response model
The customer response prediction model is now common in corporate marketing because it applies to many scenarios and is relatively mature. Many methods are suitable for building such models, including classification algorithms such as logistic regression, decision trees, neural networks, and random forests. In practical applications, however, we have to choose a modeling method that meets the modeling purpose and requirements and can be completed in the actual environment. Next, we compare the various methods.
The first aspect is data processing. Decision trees, neural networks, and random forests are machine learning algorithms with a high tolerance for data errors when processing massive amounts of data; even when missing values or outliers are severe, the algorithms themselves can cope with them. Logistic regression has shortcomings in this respect. If logistic regression is used, the analyst needs to clean the data carefully in advance, which places relatively high demands on the analyst.
The second aspect is interpretability. Logistic regression establishes a linear relationship between the independent variables and the (logit-transformed) dependent variable. From the model parameters, the direction in which each independent variable acts on the dependent variable can be explained concisely and clearly, which is also convenient for subsequent analysis of the model parameters. Machine learning models such as decision trees explain the relationship between independent and dependent variables through their own mechanisms (for example, splitting rules) rather than through coefficients.
The third aspect is accuracy and stability. According to previous scholars’ experience of building response prediction models on commercial bank data, the accuracy of neural network algorithms is higher than that of decision trees and logistic regression. Logistic regression differs little from decision tree algorithms in accuracy, but it has the best model stability.
In this study, the target variable is whether the customer responds to product marketing, which is a binary variable. We need to build a model to predict this target variable, and on this point alone decision trees, neural networks, and logistic regression can all do the job. However, this article hopes that the final model will not only give a yes-or-no judgment on each customer but also support a deeper analysis of the relationships between variables. On this basis, logistic regression is suitable, because the fitted model is a regression relationship between the independent variables and the dependent variable and can therefore explain how each independent variable relates to the dependent variable. Although data processing for logistic regression is more complicated than for other algorithms, there are ways to handle it that meet the modeling requirements. Meanwhile, in terms of stability, logistic regression is not inferior to the other classification algorithms. On the whole, it is more appropriate to choose logistic regression to build the response model.
Many predictive analysis studies use linear regression. However, many realistic research questions concern categorical variables. For example, the target variable studied in this article is binary: we predict whether customers will actively respond to the marketing of investment and financial products. Many factors may affect whether customers respond to marketing, so we need to study which factors are related to the probability p of the event. It is difficult, however, to model p directly. First, the probability takes values in 0 ⩽ p ⩽ 1, and the relationship between p and the independent variables is non-linear and non-polynomial. Second, when p is close to 0 or 1, small changes in p are difficult to detect and handle.
At this time, it is relatively easy to choose a strictly monotonic function Z = Z(p) of p and work with it instead. We set:

$$Z = \ln\frac{p}{1-p} \tag{1}$$

The above transformation is called the logit transformation. It can be seen from the above formula that as p takes values in 0 < p < 1, Z ranges over (-∞, +∞). Moreover, when p changes slightly near p = 0 or p = 1, Z(p) changes sharply, so Z is very sensitive there, and p as a function of Z is a monotonic S-shaped curve. This solves the above two problems.
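As a minimal numerical sketch of the logit transformation (the function names `logit` and `inverse_logit` are illustrative and do not come from the source), the following Python snippet shows how small steps in p near 0 or 1 turn into large steps in Z:

```python
import math

def logit(p: float) -> float:
    """Map a probability in (0, 1) to the real line: Z = ln(p / (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(z: float) -> float:
    """Map a real number back to a probability: p = 1 / (1 + exp(-z))."""
    # Branching on the sign keeps exp() from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Print Z for a few probabilities, including values close to 0 and 1.
for p in (0.001, 0.01, 0.5, 0.99, 0.999):
    print(f"p = {p:<6} Z = {logit(p):+.3f}")
```

The step from p = 0.99 to p = 0.999 moves Z by far more than a much larger step in the middle of the range, which is the sensitivity near the boundaries that the text describes.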
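When studying the probability of an event, suppose the factors affecting it are $x_1, x_2, \cdots, x_n$. We set:

$$Z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \tag{2}$$

Among them, $\beta_1, \beta_2, \cdots, \beta_n$ are the model parameters and $\beta_0$ is the intercept term.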
Among them, A is the model parameter and B is the intercept term.
Then, the relationship of Z to the independent variables is linear or polynomial. The parameters are estimated by the ordinary least squares method, and the non-linear relationship between the probability p of the event and the independent variables is then established through the logit transformation of formula (1).
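Therefore, the relationship between the probability p of the occurrence of the event and the independent variables is as follows:

$$p = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n}} \tag{5}$$

$$1 - p = \frac{1}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n}} \tag{6}$$

The above formulas (5) and (6) are logistic regression equations. Among them, p in the logistic regression model represents the probability of response, (1 - p) is the probability of non-response, and the ratio p/(1 - p) is the odds of response.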
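The relationship between p and the independent variables can be illustrated with a small sketch. The toy data, the function names, and the learning-rate settings below are all assumptions made for illustration. Note also that the sketch estimates the parameters by maximum likelihood using plain gradient ascent, the conventional choice for ungrouped binary outcomes, rather than by the least squares approach mentioned in the text.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function p = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(beta, x):
    """Response probability; beta[0] is the intercept, beta[1:] the coefficients."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return sigmoid(z)

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit beta by gradient ascent on the average log-likelihood (assumed settings)."""
    n_features = len(X[0])
    beta = [0.0] * (n_features + 1)
    for _ in range(n_iter):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(beta, xi)  # observed minus predicted
            grad[0] += err
            for k, v in enumerate(xi):
                grad[k + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: one feature, 1 = responded, 0 = did not respond.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
y = [0, 0, 1, 0, 1, 0, 1, 1]

beta = fit_logistic(X, y)
print("intercept, slope:", [round(b, 3) for b in beta])
print("p(x = 4.0):", round(predict_proba(beta, [4.0]), 3))
```

The sign of each fitted coefficient gives the direction in which the corresponding variable moves the response probability, which is the interpretability argument made for logistic regression earlier in this section.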
Weight of Evidence (WOE) allows the results of the classification model to be transformed into the form of a scorecard. In this article, each variable is transformed into WOE values; this transformation is also called coarse classification. In the response model, the purpose of the WOE transformation is not to improve the accuracy of the model but to simplify its data processing. Because many variables are usually selected to build a response model, we apply the WOE transformation to the variables before building the model in order to reduce the complexity of model processing.
For category i of a nominal variable, or segment i of a continuous variable after binning, WOE is defined as follows:

$$WOE_i = \ln\frac{py_i}{pn_i} \tag{7}$$

We assume that a good customer is one who purchases products in response to marketing, and a bad customer is one who did not respond to marketing and did not purchase a product.

Among them, we get:

$$py_i = \frac{Ny_i}{Ny_T} \tag{8}$$

And

$$pn_i = \frac{Nn_i}{Nn_T} \tag{9}$$

In formulas (8) and (9), $Ny_i$ represents the number of responding customers in the i-th group, $Ny_T$ represents the number of all responding customers, and $py_i$ represents the proportion of all responding customers who fall in the i-th group. Likewise, $Nn_i$ represents the number of non-responding customers in the i-th group, $Nn_T$ represents the number of all non-responding customers, and $pn_i$ represents the proportion of all non-responding customers who fall in the i-th group. In formula (7), WOE is the logarithm of the ratio of “the proportion of all responding customers who fall in the current group” to “the proportion of all non-responding customers who fall in the current group”.
Formula (7) can be transformed and written as:

$$WOE_i = \ln\frac{Ny_i / Ny_T}{Nn_i / Nn_T} = \ln\frac{Ny_i / Nn_i}{Ny_T / Nn_T} \tag{10}$$

From the transformed formula, we can see that WOE is the ratio of responding to non-responding customers within the group, divided by the same ratio over all customers, with the logarithm then taken. Therefore, WOE can be understood as a measure of the difference between these two ratios. The larger the WOE value, the more the group’s ratio exceeds the overall ratio, and the more likely the customers in this group are to respond to marketing. Conversely, the smaller the WOE value, the further the group’s ratio falls below the overall ratio, and the less likely the customers in this group are to respond to marketing.
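The WOE calculation can be sketched in a few lines of Python. The group counts below are hypothetical, and the function name `woe_by_group` is an illustrative choice; the sketch assumes the definition WOE_i = ln(py_i / pn_i), with py_i and pn_i being each group's share of all responders and of all non-responders.

```python
import math

def woe_by_group(responders, non_responders):
    """Return the WOE of each group.

    responders[i] and non_responders[i] are the counts of responding and
    non-responding customers in group i.
    """
    total_resp = sum(responders)
    total_non = sum(non_responders)
    woe = []
    for n_y, n_n in zip(responders, non_responders):
        if n_y == 0 or n_n == 0:
            # ln(0) is undefined; such groups are usually merged with a neighbour.
            raise ValueError("each group needs at least one responder and one non-responder")
        p_y = n_y / total_resp   # share of all responders in this group
        p_n = n_n / total_non    # share of all non-responders in this group
        woe.append(math.log(p_y / p_n))
    return woe

# Hypothetical counts for a variable binned into three groups.
responders = [20, 50, 30]
non_responders = [400, 400, 200]

woe = woe_by_group(responders, non_responders)
for i, value in enumerate(woe, start=1):
    print(f"group {i}: WOE = {value:+.4f}")
```

In this toy example the first group has a negative WOE because it holds a smaller share of responders than of non-responders, while the third group has the largest WOE and therefore the customers most likely to respond.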
The information value (IV) can be calculated from WOE. Each group corresponds to an IV value. The calculation formula is as follows:

$$IV_i = (py_i - pn_i) \cdot WOE_i = (py_i - pn_i)\ln\frac{py_i}{pn_i} \tag{11}$$

After getting the IV value of each group, we can calculate the IV value of the entire variable by summing over the groups:

$$IV = \sum_{i=1}^{n} IV_i \tag{12}$$

where n is the number of groups of the variable.
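A minimal sketch of the IV calculation follows, assuming the usual definition in which each group contributes (py_i - pn_i) · WOE_i and the variable's IV is the sum of the group contributions. The counts and the function name `information_value` are hypothetical.

```python
import math

def information_value(responders, non_responders):
    """Return (per-group IV contributions, total IV of the variable)."""
    total_resp = sum(responders)
    total_non = sum(non_responders)
    per_group = []
    for n_y, n_n in zip(responders, non_responders):
        if n_y == 0 or n_n == 0:
            raise ValueError("each group needs at least one responder and one non-responder")
        p_y = n_y / total_resp
        p_n = n_n / total_non
        woe = math.log(p_y / p_n)
        # (p_y - p_n) and woe always share a sign, so each term is non-negative.
        per_group.append((p_y - p_n) * woe)
    return per_group, sum(per_group)

# Hypothetical counts for a variable binned into three groups.
iv_groups, iv_total = information_value([20, 50, 30], [400, 400, 200])
print("per-group IV:", [round(v, 4) for v in iv_groups])
print("variable IV:", round(iv_total, 4))
```

Because every group contribution is non-negative, a larger total IV indicates that the grouping separates responders from non-responders more strongly.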
Standardizing the scorecard is the focus and difficulty of the entire modeling process. It is implemented after the logistic regression model is established and tested. It must be based on WOE conversion. At the same time, it uses the weight of evidence of each nominal variable or grouped variable, then quantifies the variable of each grouping with scores, and converts the regression results of logistic regression into scores. In this way, the total score of each customer and the score of each variable data of the customer can be clearly obtained. Its specific conversion process is as follows:
The WOE of the i-th variable $x_i$ can also be expressed as:

$$WOE(x_i) = \delta_{i1} WOE_{i1} + \delta_{i2} WOE_{i2} + \cdots + \delta_{iq} WOE_{iq} = \sum_{j=1}^{q} \delta_{ij} WOE_{ij} \tag{13}$$

The variable $x_i$ has q groups after WOE transformation. $\delta_{i1}, \delta_{i2}, \cdots, \delta_{iq}$ are binary dummy variables whose values are 0 or 1, and $WOE_{ij}$ represents the WOE value of the j-th group of the i-th variable. For example, when the value of $x_i$ belongs to the second group, $\delta_{i1}, \delta_{i2}, \cdots, \delta_{iq}$ take the values 0, 1, ⋯, 0: only $\delta_{i2}$ is 1, and the rest are 0.
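Substituting this into the logistic regression model, it can be expressed as:

$$\ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{n} \beta_i \sum_{j=1}^{q} \delta_{ij} WOE_{ij} \tag{14}$$

Because exactly one $\delta_{ij}$ equals 1 for each variable, the intercept can be spread evenly over the n variables, and the above formula can be re-expressed as:

$$\ln(odd) = \ln\frac{p}{1-p} = \sum_{i=1}^{n} \sum_{j=1}^{q} \left( \beta_i WOE_{ij} + \frac{\beta_0}{n} \right) \delta_{ij} \tag{15}$$

In order to further calculate the score, the response score corresponding to Odd = θ0 is set to Q0 and the response score corresponding to Odd = 2θ0 is set to Q0 + QD0, so that QD0 is the number of points added when the odds double. The customer response score can then be expressed as a linear function of ln(odd):

$$Score = a + b \ln(odd) \tag{16}$$

where a and b are scaling constants. Applying the two conditions above, we calculate:

$$b = \frac{QD_0}{\ln 2}, \qquad a = Q_0 - b \ln\theta_0 \tag{17}$$

Substituting the ln(odd) calculated from equation (15) into equation (16), the score can be calculated for each indicator of the customer. The above is the process of making and standardizing scorecards.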
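The scaling step can be sketched as follows. The sketch assumes the score is linear in the log-odds, Score = a + b · ln(odd), where the constants a and b are names introduced here for illustration, and it determines them from the two conditions in the text (score Q0 at odds θ0, and QD0 extra points when the odds double). All numerical values, coefficients, and function names are hypothetical.

```python
import math

def scaling_constants(q0, qd0, theta0):
    """Solve a + b*ln(theta0) = q0 and a + b*ln(2*theta0) = q0 + qd0 for (a, b)."""
    b = qd0 / math.log(2.0)
    a = q0 - b * math.log(theta0)
    return a, b

def score_from_odds(odds, a, b):
    """Total score for given response odds."""
    return a + b * math.log(odds)

def attribute_points(beta_i, woe_ij, beta0, n_vars, a, b):
    """Points for the group that one variable falls into.

    The intercept beta0 and the constant a are spread evenly over the
    n_vars variables so that the points sum to the total score.
    """
    return b * (beta_i * woe_ij + beta0 / n_vars) + a / n_vars

# Hypothetical scaling: 600 points at odds 0.05, plus 20 points per doubling.
a, b = scaling_constants(q0=600.0, qd0=20.0, theta0=0.05)

# Hypothetical fitted model with two WOE-coded variables.
beta0, betas = -2.0, [0.8, 1.1]
customer_woe = [0.405, -0.693]   # WOE of the group each variable falls into

points = [attribute_points(bi, w, beta0, len(betas), a, b)
          for bi, w in zip(betas, customer_woe)]
ln_odds = beta0 + sum(bi * w for bi, w in zip(betas, customer_woe))

print("points per variable:", [round(p, 1) for p in points])
print("total score:", round(sum(points), 1))
```

Splitting the score into per-variable points is what lets a scorecard show both the total score of each customer and the contribution of each variable.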
According to the needs of enterprise development, suitable marketing products are selected. We take the product as the marketing response target and build a customer response scoring model. Based on the customer behavior data in the training set, the response model is trained to determine whether the customer will respond to the purchase of the product. In addition, we perform predictive analysis on the customers to be predicted according to the rules of the response model and use this more accurate method to finally divide the customers into two categories. Marketers then conduct targeted marketing to precise customers who are predicted to respond to purchase products. In fact, in the modeling process, the factors that have a significant impact on the response target may be positive or negative. In daily publicity and marketing, companies can positively guide positive influence factors and avoid negative influence factors. Based on the parameter results estimated by logistic regression, variables with large parameters will have a greater impact on the customer’s response to buying the product, while variables with small parameters will have a small impact on the customer’s response to buying the product.
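The prediction step described above can be sketched as follows: each customer's WOE-coded variables are passed through the fitted model, and customers are split into predicted responders and non-responders. The coefficients, customer identifiers, and the 0.5 threshold are hypothetical choices made for illustration, not values from the source.

```python
import math

def response_probability(beta0, betas, woe_values):
    """Predicted response probability from a fitted logistic model."""
    z = beta0 + sum(b * w for b, w in zip(betas, woe_values))
    # Stable evaluation of 1 / (1 + exp(-z)) for either sign of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def split_customers(customers, beta0, betas, threshold=0.5):
    """Divide customers into predicted responders (targets) and the rest."""
    targets, others = [], []
    for cust_id, woe_values in customers.items():
        p = response_probability(beta0, betas, woe_values)
        (targets if p >= threshold else others).append((cust_id, p))
    # Most likely responders first, so marketers can work down the list.
    targets.sort(key=lambda item: item[1], reverse=True)
    return targets, others

# Hypothetical fitted coefficients and WOE-coded customers.
beta0, betas = -1.0, [0.9, 1.2]
customers = {
    "C001": [0.8, 0.6],
    "C002": [-0.7, -0.2],
    "C003": [0.4, 1.0],
}

targets, others = split_customers(customers, beta0, betas)
print("target list:", [(c, round(p, 3)) for c, p in targets])
print("not targeted:", [(c, round(p, 3)) for c, p in others])
```

In practice the threshold would be tuned to the marketing budget and the cost of contacting a customer rather than fixed at 0.5.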
After the standard score conversion is completed, each branch institution can adjust the target range of product marketing according to its own marketing status, cost control, and the level of customer scores, and can reuse the model results, which improves the utilization rate of the model by marketers. According to the scores of each group after the discrete grouping of each variable, it can be seen that customers in a group with a high score are more likely to respond and purchase products; on the contrary, customers in a group with a low score are less likely to respond and purchase products.
This article takes the current hot mobile communication industry as an example for research. This article will analyze the competitive landscape of the telecom market in the S region from 2017 to 2019 and the current status of the mobile industry competition in the S region. First of all, this paper collects and organizes the user scale and business revenue data of the three major telecom operators from 2017 to 2019 in the S region city, as shown in Table 1 and Fig. 1:
Table 1. Comparison of business revenue and user scale of the three major telecom operators

Fig. 1. Comparison of business revenue and user scale of the three major telecom operators.
Due to the reduction in communication tariffs, the revenue of telecom operators has fallen. However, in terms of user scale, S region Mobile accounts for a large proportion of mobile users and has obvious advantages over S region Telecom, whose main business is concentrated on fixed telephones. In the above comparison, China Unicom in the S region is in the lowest position. Therefore, in the current telecommunications market in the S region, China Telecom is a strong competitor of S region Mobile.
The distribution of fixed telephone market share in the S region from 2017 to 2019 is shown in Table 2 and Fig. 2. Since mobile companies are not involved in fixed telephone services, this article mainly compares the changes in the market shares of China Telecom and China Unicom. It can be seen from the table that the fixed-line telephone market in the S region is dominated by China Telecom, and China Unicom in the S region occupies only a small market share. S region Telecom has obvious competitive advantages, although its share also shows certain fluctuations. Moreover, due to the development of mobile communications and Internet services, the number of fixed telephone users in the S region is gradually decreasing.
Table 2. Comparison of market share of fixed telephones occupied by the three major telecom operators (%)

Fig. 2. Statistical diagram of comparison of market share of fixed telephones occupied by the three major telecom operators (%).
The comparison of the mobile phone users of the three major telecom operators in the S region from 2017 to 2019 is shown in Table 3 and Fig. 3. In the mobile business market, the entry of China Telecom has increased the competitive pressure on China Mobile and China Unicom. However, in the telecommunications market in the S region, the mobile service market share of S region Mobile is far ahead of that of S region Telecom and S region Unicom. It can be seen from the table that the mobile subscribers of Telecom and Unicom in the S region are increasing year by year, but S region Mobile still dominates the mobile communications market. S region Mobile holds 70–80% of the mobile user market every year and remains the market leader, while S region Telecom and S region Unicom are both followers.
Table 3. Comparison of mobile phones of the three major telecom operators (%)

Fig. 3. Statistical diagram of comparison of mobile phones of the three major telecom operators (%).
Table 4 and Fig. 4 compare fixed phone users and mobile phone users in S region from 2017 to 2019. With the continuous growth of mobile users in the S region and the larger growth rate, the mobile phone market has had a huge impact on the fixed phone market. However, S region Mobile occupies most of the mobile phone market share. Therefore, China Mobile shows a greater competitive advantage in the mobile phone business market.
Table 4. Comparison of fixed phone users and mobile phone users

Fig. 4. Statistical diagram of comparison of fixed phone users and mobile phone users.
ARPU is the average revenue per user, that is, the revenue generated by each user of an operator in a given period of time. The higher the ARPU, the better the profits and returns of the company. From Table 5 and Fig. 5, it can be seen that the ARPU values of S region Telecom and S region Mobile are relatively high and close to each other, with Telecom the highest, so its profitability is still the strongest among the three telecom operators and its development prospects in the S region are better. The ARPU of China Unicom is relatively low. From an overall point of view, however, because telecom tariffs have been lowered, the overall ARPU value in the S region is decreasing year by year, which benefits consumers. At the same time, it also reflects the intensified competition in the telecommunications market in the S region.
Table 5. ARPU value of the three major telecom operators (yuan/month)

Fig. 5. Statistical diagram of ARPU value of the three major telecom operators (yuan/month).
The MOU (minutes of usage) value in the telecommunications market refers to the average monthly talk time per user. The MOU value reflects the output capacity of the three major telecom operators in the S region and the popularity of telecom services. From Table 6 and Fig. 6, it can be seen that China Mobile has the largest annual MOU value, China Unicom’s MOU value is lower than that of China Mobile, and China Telecom’s MOU value is the smallest. This is also related to the main business of the three major telecom operators. Mobile companies mainly operate mobile services, so their MOU value is the largest and the fastest growing, and their profitability is also the strongest. The mobile businesses of S region Telecom and S region Unicom started late, but they also show a gradual growth trend, which will intensify competition in the S region telecom market in the future.
Table 6. MOU value of the three major telecom operators (minutes)

Fig. 6. Statistical diagram of the MOU value of the three major telecom operators (minutes).
In summary, in the S region telecom market, China Mobile’s main competitor is S region telecom. Telecom in the S region has shown strong advantages in terms of corporate profitability and revenue growth, and its share in the mobile market is growing. The prospects for corporate development are better, which poses a greater threat to the development of mobile in the S region. However, S region mobile also has an advantage in profitability and user scale. Especially in the field of mobile market, whether in terms of market share, user scale or brand influence, S region mobile is in an advantageous position with huge development potential.
In project management, organization construction is of great significance to project team management and project implementation. In response to the problems existing in the mobile customer relationship marketing project in the S area, the S area mobile should strengthen the organization and management, especially the project organization and management.
First, managers should change their mindset, strengthen their learning and understanding of project management, clarify the operation process and specific content of project management, keep track of the progress of project implementation, communicate the content and progress of the project in advance, and increase support for the project. Secondly, in terms of management, the manager must give the project team certain management power and work authority, give full trust to the team leader, and clarify the powers and responsibilities of the work team, so as to facilitate the project team’s resource scheduling and work and thereby promote the smooth implementation of the project. Moreover, managers need to establish employee project performance appraisal files, evaluate the work status of employees through project control and project evaluation, and link employee performance to wages. In addition, managers need to set up corresponding rewards and punishments to mobilize the enthusiasm and initiative of employees, urge employees to improve their work ability, and improve the vitality of the project team.
The key to the success of this project lies in the implementation of marketing strategies and service levels. S Region Mobile should strengthen the training of employees’ marketing knowledge and skills, and service levels to ensure the successful implementation of sales strategies and provide customers with high-quality services to improve customer satisfaction and enhance customer loyalty.
The complete project support system is an important guarantee for the implementation of project management. This article proposes corresponding safeguard measures for the system problems existing in the mobile customer relationship marketing project in S area. S Region Mobile should introduce an advanced information management system to collect, track and organize information on the overall status of project operation, so as to form a complete project summary report. Moreover, it needs to implement automated management of the project through the establishment and improvement of an information management system. In addition, the real-time update, reading and sharing of information through the information management system can reduce errors and the complicated work of various departments, facilitate information communication and sharing, and enable managers to grasp the status of project implementation in time, and provide convenience for project decision-making.
In order to maximize the effectiveness of customer information, S Region Mobile needs to establish a database based on the CRM system, unify the information of various departments, and merge and sort objective data that reflects the current status of project implementation. Through effective data and information sharing, S Region Mobile can keep abreast of customer needs and consumption trends in a timely manner, provide consumers with satisfactory humanized services, and let consumers experience the attention and care of the enterprise. Moreover, it can strengthen the connection between consumers and enterprises and improve customer loyalty.
Conclusion
With the rapid development and progress of the national economy and science and technology, especially the popularization of computers and the Internet, marketing methods have become more diversified, and customer needs have continued to develop in the direction of individualization and diversification. To realize the “customer-centric” development strategy, customer relationship management has become very important. It can not only manage bank customers hierarchically, and position the market for different customer levels, but also integrate the relationship between suppliers and customers in an all-round way, maintain existing customers, tap potential customers, and promote customer stickiness and contribution. This will be the basis of all supplier marketing activities. This article proposes to build a customer response scoring model for supplier precision marketing. According to the response customer implementation list predicted by the response scoring model, the work of predicting response customers is transferred to the back-end marketing analysts, which reduces the work pressure of front-end marketers and improves the accuracy of response customer forecasts while also rationally using human resources.
Acknowledgments
Supported by Social Science Foundation of Liaocheng University (321051615).
