Abstract
Frequent e-commerce trading has increased the online transaction volume of Chinese enterprises year by year, yet many enterprises still follow traditional marketing strategies, which is not conducive to their long-term development. In this study, an online precision marketing system model based on big data was built and a Hadoop + MapReduce precision marketing platform was implemented. All data were stored in a distributed storage system and processed with data mining technology to provide a basis for enterprise decision-making. China’s H group was taken as a case study. A “user portrait database” and the corresponding E-R diagram were constructed. The height segmentation factor, which correlates strongly with the product, was selected for cluster analysis, and the products were subdivided according to the clustering results. This study offers a reference for the collection and mining of online data by Chinese enterprises and contributes to their long-term healthy development.
Introduction
In recent years, the development of Internet technology has driven e-commerce activity. People have gradually formed the habit of transacting over the Internet, which has contributed to the vigorous growth of online transactions in China. Although many companies now sell over the Internet, their marketing model differs little from the traditional one: they offer wide product coverage and make no targeted marketing recommendations to high-quality customers [1]. This traditional extensive model has difficulty adapting to the modern market. If it is adopted for a long time, too many product categories increase product design and production costs, reduce profit margins, and make it difficult to retain and develop a loyal customer base, which is not conducive to the long-term healthy development of enterprises. Transactions on Internet platforms produce a large amount of consumer data centered on customers and products. How to collect and process these data, and how to use big data mining to provide a basis for enterprise decision-making and marketing strategy, is the key to modern precision marketing [2]. It is therefore necessary to change the traditional marketing model and achieve one-to-one precision marketing: to target advertising, communicate with customers and focus on them, so as to provide consumers with products or services that meet their needs at the right time, place and channel [3]. Successful precision marketing requires transmitting information effectively and analyzing customers’ psychological needs and behavioral characteristics, so that more accurate marketing tools and programs can be developed.
With the continuous development of Internet technology, the data obtained from the network have become very complex in both type and quantity, so data storage and processing technology must be improved to realize the full value of the data [4]. Data processing technology based on the Hadoop framework can store large amounts of data and support data mining, thereby providing a basis for enterprise decision-making. By constructing a precision marketing model and a big data marketing platform, personalized precision marketing services for customers can be realized, which has good application value for online precision marketing in China. In this study, big data technology was applied to China’s H group: after the corresponding online precision marketing platform was constructed, the enterprise’s data were mined to provide a targeted and precise marketing strategy. The precision marketing system platform based on big data provides a reference for the development of online marketing strategies by Chinese enterprises; it helps to reduce costs, enhance core competitiveness, and support the healthy long-term development of enterprises.
Big data theory and precise marketing theory
Research on big data theory and its application
With the continuous development of Internet technology, the volume of information exchanged keeps growing, and big data has emerged from it. Big data involves managing and processing information at extreme scale along one or more dimensions. Data are collected from multiple Internet platforms and then analyzed and processed [5]. In a specific application, the steps are to collect and manage the information, then use appropriate mathematical algorithms for observation and prediction, and finally apply the results to the optimization and decision-making of a specific business, as shown in Fig. 1. Compared with traditional information processing, big data has several distinctive characteristics. As human society has entered the information age, the amount of data grows geometrically every day; the volume is therefore huge, and it covers a great variety of structured and unstructured data such as text, video and pictures [6]. Big data technology processes these data with various analysis tools to achieve fast and accurate analysis. Analyzing the internal correlations among different types of data can yield very high value [7].

Fig. 1. Big data application steps.

Fig. 2. Big data based on Internet technology.

Fig. 3. User life cycle model.
At present, big data technology has been applied in various industries, has greatly promoted socio-economic development and has, to a certain extent, even changed the way people produce [8]. Mining big data can identify the laws and patterns underlying human social activities and provide a scientific basis for decision support. Applications are especially common in Internet-related industries. In business intelligence, for example, big data technology can be used to develop a complete business decision-making program based on the results of big data analysis [9]. The most successful case is Starbucks during the global financial crisis of 2008, which used data mining to find suitable locations for its stores, launched new business and outcompeted its rivals [10]. In public services, big data can also help the government make macro-level decisions in areas such as earthquake relief and urban management. Marketing was the first industry area in which big data was applied: enterprises analyze data on network user behavior with data mining technology in order to deliver targeted advertising and product recommendations to users [11].
For marketing purposes, obtaining useful information requires analyzing large volumes of fuzzy, random and noisy data in order to mine their potential information and value. Commonly used data mining techniques include cluster analysis, classification algorithms and association algorithms [12]. Cluster analysis, for example, uses consumers’ product-related behavior and characteristics to group consumers; after the customer background is analyzed, the customers’ buying behavior is forecast. Let the dataset to be analyzed be $A = \{x_i \mid x_i \in \mathbb{R}^m,\ i \in \mathbb{Z},\ 1 \le i \le n\}$, and let a matrix $M$ be constructed. In $M$, the $i$-th row corresponds to the clustering result of the data point $x_i$, and the $k$-th column corresponds to the $k$-th class. The clustering result is determined by a rule $\Gamma$, which groups together data with high mutual similarity rather than data with low similarity.
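As an illustration of the matrix M, the following minimal Python sketch assumes a hard assignment, in which the entry in row i and column k is 1 if x_i is placed in class k and 0 otherwise. This 0/1 encoding, the function names and the sample labels are assumptions made for illustration; they are not taken from the study.

```python
from typing import List


def membership_matrix(labels: List[int], n_clusters: int) -> List[List[int]]:
    """Build an n x c matrix of 0/1 entries (assumed encoding).

    Row i describes sample x_i; column k describes class k.
    matrix[i][k] == 1 means x_i was assigned to class k.
    """
    matrix = []
    for label in labels:
        if not 0 <= label < n_clusters:
            raise ValueError(f"label {label} is outside 0..{n_clusters - 1}")
        row = [0] * n_clusters
        row[label] = 1
        matrix.append(row)
    return matrix


def cluster_sizes(matrix: List[List[int]]) -> List[int]:
    """Column sums of the matrix: the number of samples in each class."""
    return [sum(column) for column in zip(*matrix)]


# Hypothetical labels for four samples and three classes.
example_labels = [0, 2, 1, 0]
M = membership_matrix(example_labels, n_clusters=3)
sizes = cluster_sizes(M)
```

Under this encoding each row sums to 1, and the column sums give the number of cases in each class, which is the kind of per-cluster count that a cluster analysis table reports.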
Similarity is a critical measure of the relationships among data. It is analyzed by means of distance, and the Minkowski distance between samples $x_i$ and $x_j$ in the dataset $A$ can be written as

$$a_{ij}(q) = \left( \sum_{l=1}^{m} \lvert x_{il} - x_{jl} \rvert^{q} \right)^{1/q} \qquad (2)$$

In formula (2), $x_{il}$ is the $l$-th component of $x_i$ and $q$ is a positive integer whose value determines the distance type: when $q = 1$, $a_{ij}(1)$ is the absolute distance; when $q = 2$, $a_{ij}(2)$ is the Euclidean distance; when $q \to \infty$, $a_{ij}(\infty)$ is the Chebyshev distance.
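The three special cases of the Minkowski distance can be checked numerically. The sketch below is a minimal Python illustration; the function name and the sample points are assumptions introduced here, and the limiting case q → ∞ is handled by taking the largest coordinate difference.

```python
import math
from typing import Sequence


def minkowski_distance(x: Sequence[float], y: Sequence[float], q: float) -> float:
    """Minkowski distance of order q between two equal-length vectors.

    q = 1 gives the absolute distance, q = 2 the Euclidean distance,
    and q = math.inf the Chebyshev distance (the limit as q grows).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    if q < 1:
        raise ValueError("q must be at least 1")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if not diffs:
        return 0.0
    if math.isinf(q):
        return float(max(diffs))
    return sum(d ** q for d in diffs) ** (1.0 / q)


# Hypothetical two-dimensional samples.
p, r = (0.0, 0.0), (3.0, 4.0)
d_abs = minkowski_distance(p, r, 1)
d_euclid = minkowski_distance(p, r, 2)
d_cheb = minkowski_distance(p, r, math.inf)
```

For these two points the absolute distance is 7, the Euclidean distance is 5 and the Chebyshev distance is 4, so the distance shrinks toward the largest single coordinate difference as q increases.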
Through modeling, data can be classified, analyzed and used for prediction. Large amounts of data are used to classify customer groups, analyze and forecast their behavior, formulate targeted marketing strategies, and identify consumers who are easily lost so that the necessary steps can be taken to develop them into loyal consumers [13]. Commonly used classification algorithms in data mining include decision trees and neural networks.
For an enterprise, users of different types and with different requirements, whether online or offline, differ in how far they accept the enterprise and its products, and this forms a relatively complete user life cycle [14]. Users can develop in three directions: (1) after being acquired, users come to recognize the industry and the enterprise, find that the products meet their needs, and gradually develop into loyal users; (2) users are lost, but some of them are successfully won back through user retention efforts; (3) retention fails and the users are eventually lost. The detailed user life cycle model, covering loyal users, won-back users and churned users, is shown in Fig. 3. Whatever the type, every user passes through three stages of development: user contact, user acquisition and user development. In the contact stage, the user forms an initial understanding of the company’s products and services. Because no purchase has yet occurred, the enterprise needs to analyze potential demand in order to gain insight into what users need. By analyzing recorded data on users’ visits and purchases, the enterprise builds a picture of each user and determines whether the user can become part of its consumer base. The enterprise can then carry out targeted marketing promotions for potential users and provide them with the products they need, so that they can be converted into long-term customers. In the development stage, the most critical task is to turn the user’s purchase intention into buying behavior. Using the user’s purchase history, the enterprise carries out product association analysis to further raise the purchase conversion rate.
Compared with developed countries, China’s research on precision marketing started relatively late, but since the reform and opening up the economy has been developing rapidly. According to existing research, the means of implementing precision marketing fall into three categories: enterprise database technology, Internet technology and third-party channels. The driving force behind the development of precision marketing is innovation in technical means and marketing methods, and the most critical part is Internet-based precision marketing. Market segmentation makes it possible to provide customers with precise marketing and fine-grained management services, and to achieve competitive advantage and the expected revenue. In advertising, consumers’ habitual thinking can be used to identify them accurately, deliver advertisements precisely and customize personalized products and services. From the perspective of marketing development, the aim of precision marketing is to win more customers, improve marketing efficiency and ultimately create more corporate profit. Its core lies in “precision”: big data technology makes it possible to identify more specific target customers, push advertising accurately and produce exactly what customers need. As the economy and the market environment change, marketing methods and ideas also change and need to be adjusted with the times.
Precision marketing based on big data is quantifiable and allows accurate market positioning. This distinguishes it sharply from traditional marketing positioning: it avoids blind advertising, and at least 50% of advertising waste can be avoided. Information technology enables good communication between enterprises and customers, so that enterprises can understand customers’ individual needs in real time, establish close communication channels, build a loyal and stable customer base, and realize a fan economy that supports the long-term healthy development of the enterprise. Modern Internet and database technology can measure and control the inputs and outputs of marketing strategies, avoiding high advertising costs and effectively controlling business costs. Precision marketing with big data allows companies to understand users better and allows users to communicate better with the enterprise, so that marketing gradually evolves from a traditional media orientation into an audience orientation. Once the products are determined, the enterprise carries out dedicated marketing activities for its main products. Users of the same medium can then receive different advertising content, which makes the marketing targeted.
Construction of precise marketing system based on big data
Online precision marketing model construction
In recent years, the development of Internet technology has promoted the rapid growth of e-commerce, and online trade has a very important impact on China’s economy. The first step is to build an online precision marketing model by using big data technology to optimize and improve the traditional marketing model. The process of the big data precision marketing model is shown in Fig. 4. The real estate enterprise gathers market information and feedback from sales personnel through the sales center and the online platform, collects further market information through sales and market research, and classifies the information as structured or semi-structured according to its type. All the collected information is transferred to the customer database, where information analysis and data mining are carried out. The information can be collected through either a free network platform or a platform operated in cooperation with third-party partners. Most of it is online customer behavior information, including the user’s personal information, dwell time, price range and so on. An appropriate algorithm is chosen to mine the information, which makes it possible to analyze customers’ consumption behavior and purchase motivation to a certain extent and thus to push targeted information and services.

Fig. 4. Precision marketing model based on big data.
Data mining based on Internet technology requires an accurate market segmentation model, in which segmentation is carried out along three dimensions: consumers, products and purchasing behavior. The model is shown in Fig. 5. Consumers’ individual characteristics, such as gender, occupation, age, annual income and home address, are highly correlated with their specific consumption behavior. A product’s style, quality, color and brand also directly affect consumption behavior and have a strong bearing on whether a consumer becomes a loyal customer. Buying behavior includes the number of times a product is added to favorites and the consumer’s evaluation of the product after purchase. On online platforms such as Tmall, Jingdong and Taobao, consumers browse and compare the products they intend to buy before purchasing, and both the time they spend on a page and the time of purchase are significantly correlated with their buying behavior. Consumers’ individual characteristics, products and buying behavior are therefore analyzed together: products that match consumers’ spending habits and individual needs are the ones that lead to actual purchases.

Fig. 5. Market segmentation model of precision marketing.
Implementing online precision marketing requires a corresponding market segmentation model. The precision marketing system platform collects and mines the corresponding data so that technology and production costs can be reduced as far as possible. Cloud computing, databases and the Hadoop platform are currently available, and they allow the data to deliver greater value after in-depth mining. For enterprises, data are a very important strategic resource: their analysis directly shapes the business model and provides a basis for business decision-making. Beyond the most basic consumer segmentation, each user needs a personalized characterization, and segmentation factors describe consumers accurately enough to make precision marketing possible. A traditional SQL Server database has difficulty meeting the technical requirements of big data, whereas the more efficient HBase manages and processes such data better and makes it convenient for enterprises to query and mine them. The user portrait component supports accurate data mining and coordinated management across the supply chain departments, from industrial design to production; the model is also used for marketing and for assessing customer loyalty, so as to avoid losing customers. On the basis of the open source Apache Hadoop platform, a big data platform was built and the corresponding precision marketing system platform was proposed, as shown in Fig. 6.

Fig. 6. Construction of the precision marketing system platform based on big data.
One of the most critical features of big data is diversity: both the sources and the structures of the data are diverse, so the data must be preprocessed before they can inform business decisions. Data collection covers Web click streams and related social media data; consumers’ click habits and comments reveal their purchase intentions and their sentiment toward the business. These data can be obtained with a program that fetches web page information, that is, web crawler technology. The program searches for URLs on the Internet and checks whether each one conforms to the URL rules of the target Web site. If it does, the corresponding address is extracted from the page and the XML document is parsed. In addition, companies hold a large amount of structured data of their own, mainly the data generated by online e-commerce transactions. After collection, these data are stored in the company’s own data warehouse, and they are an important data source for big data processing.
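The URL-rule check performed by the crawler can be sketched as follows. This is a minimal Python illustration using only the standard library; it assumes that the URL rule is expressed as a regular expression and that the page is HTML, and the class name, function name, site address and rule are all hypothetical. It parses a page that is already in memory and does not fetch anything from the network.

```python
import re
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page address.
                self.links.append(urljoin(self.base_url, value))


def extract_matching_links(html: str, base_url: str, url_rule: str) -> List[str]:
    """Return the distinct links on the page that satisfy the URL rule."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    pattern = re.compile(url_rule)
    seen = set()
    matching = []
    for link in collector.links:
        if pattern.fullmatch(link) and link not in seen:
            seen.add(link)
            matching.append(link)
    return matching


# Hypothetical page content and rule: keep only product detail pages.
page = (
    '<a href="/item/1.html">A</a>'
    '<a href="https://other.example/x">B</a>'
    '<a href="/item/1.html">A again</a>'
    '<a href="/about">C</a>'
)
rule = r"https://shop\.example/item/\d+\.html"
found = extract_matching_links(page, "https://shop.example/", rule)
```

A real crawler would also need to fetch pages, respect the site's access rules and queue the accepted addresses for later visits; those parts are omitted here.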
The collected data are stored in a distributed storage system, namely the Hadoop Distributed File System (HDFS), running on a large cluster built from low-cost commodity machines. The system uses a master-slave structure in which the master node is the NameNode and the slave nodes are DataNodes. Both kinds of node are Java programs that can run on ordinary computers. After collection, the data are written and read through HDFS, which also provides file access control. Because of this structure, the system can handle extremely large amounts of data and achieve very high throughput. Computation and data processing are carried out with MapReduce, a distributed parallel computing framework, based on Map and Reduce functions written in Java, that runs on a cluster of commodity machines. When large-scale datasets must be processed, HDFS and MapReduce are used to process them and the results are stored in the non-relational HBase database, which gives efficient and fast processing of semi-structured and structured data and helps to control enterprise costs. Traditional structured data can be preprocessed with an ETL tool and then stored in a SQL Server database for use by data query applications.
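The division of work between the Map and Reduce functions can be illustrated with a small in-process sketch. Hadoop MapReduce jobs are written against a Java API and run across a cluster; the Python code below only mimics the three phases (map, shuffle, reduce) on one machine. The record layout `order_id,product_id,quantity` and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_order(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: turn one order record into a (product_id, quantity) pair."""
    fields = line.strip().split(",")
    if len(fields) != 3:
        return  # skip malformed records
    _, product_id, quantity = fields
    try:
        yield product_id, int(quantity)
    except ValueError:
        return  # skip records whose quantity is not an integer


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all values that share the same key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_quantities(product_id: str, quantities: List[int]) -> Tuple[str, int]:
    """Reduce phase: total quantity sold for one product."""
    return product_id, sum(quantities)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a list of records."""
    pairs = (pair for line in lines for pair in map_order(line))
    grouped = shuffle(pairs)
    return dict(reduce_quantities(k, v) for k, v in grouped.items())


# Hypothetical order records.
orders = ["o1,p1,2", "o2,p2,1", "o3,p1,3", "bad line"]
totals = run_job(orders)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can be spread over many machines, which is what allows the framework to scale with the size of the cluster.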
The precision marketing platform based on big data was used to study the online marketing strategy of H group. H group is a well-known Chinese clothing brand company whose business was traditionally based on offline marketing. In recent years, with the development of e-commerce, its online marketing has gradually developed and its online sales have increased year by year, with an average annual growth rate of 19.4% since 2010. The group’s current marketing model has the following problems: (1) most online customer data are collected through Taobao, Jingdong and other platforms, and the group has not built a comprehensive database of its own; (2) the marketing model still follows the traditional strategy and has not achieved precision marketing to customers, so customer loyalty is not high; (3) product design and production are not aligned with the market, which affects product sales and profits and is, to a certain extent, not conducive to the long-term development of the enterprise.
To address these problems, and taking the characteristics of the company into account, a big data platform based on Hadoop + MapReduce was constructed, and online data collection, processing and application were implemented. At present the main online trading platforms are Tmall and Jingdong, so the data are structured; they can be processed simply and imported into SQL Server, where they can be stored and processed rapidly. First, the “user portrait” database was built; its E-R diagram is shown in Fig. 7.

Fig. 7. E-R diagram of the user portrait database.
After initial processing and analysis, the online sales data of H group were stored in SQL Server, and the corresponding data tables and rule tables were constructed. The product table, shown in Table 1, is taken as an example. It has six fields, namely product name, class, price, number, type and color, which were defined on the basis of the product identification information.
Table 1. Product table of the user portrait database
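A product table with the six fields listed above could be declared roughly as follows. The sketch uses Python's built-in `sqlite3` module as a lightweight stand-in for SQL Server; the column names, the column types, the choice of the product number as primary key and the sample rows are all assumptions made for illustration.

```python
import sqlite3
from typing import Dict, List, Tuple


def create_product_table(conn: sqlite3.Connection) -> None:
    """Create a product table with the six fields named in the text."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS product (
            product_number TEXT PRIMARY KEY,   -- assumed to identify a product
            product_name   TEXT NOT NULL,
            product_class  TEXT,
            product_type   TEXT,
            color          TEXT,
            price          REAL CHECK (price >= 0)
        )
        """
    )


def insert_product(conn: sqlite3.Connection, row: Dict[str, object]) -> None:
    """Insert one product using named parameters."""
    conn.execute(
        """
        INSERT INTO product
            (product_number, product_name, product_class,
             product_type, color, price)
        VALUES
            (:product_number, :product_name, :product_class,
             :product_type, :color, :price)
        """,
        row,
    )


def products_by_class(conn: sqlite3.Connection, product_class: str) -> List[Tuple[str, str]]:
    """Return (number, name) for every product in the given class."""
    cursor = conn.execute(
        "SELECT product_number, product_name FROM product "
        "WHERE product_class = ? ORDER BY product_number",
        (product_class,),
    )
    return cursor.fetchall()


# Hypothetical rows in an in-memory database.
conn = sqlite3.connect(":memory:")
create_product_table(conn)
insert_product(conn, {"product_number": "P001", "product_name": "Sample coat",
                      "product_class": "coat", "product_type": "winter",
                      "color": "red", "price": 199.0})
insert_product(conn, {"product_number": "P002", "product_name": "Sample shirt",
                      "product_class": "shirt", "product_type": "summer",
                      "color": "white", "price": 59.0})
coats = products_by_class(conn, "coat")
```

Keeping the product attributes in one table keyed by the product number lets other tables of the user portrait database, such as orders, refer to a product by that key alone.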
Because the H group user portrait database contains many types of data, there are many specific items that could be analyzed. To avoid excessive computation time, only the height segmentation factor, which affects consumer buying behavior and correlates strongly with the product, was mined using Matlab. The cluster analysis results are shown in Table 2.
Table 2. Cluster analysis results
As can be seen from Table 2, clusters 3, 8, 9 and 10 contain relatively few cases; their cluster centers are 85, 70, 60 and 60 cm respectively. Further analysis shows that most of the clothes are concentrated in the range 110 cm to 160 cm. For product classification, 130 cm is therefore used as the dividing threshold: below 130 cm, each 10 cm interval forms one size grade, and at 130 cm and above, each 5 cm interval forms one size grade. This kind of product classification based on big data can reduce returns and greatly reduce production costs.
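The grading rule can be written as a small function. The sketch below is a minimal Python illustration that assumes the grades are aligned to the 130 cm threshold, so that the grades below it are 120 to 130 cm, 110 to 120 cm and so on, and the grades above it are 130 to 135 cm, 135 to 140 cm and so on. The alignment and the function name are assumptions; the text fixes only the threshold and the two interval widths.

```python
import math
from typing import Tuple


def size_grade(height_cm: float, threshold_cm: float = 130.0) -> Tuple[float, float]:
    """Return the [lower, upper) height interval that contains height_cm.

    Below the threshold the grades are 10 cm wide; at or above it they
    are 5 cm wide. Grades are counted outward from the threshold.
    """
    if height_cm <= 0:
        raise ValueError("height must be positive")
    width = 10.0 if height_cm < threshold_cm else 5.0
    steps = math.floor((height_cm - threshold_cm) / width)
    lower = threshold_cm + steps * width
    return lower, lower + width


# Heights on either side of the threshold.
grade_small = size_grade(112)
grade_edge = size_grade(130)
grade_large = size_grade(158)
```

With this alignment a height of 112 cm falls in the 110 to 120 cm grade, 130 cm opens the first 5 cm grade, and 158 cm falls in the 155 to 160 cm grade.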
To achieve online precision marketing, the corresponding data must be collected and mined in the precision marketing system platform according to a market segmentation model, so as to minimize technology and production costs. A big data platform was built on the open source Apache Hadoop infrastructure, the corresponding precision marketing system platform was proposed, the collected data were stored in the distributed storage system, and HDFS and MapReduce were used for data processing. To apply the platform, the online marketing strategy of H group was studied. Taking the characteristics of the company into account, a big data platform based on Hadoop + MapReduce was built for H group and the E-R diagram of the “user portrait” database was drawn. The data tables and rule tables, including the product table, were defined. The height segmentation factor, which correlates strongly with the product, was mined using Matlab. On the basis of the results, the products were subdivided, with 130 cm used as the dividing threshold. However, only the height factor was analyzed in this study, so the data analysis is not yet comprehensive.
