Abstract
The optimal development of third-party payment platforms presents some very complex problems. One of the key issues is to identify the most promising potential customers. Hence, customer mining has come to prominence in financial research. This paper describes a way of modeling a third-party payment system in the context of a rough complex network, underpinned by rough set theory. The study of knowledge discovery on rough complex networks provides a quantitative and actionable method, which can be used to mine potential customers in rough complex networks. In addition to developing a way of analyzing data on a third-party payment platform, the paper uncovers a new application area for rough set theory, pointing the way to further utilization of this technology.
Introduction
A third-party payment system acts as an intermediary between a set of businesses selling a range of products and a set of customers who browse, collect and purchase those products. The elements of this system form a highly complex and irregular network in which some nodes are connected to many others and each connection carries a large quantity of information. Analysis of the whole network, for example to select particular groups of customers, requires the processing of huge amounts of data. These data may be incomplete, and different customers may share the same characteristics, which makes them indistinguishable from each other. In this paper, we propose the use of rough set theory to reduce the amount of information to be processed and to deal effectively with incomplete and indistinguishable data. We illustrate the approach with an example of marketing to selected customers in what we call a third-party payment rough complex network.
In such a network, a platform manager, aiming to attract customers, will often send promotional information to all registered members. However, there is considerable uncertainty as to who are the most appropriate customers for which type of advertisement. Thus, the platform manager might hope to mine a group of potential customers, for which previous attempts at marketing have achieved some success. However, the selection of strategies for identifying customers is often based on subjective experience and judgment rather than objective analysis.
In recent years, with the rapid development of computing and increasing knowledge of computer networks, the scientific community has found examples of real networks, such as random networks and irregular networks, that are known as “complex networks” [7, 22]. The literature [6, 15] notes that complex networks typically have the features of evolution, dynamic complexity, diversity and multi-node complex integration. Complex networks can also be characterized by a range of properties such as self-organization, self-similarity, small-world behavior and scale-free degree distributions. Although complex networks are now quite well known, there is not yet a precise, generally agreed definition [16, 21]. A large number of articles published in leading international journals show that complex networks are a growing area for research [2, 19]. This research has so far focused on the structures, properties and evolution of the networks; physical transmission on the networks; and the use of network structure for the control and optimization of systems. Other studies cover the modeling of complex networks, synchronization, control, game theory, communications, the importance of indexing the nodes, transmission dynamics and robustness analysis [2, 24]. Despite this interest, there is currently very little literature on knowledge discovery in rough complex networks. The literature described in [3, 23] shows that web-based data mining and knowledge representation, as a new theme and research field within data mining and data warehousing, has not yet formed a mature theory and technology.
Since the information in a third-party payment platform is generated in the transactions and the transaction medium consists of a complex information network involving uncertainties and indistinguishability, this paper constructs a third-party payment rough complex network. This is done by studying the statistical characteristics of rough complex networks to build a knowledge discovery model for the mining of potential customers.
The paper is structured as follows. Section 2 introduces the principles of rough set theory and defines a range of terms that are used in Section 3 to develop an illustrative model of a third-party payment system, based on rough set theory. Section 4 reviews the effectiveness of the model and concludes by indicating how the paper contributes to knowledge of rough set theory applied to complex networks.
Rough set theory and complex networks
Overview of rough set theory
Rough set theory is a relatively new and increasingly popular approach to data analysis and data mining [12]. In conventional set theory, membership of a set is either true or false, and in fuzzy set theory it is expressed as a degree of membership between 0 and 1. In rough set theory, by contrast, objects are grouped by being indistinguishable in terms of the currently known values of their attributes. Elements are similar without necessarily being identical. Hence, this approach is well suited to dealing with situations where there is incomplete information. Objects that the known data make certain to belong to the set form the “lower approximation” of the set, and those that may belong to the set form its “upper approximation”. The boundary region is the difference between the upper and lower approximations; a set whose boundary region is empty is crisp [1]. In the example of this paper, the information is derived from the complex network of the third-party trading platform as described in the next section.
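The lower and upper approximations described above can be illustrated with a minimal sketch. It assumes that two objects are indiscernible exactly when all of their known attribute values are equal; the customer names, attributes and target set are hypothetical and are not taken from the platform data.

```python
from collections import defaultdict


def approximations(objects, target):
    """Return (lower, upper, boundary) approximations of `target`.

    `objects` maps each object name to a tuple of known attribute values.
    Objects with identical tuples are treated as indiscernible (assumption).
    """
    # Group objects into indiscernibility classes by their attribute tuple.
    classes = defaultdict(set)
    for name, attrs in objects.items():
        classes[attrs].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # every member is in the target: certain
            lower |= block
        if block & target:    # at least one member is in the target: possible
            upper |= block
    return lower, upper, upper - lower


# Hypothetical customers described by (age band, spending band).
customers = {
    "c1": ("young", "high"),
    "c2": ("young", "high"),
    "c3": ("young", "low"),
    "c4": ("old", "low"),
    "c5": ("old", "low"),
}
buyers = {"c1", "c2", "c4"}  # hypothetical target set X

lower, upper, boundary = approximations(customers, buyers)
```

Here c4 and c5 have the same known attributes but only c4 is a buyer, so both fall in the boundary region rather than in the lower approximation.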
Formal introduction to rough complex networks
We model trading on the third-party payment platform as a rough complex network composed of businesses, goods for sale and customers. The businesses are represented as nodes and the URL links are the edges. Customer visits to a node are uncertain, and the nodes that a customer connects to can be indistinguishable from one another, so third-party payment platform trading networks can be treated as rough complex networks. This section defines several characteristics of a rough complex network that are used later to develop the third-party payment model.
In the third-party payment platform based on rough networks, businesses and commodities constitute the universe U of the knowledge base, and X ⊂ U. R is a decision relation on U, where R = {buy, collect, browse}.
The network’s degree distribution p(k) is defined as the proportion of nodes in the network whose degree equals k.
Here N represents the total number of network nodes, ω_i is the weight of node i, and k_i is the degree of node i. This can overcome the problems described in reference [6].
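As an illustration of the definition of p(k), the following sketch computes the degree distribution of a small undirected graph. The edge list is hypothetical, and node weights are not modeled; only the unweighted proportion of nodes with degree k is computed.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: p(k)} for an undirected graph given as (u, v) pairs.

    Only nodes that appear in at least one edge are counted, so isolated
    nodes are ignored in this sketch.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                    # N: number of nodes seen
    counts = Counter(degree.values())  # number of nodes with each degree k
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical network: two businesses (B1, B2) linked to four commodities.
edges = [("B1", "g1"), ("B1", "g2"), ("B1", "g3"), ("B2", "g3"), ("B2", "g4")]
p = degree_distribution(edges)
```

In this toy network three of the six nodes have degree 1, two have degree 2 and one has degree 3.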
The number of adjacent edges
Let
Let
In rough complex networks the search for key nodes corresponds to a search for the shortest path through the network that connects the nodes.
A scatter plot of the degree probability distribution in double logarithmic coordinates showed that the points lie near a straight line with a negative slope. Hence, it appears reasonable to represent the degree probability distribution as a power function with a negative exponent.
A significance test showed that the regression model and its coefficients were significant. However, because there are few data points, the goodness of fit cannot be estimated precisely. Nevertheless, the lower approximation of the third-party payment rough complex network appears to follow a power-law distribution, as does the upper approximation.
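The straight line in double logarithmic coordinates suggests one simple way to estimate the exponent. The sketch below fits p(k) ≈ c · k^(−γ) by ordinary least squares on (log k, log p(k)). This is a simplified stand-in and not necessarily the regression procedure used in the paper; the data points are hypothetical and lie exactly on a power law so that the result is easy to check.

```python
import math


def fit_power_law(p):
    """Fit p(k) ~ c * k**(-gamma) by least squares in log-log coordinates.

    Returns (gamma, c, r_squared). Points with k <= 0 or p(k) <= 0 are
    skipped because their logarithm is undefined.
    """
    points = [(math.log(k), math.log(pk)) for k, pk in p.items() if k > 0 and pk > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return -slope, math.exp(intercept), 1 - ss_res / ss_tot


# Hypothetical points generated exactly from 0.5 * k**-2 (not real platform data).
p = {k: 0.5 * k ** -2 for k in (1, 2, 4, 8)}
gamma, c, r2 = fit_power_law(p)
```

With only a handful of distinct degrees, as in the example network, the coefficient of determination from such a fit should be read with caution.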
Literature [6] pointed out that the higher the γ of the power-law distribution, the more uneven is the degree distribution of the network, and the nodes having large degrees will be very prominent. Statistical analysis shows that when the average path and the actual average path are similar, the network search will be more efficient. However, if some nodes have a weighted degree that is much larger than the average of the network, this will greatly increase the search space and reduce the data search speed. Consequently, if all nodes whose weighted degree is greater than the weighted average of the network are used as the initial set for mining customers, the workload and operating costs of the platform will increase. The problem of mining customers is therefore one of the key issues to address in improving the effectiveness of the platform.
The definitions above are now used to develop an illustrative practical application of rough set theory to a third-party trading platform. Our model is built upon the following characteristics: (1) there are a total of m businesses on a trading platform, and business i sells V_i similar commodities; (2) customers browsing a commodity tend, with high probability, to choose commodities within the same business, and have a smaller probability of selecting commodities from a different business; (3) businesses can share information with each other; (4) when customers visit a commodity page, they may browse, buy or collect with a certain probability; (5) before they buy, customers first collect commodities and then choose whether or not to buy them.
We collected data on Apple, Meizu, Xiaomi, Huawei, Samsung and other brands from 4 businesses selling 78 commodities. The resulting third-party payment rough complex network is shown in Fig. 1, where m = 4 and each V_i (i = 1, 2, 3, 4) is an integer between 20 and 30. Here, the small nodes represent commodities for sale by the businesses and the big nodes are the four businesses. The small nodes connected to the businesses have been purchased by customers; they represent the lower approximation network. The remaining commodities have been collected but not bought by customers; they represent the boundary domain. The lower approximation network and the boundary domain together make up the upper approximation network. Some edges are not connected to nodes; these represent customers who browse without collecting commodities. For each node we obtained the price, sales, collection popularity index, upper approximation degree, lower approximation degree and node pricing weight. The degrees of the lower approximation rough network are shown in Table 1. The node degrees and node weights of the upper approximation rough network are shown in Table 2.
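The assignment of commodities to the lower approximation, the boundary domain and the upper approximation can be sketched as follows. The record layout (`sales` and `collections` counts per commodity) and the commodity names are hypothetical; the rule itself follows the description above, in which purchased commodities are certain members and commodities that are collected but not bought are possible members.

```python
def partition_commodities(records):
    """Split commodities into (lower, boundary, upper) sets.

    `records` maps a commodity name to a dict with 'sales' and
    'collections' counts (a hypothetical layout for illustration).
    """
    # Purchased at least once: lower approximation.
    lower = {name for name, r in records.items() if r["sales"] > 0}
    # Collected but never purchased: boundary domain.
    boundary = {
        name
        for name, r in records.items()
        if r["sales"] == 0 and r["collections"] > 0
    }
    # Upper approximation is the union of the two.
    return lower, boundary, lower | boundary


# Hypothetical commodity records, not taken from the collected data.
records = {
    "phone_a": {"sales": 120, "collections": 300},
    "phone_b": {"sales": 0, "collections": 45},
    "phone_c": {"sales": 0, "collections": 0},  # browsed only
}
lower, boundary, upper = partition_commodities(records)
```

A commodity that has only been browsed, such as `phone_c` here, belongs to neither set.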

Fig. 1. The third-party payment rough complex network.
Table 1. Probability distribution of lower approximation degree
Table 2. Nodes and weight of the third-party payment upper approximation rough complex network
Based on nonlinear regression analysis, from Table 1, we determine the lower approximation degree distribution as:
From Tables 2 and 3, we determine the upper approximation degree distribution as:
Table 3. Probability distribution of upper approximation degree
Then we follow the steps below to complete the rough complex networks knowledge discovery process:
Table 4. Node information of upper approximation rough complex networks
In this example, the key nodes = {Samsung Note 4, Xiaominote, HuaweiP8, Huawei Smooth play Mobile, Kupai8270L, kupaiF1 plus, vivo x5 F, Huaweip7, Rongyao6, Meizumeilan, Meizum5 China Unicom version, Samsung 3 s, KupaiF1, Huawei6, HongmiNote, Rongyao4x, Samsung 4 s, HuaweiGX1, Huaweip6, Xiaomi4, Appleip4, MeizuM5, Ipad mini, Samsung 5 s, Lianxiang s898T, Hongmi3}.
For example, Table 4 gives the information collected on Meizumeilan, which we use to illustrate the efficiency of the algorithm on the network nodes. Monthly sales for Meizumeilan totaled 9,680 items and its collection popularity index is 11,681, both of which are relatively large. We can now randomly sample 100 customers as a group and then find the sub-groups to be stored in the Star customer set.
In the network, we search for the buyers as shown in Fig. 2.
From this schematic view, we can obtain each buyer’s name and, from the platform managers, the buyer’s aggregated purchase total over four periods. We then calculate the total value of each Star customer over the four buy and collect periods.
For example, Table 5 shows the quarterly consumption of the account lixia1210 over four years.

Fig. 2. Meizu Meilan buyers schematic diagram.
Table 5. Lixia1210 four-year quarterly consumer records (Unit: hundred Yuan)
Because each customer’s purchase history data is only available to the platform manager and the customer themselves, here we can only show the operation of the process.
According to the time series analysis, we calculate b0 and b1 as follows:
In the formula, T_t is the time series value for period t and n is the number of periods.
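A minimal sketch of a linear trend calculation is given below. It assumes the standard least-squares trend line b0 + b1 · t over periods t = 1, …, n, with b1 = (Σ t·T_t − (Σ t)(Σ T_t)/n) / (Σ t² − (Σ t)²/n) and b0 equal to the mean of T_t minus b1 times the mean of t. The quarterly values are hypothetical and exactly linear; they are not the records of any real account.

```python
def linear_trend(values):
    """Least-squares trend line b0 + b1 * t for periods t = 1..n.

    `values` holds the observed time series value for each period.
    """
    n = len(values)
    ts = range(1, n + 1)
    sum_t = sum(ts)
    sum_y = sum(values)
    sum_ty = sum(t * y for t, y in zip(ts, values))
    sum_tt = sum(t * t for t in ts)
    b1 = (sum_ty - sum_t * sum_y / n) / (sum_tt - sum_t ** 2 / n)
    b0 = sum_y / n - b1 * sum_t / n
    return b0, b1


# Hypothetical consumption for 16 quarters (four years), exactly linear.
values = [10 + 0.5 * t for t in range(1, 17)]
b0, b1 = linear_trend(values)

# Projected trend values for the four quarters of the fifth year (t = 17..20).
forecast = [b0 + b1 * t for t in range(17, 21)]
```

Real purchase histories are not exactly linear, so the fitted line gives only the trend component of each customer's consumption.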
For example, the linear trend component of customer lixia1210 is:
Thus, the projected customer quarterly trend values in the fifth year are shown in Table 6.
Table 6. Quarterly trend values of lixia1210 in the fifth year
For example, the base price index for customer lixia1210
This example can only illustrate the effectiveness of the method. In a real network, data collection would have to be supported by the platform managers, in which case it would be possible to carry out the whole process of knowledge discovery. Finally, the overall computational complexity of the algorithm can be evaluated as O(n_1 + n_2 + (c + 2d) · (min{100, n_2})^2).
The problem of mining potential customers is a very important research topic in complex networks. Current research has made some progress, but a mature theory and technology have not yet formed. Furthermore, the existing literature suggests that little research has addressed knowledge discovery in rough complex networks. This paper proposes a third-party payment system based on rough complex networks and develops a third-party payment platform knowledge discovery model to solve the problem of mining potential customers.
The paper first defines a number of characteristics of rough complex networks. We show that the lower and upper approximation degrees obey a power-law distribution. We then establish a knowledge discovery model for customer mining on the third-party payment platform, based on the power-law distribution of the network. The numerical simulation shows that the proposed customer mining method is effective and feasible. Compared with current methods for third-party payment platform operations management, the knowledge discovery model in this paper is superior in the following three aspects: (1) selecting key nodes with relatively large degrees is more efficient than selecting key nodes at random; (2) the Star customer set algorithm applies attribute reduction and decision rule extraction for data preprocessing, which reduces the search space and improves the data search speed and the accuracy of the results; (3) using time series analysis to find potential customers is underpinned by a rigorous mathematical framework and is adapted to the needs of dynamic knowledge system updates.
Therefore, the knowledge discovery methods developed in the paper meet the requirements of a third-party payment platform and are a feasible approach to solving the problem of mining for potential customers. In addition to developing a new way of analyzing data on a third-party payment platform, the paper uncovers a new application area for rough set theory, pointing the way to further utilization of this technology.
Acknowledgments
This work was supported by the Shaanxi Provincial Natural Science Basic Research Program (Key, 2015JZ010); the Shaanxi Provincial Education Department science study plan project (16JK1369); and the Xi’an science & technology association decision-making advisory issue (201517).
