Financial resource integration algorithm of virtual enterprise in big data environment

Abstract

Current resource integration algorithms do not consider users’ needs, which can lead to a high service-level agreement violation rate and poor data quality after integration, and in turn affects the energy consumption and service quality of the data center. To address this problem, a financial resource integration algorithm for virtual enterprises based on an improved artificial bee colony in the big data environment is proposed in this paper. An improved PageRank algorithm is used to extract the financial resources of the virtual enterprise, and the extracted resources are transformed into a unified format. From the unified data resource set obtained after transformation, service resources that satisfy users’ needs and constraints are selected and combined. An improved artificial bee colony algorithm is applied to dynamically integrate service resources for different needs. Experimental results show that the proposed algorithm effectively reduces the energy consumption of the data center and improves data quality and user service satisfaction, which verifies its advantages and feasibility for integrating virtual enterprise financial resources in the big data environment.

Keywords

Big data environment; Virtual resource; Enterprise financial resource; Resource integration; PageRank algorithm

1 Introduction

With the rapid development of social networks, the mobile Internet, the Internet of Things, and related technologies, data is growing at an unprecedented speed. How to manage and utilize these data resources has become a hot research issue [1]. In the big data environment, a large number of Internet users use cloud services and generate a large amount of virtual data. These data reflect the daily operation of the virtual enterprise, record a large number of business interactions and financial data, and form a pool of data resources with unlimited potential for development [2]. In the era of big data, data resources are as important as material and energy resources, and have become an important strategic resource of modern enterprises. As a key element of modern enterprise management, the notion of data resource is a new understanding and a high-level generalization of the state of data from the resource point of view [3]. How to integrate and reorganize massive data resources is the key to successfully utilizing and sharing them. Building a complete financial resource integration mechanism for virtual enterprises therefore has important theoretical and practical significance for enhancing the decision-making ability and overall competitiveness of enterprises [4].

For resource integration in the big data environment, a virtual resource integration algorithm based on online migration of virtual clusters is proposed in the literature [5]. In this algorithm, the system is first modeled and the integration of virtualized resources in the big data environment is described. An area division method is used to classify heterogeneous resources, which initially reduces the scale and difficulty of the problem. On this basis, an FFD_grp algorithm for virtual cluster migration in isomorphic subdomains is proposed. In the literature [6], a resource integration algorithm based on phase space reconstruction is proposed. The phase delay time and the embedding dimension are calculated by the mutual information method and the Cao method, respectively. Based on information entropy, an adaptive weighted fusion estimation method is proposed to improve the fusion of the objective function, and the social cognition optimization algorithm is used to determine the weight coefficient of each information source and achieve massive data integration. In the literature [6], a resource integration algorithm based on a deep learning model is proposed. In this algorithm, the feature extraction model CNNM is trained at the sink node. Each terminal node extracts raw data features through CNNM and sends the fused data to the sink nodes. These algorithms do not consider users’ needs, which easily causes a high service-level agreement violation rate and poor quality of the integrated data, and affects the energy consumption and service quality of the data center.

For the above problems, a financial resource integration algorithm of virtual enterprise based on improved artificial bee colony in big data environment is proposed in this paper. The structure of the proposed algorithm is as follows.

The financial resource acquisition mechanism of virtual enterprise is divided into data resource extraction mechanism, transformation mechanism, and clustering mechanism respectively. It aims to bring together a large number of chaotic, disordered, heterogeneous financial resources of virtual enterprise, in order to orderly store financial resources in a virtual enterprise financial platform.

The relationship between financial resources of virtual enterprise is mined and analyzed. The improved artificial bee colony algorithm is proposed to dynamically integrate the financial resources of virtual enterprises to meet the dynamic needs of resource users in real-time.

Experimental results and analysis. Experiments verify the superiority and feasibility of the proposed algorithm in the financial resource integration of virtual enterprises in big data environment. The improvement of the current algorithms is also discussed.

2 Material and methods

2.1 Resource extraction

Because the financial resources of a virtual enterprise are massive, noisy, and originate from different platforms, a data resource extraction mechanism is designed to provide data sources for financial resource transformation.

Assume D = {d₁, …, d_n} is the original dataset of virtual enterprise financial resources. The representation coefficient λ is the degree to which two data items in D can represent each other. Whether data d_i can represent data d_j depends on their similarity and the given representation coefficient λ. The similarity of data d_i and data d_j is calculated by $sim (d_{i}, d_{j}) = \frac{\sum_{k = 1}^{n} w_{k} (d_{i}) \times w_{k} (d_{j})}{\sqrt{(\sum_{k = 1}^{n} w_{k}^{2} (d_{i})) \times (\sum_{k = 1}^{n} w_{k}^{2} (d_{j}))}}$ (1)

If sim (d_i, d_j) ≥ λ, d_i can represent d_j to degree λ, denoted as rep_λ (d_i, d_j) = 1. If sim (d_i, d_j) < λ, d_i cannot represent d_j to degree λ, denoted as rep_λ (d_i, d_j) = 0. The representation set of d_i is the set of data in the original dataset D whose similarity to d_i is at least the given representation coefficient. For given d_i and λ, the representation set is $D_{i}^{λ} = \{d_{k} \mid sim (d_{i}, d_{k}) \geq λ, d_{k} \in D\}$ .
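The similarity in Eq. (1) and the representation set can be illustrated with a minimal sketch. It assumes each data item is already encoded as a numeric weight vector; the function names `sim` and `representation_sets` and the sample vectors are hypothetical and only serve the illustration.

```python
import math

def sim(wi, wj):
    """Cosine similarity of two equal-length weight vectors, following Eq. (1)."""
    dot = sum(a * b for a, b in zip(wi, wj))
    norm = math.sqrt(sum(a * a for a in wi) * sum(b * b for b in wj))
    # Assumption: an all-zero vector is treated as similar to nothing.
    return dot / norm if norm else 0.0

def representation_sets(vectors, lam):
    """For each index i, return the indices k with sim(d_i, d_k) >= lam."""
    n = len(vectors)
    return [{k for k in range(n) if sim(vectors[i], vectors[k]) >= lam}
            for i in range(n)]

# Hypothetical weight vectors for three records (illustrative data only).
D = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
print(representation_sets(D, 0.7))
```

With λ = 0.7 the middle record is similar enough to both of the others to appear in every representation set, while the first and last records do not represent each other.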

In big data environment, representative data resources extracted from virtual enterprise financial data need to reflect the vast majority of the original data set, and the content redundancy of the dataset itself is as small as possible. Original dataset and representative dataset meet the following relationship. The representative data set R covers the content of the original dataset D, and the similarity of R with D is the largest. The information redundancy in the representative dataset R is minimal, that is, the similarity between the information in the R is small enough. The extraction process of representative data resources is described as

$\begin{matrix} \text{Find } R \text{ s.t.} \\ R \subseteq D \\ \max (data\_coverage (R, D)) \quad \text{(maximum coverage of the original dataset)} \\ \min (data\_redundancy (R, D)) \quad \text{(minimum redundancy of the representative dataset)} \end{matrix}$ (2)

To find the optimal representative dataset of virtual enterprise financial resources in the big data environment, a globally optimal solution is needed. The traditional greedy algorithm [8] pursues locally optimal choices and cannot take overall optimization into account, so its results are poor. In this paper, a heuristic algorithm based on the PageRank algorithm [9] is proposed to find the representative dataset. The idea of this algorithm is as follows.

An initial weight, the PageRank value, is assigned to each node. The PageRank value of node v_i is denoted as P (i). A directed edge from node v_i to node v_j is regarded as a vote cast by v_i for v_j. If C (i) is the number of directed edges leaving node v_i, the contribution of v_i to v_j is P (i)/C (i). In each round, every data item in the original dataset D votes for all the representation sets that contain it, and the data item whose representation set receives the most votes is added to the representative dataset R. The steps of representative information extraction based on PageRank are as follows.

(1) Calculating the representation sets. From the original dataset D = {d₁, …, d_n}, the similarity matrix M is calculated. Given the representation coefficient λ, every entry of M that is larger than or equal to λ is set to 1 and every entry smaller than λ is set to 0, which yields the thresholded similarity matrix M_λ. The representation set $D_{i}^{λ}$ of every data item d_i is read off from M_λ.

(2) Counting the votes of the representation sets. The initial PageRank value of each data item d in D is defined as 1. If d_j is in $D_{i}^{λ}$ (rep_λ (d_i, d_j) = 1), d_j is qualified to vote for $D_{i}^{λ}$ . The number of representation sets in which d_j appears is denoted as n_j. The vote cast by d_j for each representation set in which it appears is then ${vote}_{j}^{i} = 1 / n_{j}$ . Finally, the votes of the representation set $D_{i}^{λ}$ in each round are counted as the sum of the votes cast by all d_j.

$\begin{matrix} {vote}_{i} & = & \sum_{d_{j} \in D} {vote}_{j}^{i} \times {rep}_{λ} (d_{i}, d_{j}) \\ & = & \sum_{d_{j} \in D} {rep}_{λ} (d_{i}, d_{j}) / n_{j} \end{matrix}$ (3)

When the number of representation sets is large, this counting is slow and energy-inefficient. To address this problem, the MapReduce model is introduced to speed up the algorithm.

(3) Redundancy optimization. According to the votes of each representation set, the representation sets whose votes lie in $[α \cdot \max_{i} ({vote}_{i}), \max_{i} ({vote}_{i})]$ are found. Among them, the data item d_j with minimum redundancy is selected and added into R. d_j satisfies the following conditions. ${vote}_{j} \in [α \cdot \max_{i} ({vote}_{i}), \max_{i} ({vote}_{i})]$ (4) $d_{j} = \arg\min_{d_{j}} (\frac{1}{| R |} \times \sum_{d \in R} sim (d_{j}, d))$ (5)

(4) Remove the d_j added into R and the data it represents to obtain the new dataset D and the new representation sets $D_{i}^{λ}$ . Repeat steps (2) and (3) until no more data needs to be added into R.
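The voting and redundancy steps above can be sketched as a single greedy loop. This is only an illustration under stated assumptions: the similarity matrix `S` is symmetric with diagonal entries of at least λ, the default `alpha=0.9` is an arbitrary choice, ties are broken by higher vote and then lower index, and the MapReduce parallelization is not modeled. The function name `extract_representatives` is hypothetical.

```python
def extract_representatives(S, lam, alpha=0.9):
    """Greedy vote-based selection over a similarity matrix S (list of lists)."""
    remaining = set(range(len(S)))
    R = []
    while remaining:
        # Representation set of each remaining item, restricted to remaining items.
        rep = {i: {j for j in remaining if S[i][j] >= lam} for i in remaining}
        # n_j: number of representation sets in which item j appears.
        n = {j: sum(1 for i in remaining if j in rep[i]) for j in remaining}
        # Eq. (3): item j gives 1/n_j to every representation set containing it.
        vote = {i: sum(1.0 / n[j] for j in rep[i]) for i in remaining}
        top = max(vote.values())
        # Eq. (4): keep the candidates whose votes are within [alpha*top, top].
        candidates = [i for i in remaining if vote[i] >= alpha * top]

        # Eq. (5): mean similarity to the items already in R (0 if R is empty).
        def redundancy(i):
            return sum(S[i][r] for r in R) / len(R) if R else 0.0

        best = min(candidates, key=lambda i: (redundancy(i), -vote[i], i))
        R.append(best)
        # Step (4): drop the chosen item and everything it represents.
        remaining -= rep[best] | {best}
    return R
```

For three items where only the middle one is similar to both neighbors, the loop returns the middle item alone, since it covers the whole dataset in one round.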

2.2 Resource transformation and clustering

Virtual enterprise financial resource transformation refers to the process of unifying the original heterogeneous data. The financial resources of virtual enterprises are distributed in different storage centers, and the storage types and formats of the data differ from center to center. In order to achieve data interaction and sharing, the financial resource format of the virtual enterprise is unified. By using the ontology description language RDF, a data transformation mechanism for virtual enterprise financial resources is constructed [10]. Users are then provided with a transparent and unified resource service pool. The transformation of virtual enterprise financial resources is described by the following relationship. $f (MDataB, MDataA, r, DataB) \to DataA$ (6) where MDataB is the financial local data model of the virtual enterprise, MDataA is the financial global data model, DataB is the dataset of MDataB, r is the transformation rule from local data DataB to global data DataA, and f is the transformation mapping from MDataB, MDataA, DataB, and r to DataA.

The data transformation and mapping technology [11] is used to build data sources of financial members of virtual enterprises. From the local model to the mapping rule library of the global model, the data aggregation of financial members in virtual enterprises is realized. It is convenient for management while providing users with unified data interface services.
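The mapping of Eq. (6) can be pictured with a small sketch. It does not use RDF; instead it assumes, purely for illustration, that the rule set r is a dictionary from local field names to a global field name and a converter, and that DataB is a list of dictionaries. The names `transform`, `rules`, and the sample fields are hypothetical.

```python
def transform(local_rows, rules):
    """Apply the rule set r to DataB (local rows) and return DataA (global rows)."""
    global_rows = []
    for row in local_rows:
        # Each rule maps one local field to (global field name, value converter).
        global_rows.append({g: conv(row[l])
                            for l, (g, conv) in rules.items() if l in row})
    return global_rows

# Hypothetical rule library from one member's local model to the global model.
rules = {"amt": ("amount", float), "cur": ("currency", str.upper)}
data_b = [{"amt": "12.50", "cur": "usd"}]
data_a = transform(data_b, rules)
print(data_a)
```

Keeping one rule library per local model means that adding a new financial member only requires a new set of rules, while the global model and its consumers stay unchanged.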

The above process addresses the collection and transformation of virtual enterprise financial resources in the big data environment, which is the first problem to be solved in integrating these resources. The next task is to classify and combine the financial resources of the various virtual enterprises gathered on the network platform, so as to achieve the integration of financial resources for virtual enterprises.

The similarity matrix W is constructed with the similarity equation, and then the construction matrix L is obtained. The eigenvectors corresponding to the largest κ eigenvalues of the construction matrix L are selected and normalized. In the κ-dimensional space, it forms the expression corresponding to the original data, and achieves the purpose of column dimension reduction.

Construct the similarity matrix W. The similarity is calculated by $W_{ij} = \exp (- \| x_{i} - x_{j} \|^{2} / (2 σ^{2}))$ (7) where W_ii = 0, σ is the scale parameter, and x_i and x_j are two financial resources.

Construct the matrix L = D^{-1/2} W D^{-1/2}, where D is the diagonal matrix given by $D_{ii} = \sum_{j = 1}^{n} W_{ij}$ (8)

Select the eigenvectors x₁, …, x_κ corresponding to the κ largest eigenvalues of L to construct the matrix X = [x₁, …, x_κ] ∈ R^{n′×κ}.

Normalize the row vectors of the matrix X to obtain $Y_{ij} = X_{ij} / \sqrt{\sum_{j} X_{ij}^{2}}$ (9)
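The matrix constructions in Eqs. (7) to (9) can be sketched in plain Python as follows. The sketch deliberately omits the eigen-decomposition, which would normally be delegated to a linear algebra library; it assumes resources are numeric feature vectors, and the function names are hypothetical.

```python
import math

def gaussian_affinity(points, sigma):
    """Eq. (7): Gaussian similarity with a zero diagonal."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W

def normalized_matrix(W):
    """Eq. (8) and L = D^(-1/2) W D^(-1/2), computed entry by entry."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
             for j in range(n)] for i in range(n)]

def normalize_rows(X):
    """Eq. (9): scale every row of X to unit Euclidean length."""
    Y = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        Y.append([v / norm if norm else 0.0 for v in row])
    return Y
```

The rows of Y are the κ-dimensional representations that the subsequent fuzzy clustering operates on.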

The fuzzy clustering algorithm [12] needs to specify the number of clusters in advance, and it cannot automatically determine the number of clusters based on the data itself. In the fuzzy clustering, the base of clustering fuzzy membership is introduced, and the optimal number of clusters is determined through deleting and combining the cluster centers by membership base. The fuzzy membership base is given by $N_{ι} = \sum_{j = 1}^{κ} u_{ι j}$ (10) where N_ι is the sum of fuzzy membership of all samples in the ιth data class, that is, the fuzzy membership base of the ιth cluster, u_ιj is the fuzzy membership of the jth sample belonging to the ιth cluster.

Parameter initialization. The initial cluster number C is set to the maximum cluster number C_max, ɛ is the iteration threshold, and ɛ₁ is the threshold of the clustering fuzzy membership base.

Initialize the fuzzy membership matrix U⁰ and the cluster centers V⁰, and set the iteration counter t = 0.

Calculate the clustering fuzzy membership base N_ι. If N_ι < ɛ₁, delete the cluster v_ι (ι = 1, 2, …, C) and update the cluster number C.

If C < C_max, set C_max = C.

Clusters are deleted according to the threshold of the clustering fuzzy membership base, so the number of clusters decreases. When the iteration threshold is reached, the optimal number of clusters has been found, so the number of clusters is determined automatically in preparation for fast automatic clustering.
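The pruning rule based on Eq. (10) can be sketched as below. The sketch assumes the membership matrix is stored with one row per cluster and one column per sample, and it does not renormalize the remaining memberships after a deletion, since the text does not specify that step. The name `prune_clusters` is hypothetical.

```python
def prune_clusters(U, centers, eps1):
    """Drop clusters whose membership base N = sum_j U[i][j] is below eps1.

    U[i][j] is the fuzzy membership of sample j in cluster i.
    Returns the reduced membership matrix and the reduced list of centers.
    """
    keep = [i for i, row in enumerate(U) if sum(row) >= eps1]
    return [U[i] for i in keep], [centers[i] for i in keep]

# Hypothetical memberships of two samples in two clusters.
U = [[0.9, 0.8], [0.1, 0.2]]
centers = [[0.0], [5.0]]
U_new, centers_new = prune_clusters(U, centers, eps1=0.5)
print(len(centers_new))
```

Starting from C_max clusters and pruning in every iteration lets the cluster count shrink toward a value supported by the data instead of being fixed in advance.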

In the fuzzy clustering algorithm, the weighting exponent m′ has a great influence on the process and result of clustering. The larger the value of m′, the smaller the objective function value and the lower the noise. However, the larger the value of m′, the fuzzier the clustering results. Obtaining m′ usually requires many experiments or experience. Here m′ is added to the particle encoding and given a velocity component, so that it evolves with the particle swarm [13] to find a suitable value.

Assume that in the D′-dimensional search space there is a population of w particles. The function corresponding to each big data information feature vector X_ι is expressed as $l_{ι} (κ) = (1 - ρ) l_{ι} (κ - 1) + f (x_{ι} (κ))$ (11) where f (x_ι (κ)) is the fitness function of X_ι, and ρ is the global optimization particle weight of the ιth particle at time κ.

Set the threshold N_th. When N_eff < N_th, the probability of the movement of the ηth particle is $x_{k' + 1} = \sin (a / x_{k'})$ , $-1 \leq x_{k'} \leq 1$ , where x_k′ is the k′th dynamic inertia weight and a is the control parameter of the cluster center. The probability density function of the optimal clustering solution is $q (x_{k^{'}}^{ι} / x_{k^{'} - 1}^{ι})$ . According to the update iteration order in the meme group, Σ_τ = diag (max(σ_τ - τ, 0)) is obtained. According to the data clustering task, the inner weight of the fitness function is adjusted to obtain the clustering weight coefficient of the particle swarm optimization algorithm. $ω = \begin{cases} ω (t) \cdot ω_{s}, & κ \geq α \\ ω (t) \cdot 1 / ω_{e}, & κ < β \end{cases}$ (12) where {α, β} is the diversity convergent objective function. The optimized clustering objective function is given by

$J_{m} (U, V) = \sum_{κ = 1}^{n^{'}} \sum_{ι = 1}^{c^{'}} μ_{ι κ}^{m} (d_{ι κ})^{2}$ , $x_{ι} = x_{ι min} + c^{'} x_{ι} \cdot (x_{ι max} - x_{ι min})$ (13) where the location of the particle corresponds to the o clustering centers of the sample data. Besides the particle location, the fitness and velocity of the particle are coded. If the attribute vector dimension of the sample data is d, the position and velocity of the particle are o × d-dimensional matrices.

The particle swarm optimization algorithm is prone to premature and slow convergence. The chaos mapping method is used to optimize the particle swarm optimization, to lead the particle to escape the local optimal solution and to accelerate the convergence. In the chaos method, first a chaotic sequence is generated with Logistic mapping, which is expressed as $Z_{n^{'} + 1} = μ Z_{n^{'}} (1 - Z_{n^{'}})$ (14)

In order to improve the convergence speed and global optimization ability of the particle swarm, the generated chaotic sequence is used to disturb the global optimal particle. For the above w particles, each dimension is mapped to the range (0, 1) to obtain D = (d₁, …, d_w), where d_φ is the φth dimension of the particle, expressed as $d_{φ} = ({gt}_{φ} - a) / (b - a)$ (15) where gt_φ is the φth dimension of the particle with the highest fitness, and a and b are the lower and upper limits of the particle value in any dimension.

The chaos perturbation [15-17] is applied iteratively to obtain the new sequence Z₁ = (Z₁₁, …, Z_1w). The new sequence Z₁ is taken as a new particle and its fitness is calculated. If the fitness of Z₁ is higher than that of the optimal solution obtained by the previous search, Z₁ is taken as the current optimal solution.
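The chaotic disturbance of the global best particle, Eqs. (14) and (15), might look like the following sketch. Several details are assumptions made for illustration: the control parameter μ = 4, the number of logistic iterations, and the mapping of the chaotic values back to the range [a, b] before the fitness is evaluated. The function name `chaos_perturb` is hypothetical.

```python
def chaos_perturb(gbest, fitness, a, b, mu=4.0, steps=10):
    """Perturb the global best particle with a logistic-map sequence.

    gbest   : list of coordinates of the current best particle, each in [a, b]
    fitness : callable returning a score to maximize
    Returns the best position found and its fitness.
    """
    z = [(g - a) / (b - a) for g in gbest]        # Eq. (15): map to (0, 1)
    best, best_fit = list(gbest), fitness(gbest)
    for _ in range(steps):
        z = [mu * v * (1.0 - v) for v in z]       # Eq. (14): logistic map
        cand = [a + v * (b - a) for v in z]       # assumed map back to [a, b]
        f = fitness(cand)
        if f > best_fit:                          # keep only improvements
            best, best_fit = cand, f
    return best, best_fit
```

Because a candidate replaces the incumbent only when its fitness is higher, the perturbation can never make the swarm's best solution worse; it only offers a chance to leave a local optimum.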

2.3 Resource integration

Assume there are $\dot{n}$ service records provided by virtual enterprise financial resource providers for task requests. $S_{s} = \{S_{1}, \dots, S_{\dot{n}}\}$ is the data resource service set, and S_ξ is the service of the ξth resource provider in the historical data. sr_ξ is the set of service resources of S_ξ provided by the enterprise resource provider for the task request, where sr_ξς is the ςth service resource of S_ξ. The service support of the resource provider is expressed as $Support (A \Rightarrow B) = P (AB)$ (16) where Support (A ⇒ B) represents the probability that resources A and B appear in the same data service. If the probability is small, A is unrelated to B. If the probability is large, A is related to B.

The service confidence C is expressed as $C (A \Rightarrow B) = Support (A \Rightarrow B) / Support (A)$ (17) where C (A ⇒ B) is the probability that resource B is selected when enterprise financial resource A is selected by the resource customer.

The service lift of the resource provider is given by $Lift (A, B) = C (A \Rightarrow B) / Support (B)$ (18) where Lift (A, B) describes the correlation between resources A and B.
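Support, confidence, and lift as defined in Eqs. (16) to (18) can be computed directly from historical service records. The sketch below assumes each record is a set of resource names; the resource names and function names are hypothetical, and the functions do not guard against resources that never occur (which would divide by zero).

```python
def support(records, items):
    """Fraction of service records that contain every resource in `items`."""
    items = set(items)
    return sum(1 for r in records if items <= set(r)) / len(records)

def confidence(records, a, b):
    """Eq. (17): probability that b is selected given that a is selected."""
    return support(records, [a, b]) / support(records, [a])

def lift(records, a, b):
    """Eq. (18): confidence of a => b relative to the base rate of b."""
    return confidence(records, a, b) / support(records, [b])

# Hypothetical historical service records.
records = [{"audit", "tax"}, {"audit", "tax"}, {"audit"}, {"payroll"}]
print(support(records, ["audit", "tax"]),
      confidence(records, "audit", "tax"),
      lift(records, "audit", "tax"))
```

A lift above 1 indicates that the two resources are requested together more often than independence would predict, which is the signal used to bundle them into a service package.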

After the static integration, the data resource service system can provide many service packages to customers, but when the existing packages cannot meet the demand, the system needs to dynamically analyze and process the customer’s request for service resources. This dynamic process of integrating service resources is called the resource dynamic integration process. During dynamic integration, a data resource provider must meet the hard constraints of the service demander when it provides a service resource. For soft constraints, if no fully matched service resource exists, the best-matching resource service can be recommended. The constraints on enterprise service resources are as follows.

The price reflects the quality of service to a certain extent. In an individual service demand, the price is a specific restriction placed by the service demander on the data resource service or service resource. In a group data resource service demand, the price constraint reflects the interests and requirements of the whole group of enterprise financial data resource service demanders. There is a functional relationship between the deviation from the price constraint and customer satisfaction. The relationship between the lowest price and customer satisfaction is approximately consistent with an exponential function and is given by ${cs}_{ξ} = \begin{cases} 1, & p \geq p_{min} \\ p / p_{min}, & 0 \leq p < p_{min} \end{cases}$ (19) where cs_ξ is the customer satisfaction and p_min is the lowest accepted price. The relationship between the highest price and customer satisfaction is given by ${cs}_{ξ} = \begin{cases} 1, & 0 < p \leq p_{max} \\ 1 - (p - p_{max}) / p_{max}, & p > p_{max} \end{cases}$ (20) where p_max is the highest accepted price.
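The two price satisfaction functions of Eqs. (19) and (20) translate directly into code. In this minimal sketch the function names are hypothetical, and no clamping is applied, so Eq. (20) becomes negative once the price exceeds twice the accepted maximum, exactly as the formula is written.

```python
def cs_lowest_price(p, p_min):
    """Eq. (19): full satisfaction at or above the lowest accepted price."""
    return 1.0 if p >= p_min else p / p_min

def cs_highest_price(p, p_max):
    """Eq. (20): satisfaction falls linearly once the price exceeds p_max."""
    return 1.0 if p <= p_max else 1.0 - (p - p_max) / p_max

# Hypothetical prices: half of the minimum, and 50% above the maximum.
print(cs_lowest_price(50.0, 100.0), cs_highest_price(150.0, 100.0))
```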

A financial data resource service demand places a service time constraint and a service quantity constraint on service resources. The service time constraint measures how much of the service time required by the customer is covered by the service time offered by the provider. The satisfaction of the service time constraint is calculated by

${cs}_{ξ} = \begin{cases} 1, & s.startTime \leq d.startTime,\ s.endTime \geq d.endTime \\ \frac{| (s.startTime, s.endTime) \cap (d.startTime, d.endTime) |}{| (d.startTime, d.endTime) |}, & \text{otherwise} \end{cases}$ (21) where s.startTime is the starting time of the service, s.endTime is the ending time of the service, d.startTime is the starting time of the service required by the service demander, d.endTime is the ending time of the service required by the service demander, and | · | denotes the length of a time interval.
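The time constraint satisfaction can be read as the fraction of the demanded interval that the offered service interval covers. The sketch below computes that fraction; it assumes times are plain numbers on a common scale and that the demanded interval has positive length. The name `cs_time` is hypothetical.

```python
def cs_time(s_start, s_end, d_start, d_end):
    """Fraction of the demanded interval [d_start, d_end] covered by the service."""
    overlap = max(0.0, min(s_end, d_end) - max(s_start, d_start))
    return overlap / (d_end - d_start)

# Hypothetical hours: the demand runs from 2 to 6.
print(cs_time(0, 10, 2, 6), cs_time(4, 10, 2, 6), cs_time(7, 9, 2, 6))
```

When the service window contains the whole demanded window, the overlap equals the demand length and the ratio is 1, so a separate full-coverage branch is not needed in code.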

The service quantity constraint measures whether the number of service resources allocated by the service meets the demand of the users, as given by Eq. (22). If the ratio of allocated to required services is larger than or equal to 1, the satisfaction of the service quantity constraint is 1. ${cs}_{ξ} = \begin{cases} 1, & N_{s} \geq N_{d} \\ N_{s} / N_{d}, & N_{s} < N_{d} \end{cases}$ (22) where N_s is the number of allocated services, and N_d is the number of required services.

The constraints on the service provider’s quality of service and on the credibility of the service institution refer to the service quality rating and the credibility rating of the service provider. The satisfaction is calculated by ${cs}_{ξ} = \begin{cases} 1, & s.{att}_{j} \geq d.{att}_{j} \\ s.{att}_{j} / d.{att}_{j}, & s.{att}_{j} < d.{att}_{j} \end{cases}$ (23) where s.att_j is the attribute att_j of the service quality, and d.att_j is the attribute att_j of the service demand.

In the financial data resource service, the constraints on service mode, service domain, resource category, resource subcategory, service resource provider, and service nature are hard constraints. The constraint satisfaction is calculated by ${cs}_{ξ} = \begin{cases} 0, & s.{att}_{k} \neq d.{att}_{k} \\ 1, & s.{att}_{k} = d.{att}_{k} \end{cases}$ (24) where s.att_k is the attribute att_k of the service resource, and d.att_k is the attribute att_k of the service demand.
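Equations (22) and (23) share the same shape (full satisfaction when supply meets demand, otherwise the ratio), while Eq. (24) is an exact-match test. A minimal sketch with hypothetical function names, assuming the demanded value is positive:

```python
def cs_ratio(supplied, demanded):
    """Eqs. (22) and (23): 1 if supply meets demand, otherwise the ratio."""
    return 1.0 if supplied >= demanded else supplied / demanded

def cs_hard(s_attr, d_attr):
    """Eq. (24): hard constraints are satisfied only by an exact match."""
    return 1.0 if s_attr == d_attr else 0.0

# Hypothetical values: 3 of 4 requested services, and a matching service mode.
print(cs_ratio(3, 4), cs_hard("online", "online"))
```

A candidate with any hard constraint scoring 0 would be excluded before the soft constraint scores are combined.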

In the process of selection and combination of service resources, how can we not only satisfy the service needs of resource demander, but also make the overall utilization rate of service resources reach the highest? For this problem, an improved artificial bee colony based virtual enterprise financial resource dynamic integration algorithm is proposed to dynamically achieve the integration of different needs.

After similarity calculation, similar service needs are merged. Assume service demand set is SD = {d₁, ⋯, d_cn}, where d_ξ (ξ = 1, 2, …, cn) is the demand of the ξth service demander.

Assume there are $\ddot{n}$ service resources for customers to select from. These service resources constitute the subclass service resource set SR_φ = {sr_φ1, …, sr_φn}, where sr_φϑ is the ϑth service resource of the φth subclass service resource.

The overall service satisfaction is calculated by $sc = \sum_{φ = 1}^{cn} \sum_{ϑ = 1}^{n} {sc}_{φ ϑ} \cdot W_{ϑ}$ (25) where sc_φϑ is the satisfaction of the φth service demander with the ϑth constraint, W_ϑ is the weight of the ϑth constraint in the evaluation of service satisfaction, and $\sum_{ϑ = 1}^{n} W_{ϑ} = 1$ .

According to the service property of virtual enterprise financial resource, resources are divided into continuous service resource and intermittent service resource. For different types of financial data resources, resource utilization rate is given by $ut = \frac{\sum ({endTime}_{d} - {startTime}_{d})}{{endTime}_{xr} - {startTime}_{xr}}$ (26) $ut = \sum {number}_{d} / {number}_{xr}$ (27) where startTime_d is the starting time of the service demand, endTime_d is the ending time of the service demand, startTime_xr is the effective starting time of the service, endTime_xr is the effective ending time of the service, number_d is the number of the requested service resources, number_xr is the number of the provided service resources.
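The two utilization measures of Eqs. (26) and (27) can be sketched as follows. The sketch assumes demands are given as (start, end) pairs that do not overlap each other and fall inside the effective service window; the function names are hypothetical.

```python
def utilization_time(demands, service_window):
    """Eq. (26): total demanded time divided by the effective service time."""
    start, end = service_window
    return sum(d_end - d_start for d_start, d_end in demands) / (end - start)

def utilization_count(requested, provided):
    """Eq. (27): total requested resource count divided by the provided count."""
    return sum(requested) / provided

# Hypothetical data: two 2-hour demands in an 8-hour window, 5 of 10 units used.
print(utilization_time([(0, 2), (4, 6)], (0, 8)), utilization_count([2, 3], 10))
```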

Static service resource integration finds the optimal financial service resource solution for specific service needs. Because financial services have a priori nature, the service resource combinations produced by static integration are likely to be reused soon. The priori scheme set of financial data resource service is defined as $PSS = \{{ss}_{1}, \dots, {ss}_{ps}\}$ (28) where ss_φ (φ = 1, 2, ⋯, ps) is a resource combination scheme.

In the financial data resource service system, the service resources with more customer requests and higher service quality are integrated into the prior set of virtual financial service resources, which is defined as

$PriS = \{ s_{φ 1}, \dots, s_{φ ρ} \mid \forall s_{φ ϑ} . TQ \cap P (s_{φ ϑ}) \geq TP \}$ (29) where s_φϑ is the ϑth candidate service resource of the resource set with more customer requests and higher service quality in the subclass service resource set s_φ, TQ is the service resource quality threshold, and TP is the service resource priori threshold.

According to the similarity between service resources, the category of service resources can be narrowed further on the basis of the subclass resources. The similarity set is defined as

$SimS = \{ s_{φ 1}, \dots, s_{φ ϑ} \}, \forall Sim (s_{φ}, s_{φ}) \geq TS$ (30) where s_φϑ is the ϑth candidate service resource with similarity larger than or equal to the threshold in the subclass service resource set s_φ, and TS is the similarity threshold.

A dynamic integration algorithm based on artificial bee colony is proposed to solve the problem of dynamically selecting and combining financial data resources and rapidly establishing the service supply and demand relationship during the service process.

Solving for combined service resources means assembling, in real time, the service provider’s resources into a service once the service demander has submitted a service demand. According to the resource constraints of the financial service demand, the subclass service resources are determined. The food source of the artificial bee colony algorithm is then introduced, with the encoding given by $X_{τ} = \{s_{l 1}, \dots, s_{l σ}\}$ (31) where X_τ is the food source consisting of an $\bar{n}$ -dimensional vector of service resources, and s_φϑ is the ϑth candidate service resource of the φth subclass service resource in the food source X_τ.

The initial combined service resource solutions are generated in four parts. The first part is taken from the priori scheme set of financial data resource service. The second part is initialized from the priori set of service resources. The third part is initialized from the similarity set of financial data resources. The fourth part is generated from the general set of service resources. Assume the size of the initial set of combined financial data resource solutions is SN, and the proportions of the first, second, third, and fourth parts of the initial solution are α′, β′, χ′, δ′ (α′ + β′ + χ′ + δ′ = 1). The initialization steps of the combined financial data resource solution are as follows.

(1) Calculate the priori value VPS and the user satisfaction SVPS of each prior financial data resource service scheme in PSS.

(2) Sort the schemes in descending order of the priori value VPS.

(3) Select the priori service scheme with SVPS ≥ TSV, where TSV is the user satisfaction threshold.

(4) Repeat step (3) until α′ × SN combined service resources are obtained.

(5) Randomly select a subclass resource s_∂ in the priori set PriS and randomly select a candidate service resource from its service resource set, until the candidate service resources of all subclass resources have been traversed.

(6) Repeat step (5) until β′ × SN combined service resources are obtained.

(7) Randomly select a candidate service resource of each subclass resource s_∂ from the financial service resource similarity set SimS, until the candidate service resources of all subclass resources have been traversed.

(8) Repeat step (7) until χ′ × SN combined service resources are obtained.

(9) Randomly select a subclass resource s_∂ from the general set GenS of financial data service resources and randomly select a candidate service resource from its candidate service resource set, until the candidate service resources of all subclass resources have been traversed.

(10) Repeat step (9) until δ′ × SN combined service resources are obtained.
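The four-part initialization can be sketched as below. The data layout is an assumption made for illustration: prior schemes are tuples of (solution, VPS, SVPS), and each of the three candidate pools is a dictionary from subclass name to a list of candidate resources. Part sizes are obtained by rounding, so they may not sum exactly to SN, and fewer prior schemes than requested may pass the threshold. The function name `init_solutions` and the fixed seed are hypothetical.

```python
import random

def init_solutions(SN, schemes, pri_s, sim_s, gen_s, fracs, tsv, seed=0):
    """Build the initial food sources from four pools.

    schemes : list of (solution, VPS, SVPS) tuples from the prior scheme set
    pri_s, sim_s, gen_s : dict mapping subclass -> list of candidate resources
    fracs   : proportions of the four parts, summing to 1
    tsv     : user satisfaction threshold for prior schemes
    """
    rng = random.Random(seed)
    a, b, c, d = fracs
    # Steps (1)-(4): best prior schemes that reach the satisfaction threshold.
    ranked = sorted((s for s in schemes if s[2] >= tsv),
                    key=lambda s: s[1], reverse=True)
    sols = [list(s[0]) for s in ranked[:round(a * SN)]]
    # Steps (5)-(10): one random candidate per subclass from each pool.
    for pool, frac in ((pri_s, b), (sim_s, c), (gen_s, d)):
        for _ in range(round(frac * SN)):
            sols.append([rng.choice(pool[k]) for k in sorted(pool)])
    return sols
```

Seeding part of the population from schemes that already satisfied users gives the search a good starting region, while the random parts preserve diversity.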

The artificial bee colony algorithm evaluates the quality of a food source by its fitness. The choice of fitness function directly affects the convergence speed of the artificial bee colony algorithm and its ability to find the optimal solution. According to the characteristics of the financial data service resource composition problem, the fitness function of the combined financial data service resource solution is designed. A combined financial data service resource solution is expressed as X_τ = {s_l1, …, s_lη}, where s_lη is the ηth candidate service resource of the τth subclass service resource in the food source X_τ. The fitness function of the combined service resource solution X_τ = {s_l1, …, s_lη} is given by $fit (X_{τ}) = μ \times sa + η \times u$ (32) where sa is the satisfaction of the service demand customers with the combined financial data service resource, u is the resource utilization rate, μ is the weight of the satisfaction of the service demander, and η is the weight of the resource utilization rate.
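The fitness of Eq. (32), together with the weighted satisfaction of Eq. (25) that feeds into it, can be sketched as follows. The equal default weights μ = η = 0.5 are an assumption for illustration, as are the function names and sample values.

```python
def weighted_satisfaction(sc, weights):
    """Eq. (25): sc[i][j] is demander i's satisfaction with constraint j."""
    return sum(s * w for row in sc for s, w in zip(row, weights))

def fitness(sa, u, mu=0.5, eta=0.5):
    """Eq. (32): weighted sum of customer satisfaction and resource utilization."""
    return mu * sa + eta * u

# Hypothetical case: one demander, two constraints weighted 0.6 and 0.4.
sa = weighted_satisfaction([[1.0, 0.5]], [0.6, 0.4])
print(fitness(sa, 0.6))
```

Raising μ relative to η biases the search toward solutions that please the demander even when resources are underused, and the reverse favors provider-side efficiency.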

There is a certain partial order relationship between financial data services. Based on this domain knowledge, a directed, dynamic, adaptive neighborhood search strategy is proposed for the priori scheme set of financial data resource service, the priori set of service resources, and the resource similarity set. The search process first determines the search direction and the search step, and then searches for and generates new food sources in the neighborhood.

3 Results

In the experiment, four original enterprise information data sources are used, of the types MySQL, XML, TXT, and Excel. The MySQL data come from an application platform, and the other data sources come from manual collection and network collection. All data in these 4 data sources are integrated in the manner described above. The configuration of the integration environment is an Intel(R) Core(TM) i3 M370 2.40 GHz CPU, 2 GB of memory, and a 500 GB hard disk. The energy consumption, service-level agreement (SLA) violation rate, and service customer satisfaction are compared between the resource integration algorithm based on improved artificial bee colony, the resource integration algorithm based on phase space reconstruction, the resource integration algorithm based on deep learning model, and the resource integration algorithm based on virtual cluster online migration.

To reflect the requirements of low energy consumption and high service quality placed on the virtual enterprise financial resource data center, a comprehensive evaluation index of service quality and energy consumption, ESV, is defined as $ESV = EC \times SLAV$ (33) where EC is the energy consumption and SLAV is the service-level agreement violation rate.

The lower the value of ESV, the lower the energy consumption and the higher the service quality of the resource integration.
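As a small worked example of Eq. (33), the sketch below computes ESV for two algorithms and picks the one with the lower value. It assumes EC denotes energy consumption and SLAV the service-level agreement violation rate; the algorithm labels and measurements are hypothetical and are not taken from the experiments.

```python
def esv(ec: float, slav: float) -> float:
    """Eq. (33): combined index, lower is better."""
    return ec * slav


# Hypothetical (EC, SLAV) measurements; units and values are illustrative.
measurements = {"A": (120.0, 0.02), "B": (150.0, 0.05)}

esv_scores = {name: esv(ec, slav) for name, (ec, slav) in measurements.items()}
preferred = min(esv_scores, key=esv_scores.get)
```

Because the index is a product, an algorithm can only score well if it keeps both factors low; a low energy figure does not compensate for a high violation rate.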

Figures 1 and 2 compare the energy consumption and the service-level agreement (SLA) violation rate of the different algorithms, respectively.

Fig.1

Comparison of energy consumption between different algorithms.

Fig.2

Comparison of service-level agreement violation rate between different algorithms.

From Fig. 1, it can be seen that the energy consumption of the 4 algorithms changes little as the data volume increases. The energy consumption and the service-level agreement (SLA) violation rate of the proposed algorithm are always lower than those of the other 3 algorithms. From Fig. 2, it can be seen that, when the amount of virtual enterprise financial data is 25 TB, the resource integration algorithm based on virtual cluster online migration reaches the maximum SLA violation rate. This algorithm migrates the virtual cluster within the isomorphic subdomain without taking customer needs into account. The optimality of its integration result depends entirely on the migration quality of the virtual cluster, which raises the SLA violation rate. The resource integration algorithm based on phase space reconstruction continuously improves the integration of objective demand during resource integration. Its SLA violation rate is therefore lower than those of the algorithms based on the deep learning model and on virtual cluster online migration, but higher than that of the proposed algorithm.

Figure 3 compares the service customer satisfaction achieved within the same time by the proposed algorithm, the resource integration algorithm based on phase space reconstruction, the resource integration algorithm based on a deep learning model, and the resource integration algorithm based on virtual cluster online migration. The satisfaction is quantized as a constant C, and the optimal satisfaction is 1. From Fig. 3, it can be seen that, as the scale of resources increases, the optimality of the 4 resource integration algorithms reaches a peak value and then decreases. The algorithms based on the deep learning model and on virtual cluster online migration do not consider user demand, so the optimality of their results depends entirely on the quality of the resources. In the proposed algorithm, the extracted resources are transformed, and the service resources that satisfy the user service requirements are selected and combined from the unified data resource after transformation. The algorithm takes the service requirements of different needs into account, which makes the corresponding customer satisfaction higher.

Fig.3

Comparison of the service customer satisfaction between different algorithms within the same time.

A data matching method is used to verify the improvement of data quality during the integration of virtual enterprise financial resources. The integrated data obtained from the experiment is checked manually, and the real financial resources of the enterprises are obtained as accurate reference data. The experimental data is then compared with the accurate data.

The accuracy is obtained by comparing the original data, the integrated target data, and the real data. Two simple equations are used to estimate it. The first calculates the similarity of each attribute. The second calculates the weighted similarity of the data in the dataset, which is the accuracy of the dataset. $A[D_x(W_{i})] = (\sum_{j = 1}^{n} C_{ij}) / n$ (34) $A'(D_{x}) = \sum_{i = 1}^{k} A[D_x(W_{i})] \times Q_{i}$ (35) where D_1 to D_4 are the 4 original datasets, C_ij is the similarity between the jth value of attribute W_i in D_x and the corresponding value in the accurate data, n is the number of values compared for each attribute, and k is the number of attributes. For simplicity, the similarity is approximated by whether the compared values are equal. Q_i is the weight of attribute W_i. To keep the comparison consistent, the same weight is used in all accuracy estimates.
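Equations (34) and (35) can be sketched as follows, using the exact-match simplification described above (C_ij is 1 when the values are equal and 0 otherwise). The record layout, the attribute names `account` and `amount`, the sample values, and the equal weights of 0.5 are all hypothetical.

```python
def attribute_accuracy(dataset, reference, attr):
    """Eq. (34): mean of C_ij over the n records, with C_ij = 1 on exact match."""
    matches = [1 if row[attr] == ref[attr] else 0
               for row, ref in zip(dataset, reference)]
    return sum(matches) / len(matches)


def dataset_accuracy(dataset, reference, weights):
    """Eq. (35): sum over attributes of per-attribute accuracy times Q_i."""
    return sum(attribute_accuracy(dataset, reference, attr) * q
               for attr, q in weights.items())


# Hypothetical accurate (real) data, aligned record by record with the dataset.
reference = [
    {"account": "A1", "amount": 100},
    {"account": "A2", "amount": 250},
    {"account": "A3", "amount": 75},
    {"account": "A4", "amount": 40},
]
# Hypothetical dataset containing three errors.
original = [
    {"account": "A1", "amount": 100},
    {"account": "A2", "amount": 205},  # wrong amount
    {"account": "A9", "amount": 75},   # wrong account
    {"account": "", "amount": 40},     # missing account
]
weights = {"account": 0.5, "amount": 0.5}  # equal weights Q_i, summing to 1

account_acc = attribute_accuracy(original, reference, "account")
amount_acc = attribute_accuracy(original, reference, "amount")
accuracy = dataset_accuracy(original, reference, weights)
```

Here two of four account values and three of four amounts match, so the per-attribute accuracies are 0.5 and 0.75 and the dataset accuracy is 0.625. The sketch assumes the records are already aligned one to one with the reference data, which in practice is the job of the data matching step.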

The accuracy of the real data is set to 100%. The original datasets and the integrated data are each compared with the real data to obtain their accuracy. The results are shown in Table 1.

Table 1

Comparison of data accuracy before and after integration

Original data	Accuracy before integration (%)	Accuracy after integration (%)	Improvement of accuracy (percentage points)
MySQL dataset	72.36	89.36	17.00
XML dataset	55.84	88.69	32.85
TXT dataset	65.25	93.36	28.11
Excel dataset	55.36	97.25	41.89

From Table 1, it can be seen that, after integration, the accuracy of the four virtual enterprise financial datasets increased by 17.00, 32.85, 28.11, and 41.89 percentage points, respectively, an average increase of 29.96 percentage points. The quality of the integrated data is clearly better than that of the original data. After the extraction, transformation, and classification of enterprise financial resources, the quality of the original data is improved automatically, so the data can serve the enterprise platform more effectively.
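The figures quoted from Table 1 can be checked with a few lines of arithmetic. The sketch below uses only the before and after accuracies reported in the table; the dictionary keys are shorthand labels for the four datasets.

```python
# (accuracy before, accuracy after) in percent, copied from Table 1.
table1 = {
    "MySQL": (72.36, 89.36),
    "XML": (55.84, 88.69),
    "TXT": (65.25, 93.36),
    "Excel": (55.36, 97.25),
}

# Improvement per dataset, in percentage points, rounded to two decimals.
gains = {name: round(after - before, 2)
         for name, (before, after) in table1.items()}

# Unweighted mean improvement over the four datasets.
mean_gain = round(sum(gains.values()) / len(gains), 2)
```

The differences reproduce the last column of Table 1, and their unweighted mean of 119.85 / 4 rounds to the reported 29.96.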

4 Conclusions

This paper summarizes the current state of research on the big data environment and on virtual enterprise financial resource integration, and designs a financial resource integration mechanism for virtual enterprises. Taking a company's financial data as an example, the proposed mechanism is then verified.

The main contributions are as follows. A data integration mechanism covering the extraction, transformation, and combination of virtual enterprise financial resources is presented, and the integration technology for virtual enterprise financial resources is studied. To better serve financial service demanders, virtual enterprise financial resource integration is realized, including static integration of enterprise financial resources and dynamic integration with real-time response. An association rule mining algorithm and the artificial bee colony algorithm are applied to the integration process of enterprise financial resources, which effectively improves customer satisfaction and resource utilization.

In the big data environment, the virtual enterprise financial resource integration algorithm needs continued refinement. Future work should consider the extraction of real-time dynamic data resources and design a better financial resource extraction mechanism. The effective combination of data resources when the financial resources of a virtual enterprise are limited also needs further research.

References

1. Yao and Deng, Dynamic resource integration optimisation of global distributed manufacturing: An embeddedness-interaction perspective, International Journal of Production Research 54(23) (2016), 7143–7157.

2. Wen, Li, Jin, et al., Energy-efficient virtual resource dynamic integration method in cloud computing, IEEE Access (99) (2017), 1–1.

3. Xing and Xia, Comparison of centralised scaled unscented Kalman filter and extended Kalman filter for multisensor data fusion architectures, IET Signal Processing 10(4) (2017), 359–365.

4. Mazher, Li, Moughal T.A., et al., A decision fusion method using an algorithm for fusion of correlated probabilities, International Journal of Remote Sensing 37(1) (2016), 14–25.

5. Wei X.H., Shen X.R. and Li H.L., Virtual resource consolidation algorithm based on virtual cluster live migration, Journal of Jilin University (Science Edition) 54 (2016), 77–84.

6. Zhao, Gao Z.Y., Gao J.M. and Wang R.X., A fusion method of multisource data using phase space reconstruction, Journal of Xian Jiaotong University 50 (2016), 84–89.

7. Y.J., Xue Y.H., Liu and Li Y.J., Data aggregation algorithm based on the model of deep learning, Journal of Tianjin University of Science & Technology 32 (2017), 71–74.

8. Tang J.M. and Wang, Clustering data fusion algorithm based on relay node mechanism, Journal of Yunnan University (Natural Sciences Edition) 38 (2016), 703–707.

9. Fei X.J. and Li X.F., Wireless sensor network data fusion algorithm based on compressed sensing theory, Journal of Jilin University (Science Edition) 54(3) (2016), 575–579.

10. Gao, He, Gao, Zhan and Wu, Design of an efficient multi-objective recognition approach for 8-ball billiards vision system, Kuwait Journal of Science 45(1) (2018), 39–53.

11. Subklay and Pochai, Numerical simulations of a water quality model in a flooding stream due to dam-break problem using implicit and explicit methods, Journal of Interdisciplinary Mathematics 20(2) (2017), 461–495.

12. De-Tian, Hai-Yan and Zhi-Guo, Effects of fluctuations in international oil prices on China’s price level based on VAR model, Journal of Discrete Mathematical Sciences & Cryptography 20(1) (2017), 125–135.

13. Santos Bruzon and Maria Garrido, Symmetries and conservation laws of a KdV6 equation, Discrete and Continuous Dynamical Systems-Series S 11(4SI) (2018), 631–641.

14. and Liu, A study on the impact of environmental education on individuals’ behaviors concerning recycled water reuse, Eurasia Journal of Mathematics Science and Technology Education 13(10) (2017), 6715–6724.

15. Shen, Zhao, Xia and Du, A deep Q-learning network for ship stowage planning problem, Polish Maritime Research 24(SI) (2017), 102–109.

16. Birs, Muresan, Folea and Prodan, A comparison between integer and fractional order PDμ controllers for vibration suppression, Mathematics and Nonlinear Sciences 1 (2016), 273–282.

17. Awadalla N.S., Hanna M.A., Ismail, Hassan I.A. and Elkhamisy M.A., Period variation study and light curve analysis of the eclipsing binary GSC 02013-00288, Mathematics and Nonlinear Sciences 1 (2016), 321–334.