DRC-EDI: An integrity protection scheme based on data right confirmation for mobile edge computing

Abstract

As far as mobile edge computing is concerned, it is necessary to ensure the data integrity of latency-sensitive applications during the process of computing. While certain research programs have demonstrated efficacy, challenges persist, including the inefficient utilization of computing resources, network backhaul issues, and the occurrence of false-negative detections. To solve these problems, an integrity protection scheme is proposed in this paper on the basis of data right confirmation (DRC). Under this scheme, a two-layer consensus algorithm is developed. The outer algorithm is applied to establish a data authorization mechanism by marking the original data source to avoid the false negative results caused by network attacks from the data source. In addition, blockchain-based mobile edge computing (BMEC) technology is applied to enable data sharing in the context of mobile edge computing while minimizing the network backhaul of edge computing. Based on the Merkle Tree algorithm, the inner layer algorithm is capable not only of accurately locating and promptly repairing damaged data but also of verifying all servers in the mobile edge computing network either regularly or on demand. Finally, our proposal is evaluated against two existing research schemes. The experimental results show that our proposed scheme is not only effective in ensuring data integrity in mobile edge computing, but it is also capable of achieving better performance.

Keywords

Mobile edge computing; integrity; data right confirmation; consensus; Merkle tree

1. Introduction

With the rapid development of the Internet of Things (IoT) industry, mobile edge computing (MEC) has become a new paradigm [11,13]. It integrates artificial intelligence [7] and advanced communication technology [24] and provides reliable service support for delay-sensitive applications [1] due to its unique low latency and high resource utilization.

Although edge computing provides relatively sufficient resources for terminal devices with poor computing power, the size of tasks and the number of terminal computing devices are unpredictable, so the computing tasks of edge servers can still cause excessive network load. In response to these challenges, collaborative computing [15,20,23] and edge caching [12,27,33] have become effective methods for overcoming the resource constraints of edge networks.

Collaborative computing technology and edge caching technology enable an edge server’s computing tasks or services to be shared with other edge servers. These servers can then jointly compute and process data. At present, many studies [8,26,36] assume that edge servers are 100% reliable and secure, which is unreasonable. It is not unreasonable to assume that data in one of the shared edge servers is changed or corrupted during computational processing. This event will directly affect the results of the calculations. It may even cause the task to fail. These calculations will then need to be resubmitted because they returned incorrect parameters. In addition, edge servers using cooperative computing technology will generate some cooperative tasks. Once the cooperative task calculation fails, all edge servers participating in the calculation need to recalculate the subtasks and recheck the integrity of the subtasks. This computational impact is much more significant than that of non-cooperative tasks. In the Appendix, we give a proof. Therefore, for each edge server, ensuring the integrity of computing task data has become a significant challenge in edge computing applications.

Influenced by the data protection strategy of cloud computing, most of the data protection research related to edge computing is based on the challenge-proof strategy [10,37]. In this strategy, the computing device or service provider initially initiates a data check request with the edge server. The edge server that receives the request will perform local computing verification based on the correct data replica digest provided by the computing device. When the verification is successful, the edge server communicates the verification result to the service provider; otherwise, the terminal computing device must resend the computing data. The computation and network backhaul costs are enormous [28] if this method occurs in many edge servers under collaborative computing.

In addition, in the ever-changing edge network environment, it is essential to confirm the rights of the data itself, that is, to record the data usage process. Suppose that, after the terminal computing device generates a computing task, a penetration attack occurs while the task is being offloaded to the edge server, and the task is tampered with. The terminal computing device then sends out an incorrect task, and the edge server, after receiving it, obtains an incorrect calculation result through collaborative or local calculation. This process wastes the computing resources of the edge network and increases the network load.

Therefore, this paper proposes a protection scheme based on data right confirmation to solve the above-mentioned problems. The scheme realizes the data right confirmation operation based on the blockchain. When the data replica of an edge server is damaged, the scheme can trace the nearest edge server that holds a reliable data replica according to the data information stored on the blockchain. At this point, the edge server only needs to validate the integrity of the data stored on the blockchain. It does not need to interact with other servers for confirmation, which significantly reduces the number of network backhauls. In addition, we adopt a new consensus mechanism that can verify the data replica in the edge server regularly or on demand to ensure the correctness of the edge server's computing tasks. This avoids wasting edge server computing resources and solves the problem of false-negative detection results. The contributions of this paper are as follows:

We propose a lightweight, data-centric protection scheme based on data right confirmation. In this scheme, we propose a two-round consensus mechanism based on PoW and extend it with two new algorithms to form a complete consensus algorithm. The algorithm can check the computed data content in time and accurately locate the damaged data.

We solve the problem of false-negative data detection results and combine BMEC technology to minimize network backhaul and fully use edge server computing resources.

We evaluate our scheme against some existing advanced schemes. Experimental results and security analysis show that our scheme achieves a 100% damage detection rate under different numbers of data replicas and data blocks and has a relatively low detection delay. This scheme can effectively guarantee data integrity in edge computing and significantly save the computing and communication resources of the edge network.

The rest of the paper is organized as follows: In Section 2, we summarize the current data protection schemes for edge computing. In Section 3, we define the general architecture of the related system and discuss the related work in detail. In Section 4, we expound upon the blockchain consensus network, elucidating the pertinent algorithms encompassing both internal and external aspects. In Section 5, we evaluate the performance of the scheme and compare it with other schemes. We then conclude the paper in Section 6.

2. Related work

BMEC is an innovative architecture for mobile edge computing [18] that will play a vital role in the future development of IoT and 5G/6G communications. MEC can provide devices with rich computing power in a mobile blockchain environment. Moreover, the blockchain can provide a distributed database to store various data, such as transaction content of computing devices, device information, etc. The blockchain packages them into a series of blocks. It reaches a consensus in the mobile edge computing network, which not only solves the resource limitation of the blockchain itself but also provides security such as resource traceability and transaction confirmation to MEC.

At present, many MEC safety studies are using BMEC technology. For example, [30,34,35] implemented an effective mobile device authentication scheme using BMEC technology and [3,5,25] applied BMEC technology to the unmanned aerial vehicle (UAV) network. Some studies [9,16,29,31] also use BMEC architecture for data integrity protection. [9] proposed a dynamic BMEC data integrity protection architecture and a two-layer consensus algorithm for changes in blockchain members. This scheme uses string consensus, data correctness verification and binary consensus to achieve the verification purpose. Since the verification process is random, it may generate many blockchain transactions and waste computing resources. [31] proposed a BMEC security architecture using multiple data stores and blockchain agents to achieve real-time contextual data integrity in IoT environments. [29] proposed a lightweight BMEC framework that supports cross-platform software systems that are easy to deploy, scalable, and cost-effective. This scheme mainly focuses on the data protection process, using authentication and encryption technology to protect the operation of sensitive data without considering the repair stage after data damage. [16] proposed a data integrity protection scheme based on BMEC for a specific industrial Internet of Things, mainly protecting data tasks with less sensor calculation. Therefore, based on the above research, BMEC technology can be used as a promising data integrity protection method.

In addition, some studies [6,14,19,32,37,39] proposed provable data possession (PDP) schemes for data detection proof and restoration. In PDP, every verification request is an interaction with an edge server at a different stage. This method of individual checks and interactions is unsuitable for edge computing networks with strict latency requirements. Similar proof of retrievability (PoR) schemes [2,4,10,38] have also been published. The paper [6] conducted a detailed study and proposed a binary sampling scheme based on homomorphic labelling called ICL-EDI. The ICL-EDI scheme simplifies complex calculation proofs. It uses binary sampling technology to significantly reduce the computational overhead of edge networks and allows terminal computing devices or service providers to perform one-to-many verification. Therefore, the ICL-EDI scheme only accelerates the data verification and certification process of edge computing.

The paper [17] proposed a distributed self-management scheme called CooperEDI. The CooperEDI scheme differs from the above schemes in that it does not require the edge server to always issue a proof of calculation, that is, the correct hash calculation result of the target calculation. The CooperEDI scheme first determines, through pairwise interactions, which edge servers hold the correct data, and these edge servers then become managers in the scheme. When the data of a non-manager is damaged or tampered with, a manager locates and repairs the damaged data. The CooperEDI scheme has two drawbacks. First, the mutual verification between edge servers causes extensive network backhaul. If terminal computing devices or servers request too many computing tasks at the same time, the edge computing network may become congested. Second, the CooperEDI scheme identifies managers by setting a half-threshold. For example, suppose that, among n edge servers, a data attack occurs while identical replicas are being determined, so that the number of real replicas falls below $(n + 1) / 2$ and the identical false data replicas outnumber the identical true data replicas. A misjudgment is then possible, which is devastating for edge computing networks.

In summary, some current research schemes still have the problem of wasting edge server computing resources, network backhaul, and false-negative data detection results [21]. To solve the above problems, this paper proposes a more novel and flexible BMEC data integrity verification scheme, which performs data integrity verification based on the data source.

3. System overview

Fig. 1.

System general architecture.

The scheme proposed in this paper applies to the general architecture of edge computing. As shown in Fig. 1, the general architecture of edge computing consists of a three-layer structure of cloud, edge and terminal. In the cloud layer, service providers can cooperate with cloud data centres and deploy complete service content to cloud data centres. To extend the reach of their services, cloud data centres generally offload some service content to the edge layer. The edge layer is mainly composed of edge servers. When an edge server receives a service task, it can cooperate with other edge servers or cloud data centres to complete it and return the calculation result to the computing device at the terminal layer. These computing devices must be within the edge server's communication range. In addition, the edge servers and the cloud data centre jointly build a blockchain network and share the same logical definition; all communication, whether between servers or between a server and the cloud data centre, is recorded on the blockchain.

Our proposal can be deployed in any mobile edge computing scenario and is applicable to any computing form. For example, Metaverse [22] is an emerging digital Internet application. Metaverse generates a mirror image of the real world based on digital twin technology. Users participate in the Metaverse through a shared edge infrastructure. Service providers usually cache services on nearby edge servers. Therefore, we need to ensure the integrity of the cache service to ensure that it can connect to the virtual world under any conditions.

Fig. 2.

System model including the first six processing stages.

The general framework used in this paper is shown in Fig. 2. To visualize the data delivery process, we allow service providers to serve only one edge server and the first edge server to deliver to other edge servers for caching. The service provider is represented by $SP$ , and the edge server is represented by ES-1, ES-2, …, and ES-S, where ES-S represents the first server to receive service caching. This paper divides data protection into six stages and uses the cooperative caching of four edge servers as an example.

(1) Data sharing stage: As shown in Figure-2(a), ES-S caches $SP$ tasks on the local server. During this process, ES-S needs to complete two tasks. First, ES-S executes the inner-layer algorithm and uploads the execution result to the blockchain. Second, ES-S shares service data with ES-1, ES-2, and ES-3, respectively. Then ES-1, ES-2, and ES-3 ensure the integrity of the service data by reading the blockchain transaction content of ES-S.

(2) Request repair stage: As shown in Figure-2(b), ES-1 also performs two processes after receiving service data. First, ES-1 runs the inner algorithm based on the source data recorded in the blockchain to locate the data corruption. Second, ES-1 queries the information contained in ES-S and sends a partial file update request.

(3) Repair data stage: As shown in Figure-2(c), when ES-S receives the request from ES-1, it does not send the corresponding data file immediately but instead re-runs the inner algorithm and queries the source data on the blockchain to check whether the local file is correct. When the file is verified to be correct, ES-S sends the correct parts of the file data to ES-1.

(4) Origin server processing stage: As shown in Figure-2(d), when the check of the data file fails, that is, the source data file in ES-S is damaged, ES-S sends this error information back to ES-1. Then, ES-1 queries, through the blockchain, the other edge servers that share the same data and sends partial file update requests only to those servers (ES-2, ES-3).

(5) Secondary request stage: As shown in Figure-2(e), when ES-2 and ES-3 receive the request, they run the inner algorithm to verify whether their own data is correct.

(6) Final repair stage: As shown in Figure-2(f), after the data is verified by ES-2 or ES-3, ES-2 and ES-3 send complete and correct data files and partially correct data files to ES-S and ES-1, respectively. If ES-2 and ES-3 fail to verify the data, then ES-2 or ES-3 sends this information back to ES-S and $SP$ , and $SP$ ends the task execution. We present the algorithm design in the next section.

4. Algorithm design

Our work developed a two-layer consensus algorithm consisting of an outer layer consensus algorithm and an inner layer consensus algorithm.

4.1. Overview of blockchain

4.1.1. The chain owner

In practical applications, the management right of the blockchain will be owned by the mobile edge computing manager, that is, the chain owner. Each mobile edge server acts as a node in the blockchain. In this research, we set the blockchain nodes as full nodes, that is, each node needs to store the entire blockchain.

4.1.2. Consensus

In this blockchain network, since every full node stores all blocks, all nodes need to participate in the consensus phase for transaction verification, ledger management, and coin rewards. For transactions generated by the non-cooperative work of nodes, we use a proof-of-work (PoW)-based consensus mechanism. However, due to the complexity of the mobile edge computing environment, some tasks require the cooperation of several mobile edge servers. To prevent malicious nodes from tampering with, forging, replacing, or deleting transaction records in the blockchain during collaboration, we propose a two-round consensus mechanism based on PoW. First, we define a set of cooperative nodes $[1, \dots, m, \dots, M]$ , the remaining nodes $[1, \dots, w, \dots, W]$ , and the node n, which together form the blockchain network.

The first round consensus process: Assume that node n initiates cooperation with all nodes in the cooperative nodes $[1, \dots, m, \dots, M]$ and generates transaction content. At this time, node n signs according to the data type and broadcasts it in the blockchain network. For the cooperative nodes $[1, \dots, m, \dots, M]$ , each node needs to verify this message, and after the verification is successful, the verification process will be signed and broadcast.

The second round consensus process: When all nodes in the blockchain network receive the broadcast feedback of successful verification of all the cooperative nodes, the cooperative nodes $[1, \dots, m, \dots, M]$ and node n directly package the above process into blocks and store them on the blockchain. The remaining nodes $[1, \dots, w, \dots, W]$ will randomly confirm each other in pairs. Once the confirmation is passed, the above transaction process will be packaged into blocks and stored in the blockchain. At this point, the blockchain is in the latest transaction state.
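The two rounds described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: SHA-256 digests stand in for the signatures (the signature algorithm is not specified here), the PoW step and all networking are omitted, and the names `digest`, `round_one`, `round_two`, `received` and `views` are hypothetical.

```python
import hashlib
import random


def digest(*parts: str) -> str:
    """SHA-256 over the joined parts; a stand-in for sign-and-verify (assumption)."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


def round_one(tx: str, received: dict) -> dict:
    """Round 1: each cooperative node checks the copy it received against the
    broadcast transaction and, if it matches, emits an acknowledgement."""
    expected = digest(tx)
    return {m: digest(m, expected)
            for m, copy in received.items()
            if digest(copy) == expected}


def round_two(tx: str, acks: dict, cooperative: list, views: dict, seed: int = 0) -> bool:
    """Round 2: package the transaction only if every cooperative node
    acknowledged it and the remaining nodes, confirmed in random pairs,
    hold the same view of the transaction."""
    if set(acks) != set(cooperative):
        return False
    expected = digest(tx)
    nodes = list(views)
    random.Random(seed).shuffle(nodes)
    pairs = zip(nodes[0::2], nodes[1::2])  # an unpaired last node is skipped in this sketch
    return all(views[a] == views[b] == expected for a, b in pairs)
```

In this simplified model, a single cooperative node that receives a tampered copy withholds its acknowledgement, so the transaction is not packaged; the same happens when a paired remaining node holds a diverging view.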

4.2. Outer layer algorithm

The outer layer algorithm serves as the foundation for our work. It provides two primary functions: building a consensus network and a data right confirmation mechanism.

Initialize the consensus network: First, the edge servers need to be mapped to the blockchain network. Each edge server corresponds to a consensus node, and each consensus node must store a complete blockchain to form a consensus network. Because the set of edge servers is dynamic, when a new edge server is added to the edge computing environment, the outer algorithm needs to assign an on-chain address and secret key to the new server and initialize its tasks, computing resources and other information. Second, a transaction is generated in the corresponding blockchain network whenever an edge server performs a collaborative or computing task. The edge server must also sign the transaction with the corresponding signature algorithm and use the two-round consensus mechanism based on PoW to broadcast the transaction information. In this paper, signature algorithms are treated as a basic building block and are not the subject of research.

Establish a data right confirmation mechanism: Our approach creates two service record tables in each edge server. The first is the initial storage table $(IST)$ , which stores only the service data first uploaded to the edge server. The second is the global storage table $(GST)$ , which stores all service data not uploaded to the edge server for the first time; for example, the original data is stored in the $GST$ after legal changes. Using data replica X as an example, after it is stored in the $IST$ , the relevant edge server must sign it with a specific signature algorithm and publish it to the blockchain network. If data replica X is stored in the $GST$ , the relevant edge server can sign it with its own key and then publish it to the blockchain network. When the blockchain performs a query transaction, the signature can be verified to determine whether the transaction content relates to the $IST$ or the $GST$ . The specific process is shown in Fig. 3. This facilitates the subsequent verification of data X based on its data source and allows data access to be monitored thoroughly. The pseudo-code for this algorithm is presented in Algorithm 1.
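As an illustration of the two-table idea, the following sketch records first uploads in the $IST$ and later versions in the $GST$, and lets a verifier tell the two apart from the signature alone. It is an assumption-laden toy, not Algorithm 1: HMAC-SHA256 with a per-table tag stands in for the two unspecified signature algorithms, and `EdgeServer`, `store`, `table_of`, `IST_TAG` and `GST_TAG` are hypothetical names.

```python
import hashlib
import hmac
from dataclasses import dataclass, field
from typing import Optional

IST_TAG, GST_TAG = b"IST", b"GST"  # hypothetical tags separating the two signature types


def _sign(key: bytes, tag: bytes, data_id: str, data_digest: str) -> str:
    """Stand-in signature: HMAC over the table tag, data id and data digest."""
    msg = tag + b"|" + data_id.encode() + b"|" + data_digest.encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


@dataclass
class EdgeServer:
    server_id: str
    key: bytes
    ist: dict = field(default_factory=dict)  # data id -> digest of the first upload
    gst: dict = field(default_factory=dict)  # data id -> digest of the latest later version

    def store(self, data_id: str, data: bytes) -> dict:
        """Record the data in IST on first upload, otherwise in GST, and
        return the signed transaction that would be published on-chain."""
        data_digest = hashlib.sha256(data).hexdigest()
        if data_id not in self.ist:
            self.ist[data_id] = data_digest
            tag = IST_TAG
        else:
            self.gst[data_id] = data_digest
            tag = GST_TAG
        return {"server": self.server_id, "data_id": data_id, "digest": data_digest,
                "sig": _sign(self.key, tag, data_id, data_digest)}


def table_of(tx: dict, key: bytes) -> Optional[str]:
    """Return "IST" or "GST" depending on which signature verifies, else None."""
    for tag in (IST_TAG, GST_TAG):
        if hmac.compare_digest(tx["sig"], _sign(key, tag, tx["data_id"], tx["digest"])):
            return tag.decode()
    return None
```

Because the $IST$ entry is never overwritten in this sketch, the digest of the original upload remains available as the reference for later source-based verification.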

Fig. 3.

Algorithmic process for establishing a data right confirmation mechanism.

Algorithm 1

Outer layer algorithm

In summary, Algorithm 1 extends the above consensus and is mainly used to record the authenticity and integrity of blocks during broadcasting. It also checks, through the two storage tables, whether the data permission type has changed. The algorithm is implemented through smart contracts; in the pseudocode, its inputs are the computing task and the edge server ID. The algorithm consists mainly of initialization and data permission checking.

4.3. Inner layer algorithm

The inner layer algorithm is based on the Merkle algorithm and is used to execute two primary tasks: data pre-processing and identifying and fixing corrupted data. In this paper, we adopt the Merkle algorithm to preprocess the data. The Merkle algorithm can process information in segments by using a specific data structure, that is, output a hash result by calculating a multi-part hash algorithm.

Algorithm 2

Inner layer algorithm

We divide the Merkle algorithm used in Algorithm 2 into three layers, as shown in Fig. 4. First, we preprocess each data replica: we divide it into data blocks, calculate the hash values of the bottom layer, middle layer and vertex layer through the Merkle algorithm, and store these values in the block. When a data replica is traded, the hash value of its vertex layer is checked first. If the vertex-layer hash value has changed, the data replica has been tampered with. The check then continues with the hash values of the middle layer and, following the changed middle-layer hash value, traces down to the hash values of the bottom layer. In this way, we can determine which data block is damaged.
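The top-down check described above can be sketched as follows. This is a minimal illustration rather than Algorithm 2 itself: it assumes SHA-256 as the hash, a fixed number of bottom-layer hashes per middle-layer node (the hypothetical `group` parameter), and a replica whose length is unchanged by the corruption; `build_tree` and `locate_corruption` are hypothetical names.

```python
import hashlib


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_tree(replica: bytes, block_size: int, group: int = 4) -> dict:
    """Compute the three layers: one hash per data block (bottom), one hash per
    group of bottom hashes (middle), and one hash over the middle layer (vertex)."""
    blocks = [replica[i:i + block_size] for i in range(0, len(replica), block_size)]
    bottom = [h(b) for b in blocks]
    middle = [h("".join(bottom[i:i + group]).encode())
              for i in range(0, len(bottom), group)]
    vertex = h("".join(middle).encode())
    return {"bottom": bottom, "middle": middle, "vertex": vertex, "group": group}


def locate_corruption(reference: dict, replica: bytes, block_size: int) -> list:
    """Return the indices of damaged data blocks, comparing top-down against the
    reference tree (for example, the one recorded on the blockchain)."""
    current = build_tree(replica, block_size, reference["group"])
    if current["vertex"] == reference["vertex"]:
        return []  # vertex hash unchanged: the replica is intact
    damaged = []
    g = reference["group"]
    for gi, (ref_mid, cur_mid) in enumerate(zip(reference["middle"], current["middle"])):
        if ref_mid == cur_mid:
            continue  # this whole group of blocks is intact
        # Only descend into the bottom layer under a changed middle-layer hash.
        for bi in range(gi * g, min((gi + 1) * g, len(reference["bottom"]))):
            if reference["bottom"][bi] != current["bottom"][bi]:
                damaged.append(bi)
    return damaged
```

With the damaged block indices in hand, only those blocks need to be requested from a server holding a verified replica, which is what makes partial repair possible.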

Fig. 4.

Merkle algorithmic process for data preprocessing and data repair.

In summary, this structure empowers the subsequent verification process that pinpoints the exact location of the corrupted copy of the data without having to replace the entire file, significantly improving data repair efficiency. Additionally, the inner algorithm establishes a time-polling and event-triggering framework that enables timely or on-demand verification of all edge servers. The algorithm’s pseudo-code is shown in Algorithm 2.

5. Experimental evaluation

5.1. Experimental overview

In Section 2, we elaborated on the schemes of [6] and [17]. The scheme proposed in [6] is ICL-EDI, which is abbreviated as ICL; the scheme proposed in [17] is CooperEDI, which is abbreviated as CE. In addition, [19] proposed a scheme that uses the Boneh–Lynn–Shacham (BLS) algorithm to implement PDP to ensure data integrity and is abbreviated as P-BLS. Our work compares these three schemes using random sampling detection.

5.2. Experimental setup

To determine the experimental settings, we referred to those in [17]. We set seven parameters, which are:

Sampling scale $(ss)$ : the total number of data blocks sampled from each edge data replica for inspection.

Data replica scale $(drs)$ : the total number of edge data replicas to be inspected.

Data size $(ds)$ : the size of each data replica (using MB as the unit).

Data block size $(dbs)$ : the size of each data block (using KB as the unit).

Corruption ratio $(cr)$ : the percentage of corrupted edge data replicas.

Corruption severity $(cs)$ : the percentage of corrupted data blocks in a corrupted edge data replica.

Time polling interval $(tpi)$ : the interval at which the data replicas are rechecked.

Among these settings, $tpi$ only affects the DRC scheme. In addition, $tpi$ is adjusted according to the current load of the network, and there is no fixed optimal value; under different network settings, $tpi$ is different. In this experiment, based on preliminary testing, $tpi$ is generally set between 1 s and 5 s. Furthermore, our work adopts two evaluation metrics:

Corruption detection rate: this metric is the ratio of the number of detected corrupted edge data replicas over the total number of corrupted edge data replicas. The higher the value is, the better the performance.

Time consumption: this metric is the computation time and communication time taken to complete the data assurance process. The lower the value is, the better the performance.
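The two metrics above can be computed as in the following sketch. The function names are hypothetical, and returning 1.0 when no replica is corrupted is an assumption made here to avoid a division by zero; it is not something stated in the experimental setup.

```python
import time


def corruption_detection_rate(detected: set, corrupted: set) -> float:
    """Detected corrupted replicas divided by all corrupted replicas."""
    if not corrupted:
        return 1.0  # assumption: nothing to detect counts as a perfect score
    return len(detected & corrupted) / len(corrupted)


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Replicas flagged by a scheme that are not actually corrupted are ignored by the intersection, so false positives do not inflate the rate.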

Table 1
Specific experimental settings

Experimental content	Parameter range setting
Sampling scale ( $ss$ )	$50 \sim 100$
Data replica scale ( $drs$ )	$20 \sim 100$
Data size ( $ds$ )	$128 KB \sim 512 KB$
Data block size ( $dbs$ )	$512 B \sim 2048 B$
Corruption ratio ( $cr$ )	$0 \sim 0.2$
Corruption severity ( $cs$ )	$0.01 \sim 0.05$

The simulation experiments were performed on four i9-10900KF computers, each with 64 GB of memory. Five virtual machines were created per PC and randomly mapped to a geographic region to simulate the networked edge servers that constitute the edge caching system in that region, including a cloud server. Each virtual machine has 2 GB RAM and runs the Ubuntu 16.04 operating system. Based on ten application tests on an Amazon cloud server, the network delay between different edge servers is set at 5 ms to 15 ms, and the network delay between the cloud server and the edge servers is set at 120 ms. The experimental scheme is mainly implemented in Python with the Websockets library. Because the scheme does not depend on the data content, the data used in the experiment is randomly generated, and the data size is set as shown in Table 1.

Furthermore, the blockchain was developed and tested using the Ethereum platform. Our work simulates file corruption by randomly modifying $cs$ and $cr$ . The experiment was repeated 10 times to obtain each experimental result, and the average value was calculated.
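One way to simulate corruption governed by $cr$ and $cs$ is sketched below. It is an assumption about the procedure, not the authors' test harness: a fraction $cr$ of the replicas is selected, and in each selected replica one byte is flipped in a fraction $cs$ of its blocks (at least one block). The function name and the `seed` parameter are hypothetical.

```python
import math
import random


def corrupt_replicas(replicas, block_size, cr, cs, seed=0):
    """Return (corrupted copies, {replica index: sorted damaged block indices})."""
    rng = random.Random(seed)
    out = [bytearray(r) for r in replicas]
    victims = rng.sample(range(len(out)), round(cr * len(out)))
    damaged = {}
    for v in victims:
        n_blocks = math.ceil(len(out[v]) / block_size)
        k = max(1, round(cs * n_blocks))  # assumption: corrupt at least one block
        blocks = sorted(rng.sample(range(n_blocks), k))
        for b in blocks:
            start = b * block_size
            end = min(start + block_size, len(out[v]))
            pos = rng.randrange(start, end)
            out[v][pos] ^= 0xFF  # flip every bit of one byte in the block
        damaged[v] = blocks
    return [bytes(r) for r in out], damaged
```

The returned mapping serves as ground truth when computing the corruption detection rate of a scheme under test.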

5.3. Experimental results

During the experiment, P-BLS, ICL, CE and DRC were analyzed and compared in terms of their data damage detection and computing time consumption. In the initial experimental settings, $ss$ , $drs$ , $ds$ , $dbs$ , $cr$ , and $cs$ were fixed at intermediate values, i.e. ${ss}^{'} = 75$ , ${drs}^{'} = 75$ , ${ds}^{'} = 256$ , ${dbs}^{'} = 1024$ , ${cr}^{'} = 0.1$ , and ${cs}^{'} = 0.025$ . The control variable method was used for testing.

Fig. 5.

Comparison between the corruption detection rate and time consumption of P-BLS, ICL, CE and DRC.

When $ss$ was used as a variable, six sets of data values, namely $ss = 50, 60, 70, 80, 90, 100$ , were tested. Smaller $ss$ values indicate that fewer data blocks are checked. Figure 5-(a) shows that CE and DRC have high test accuracy and can achieve a 100% damage detection rate. However, P-BLS and ICL are not stable, and their damage detection rates increase with higher $ss$ values.

When $drs$ was used as a variable, six sets of data values, namely $drs = 50, 60, 70, 80, 90, 100$ , were tested. The computing time consumption of the four schemes is shown in Fig. 5-(b). In comparison, DRC has better performance, which is partly due to DRC's private chain. In the experiment, the private chain was built and deployed using the Ethereum platform and tested with professional Caliper testing tools. For ease of understanding, we use the $Add$ , $Get$ and $Complete$ functions to express the blockchain operations. $Add$ indicates that tasks are added to the blockchain through smart contracts, $Get$ represents transactions for reading tasks, and $Complete$ represents feedback after the completion of tasks. In DRC, we randomly created 5000 transaction contents and tested them several times to obtain an average. Figure 6-(b) records the average transmission rate under different operations. It can be seen from the figure that the average transmission rate decreases gradually as the blockchain transaction content increases. Because $Get$ does not involve write operations on the blockchain, it is affected relatively little. Figure 7 records the average total time spent on each operation. We gradually increased the transaction content, and the results show that the time spent on the $Add$ and $Complete$ operations increases roughly linearly. Therefore, the impact of blockchain performance is small compared with the overall detection process.

Fig. 6.

Effect of this correlation function and variable on time consumption in DRC.

Fig. 7.

Average total time spent by these $Add$ , $Get$ and $Complete$ functions in the blockchain.

When $ds$ and $dbs$ were used as variables, three values of each, namely $ds = 128, 256, 512$ and $dbs = 512, 1024, 2048$ , were tested. Figures 5-(c) and 5-(d) show the effect of $ds$ and $dbs$ , respectively, on the computing time consumption during the data detection process. Increases in the data replica size and the data block size both affect the algorithm's computation time, which grows non-linearly.

Furthermore, our proposed method was separately tested and analyzed. When $cr$ was used as a variable, four sets of data $cr = 0.05, 0.1, 0.15, 0.2$ were tested. Figure 6-(a) shows that when $cr$ was changed from 0.05 to 0.2, there was not much difference in the time taken to detect a damaged replica of the data under these four conditions. It is worth noting that in our scheme, time polling and event trigger mechanisms were adopted during the detection process. Once the conditions are met, DRC checks the replica of the data, which causes all files to be uniformly verified no matter how much the $cr$ changes. Varying $cs$ produced the same result.

The above four schemes implement the verification process through the edge server, which inevitably generates additional communication overhead. For this reason, this paper tests and compares the network overhead of the four schemes while varying the data replica scale. As shown in Fig. 8, the data replica scale significantly impacts the network overhead of the P-BLS, ICL and CE schemes. As the data replica scale increases, their network overhead also increases. The reason is that P-BLS needs to transmit more data integrity certificates from the edge server to the cloud server, resulting in increased overhead, while ICL and CE involve more edge servers in the inspection, resulting in more network traffic. DRC, in contrast, traces a damaged replica to a specific edge server for repair according to the usage information of the data, so the impact of the data replica scale is negligible. Therefore, the above results show that DRC can verify and repair many edge data replicas without significant communication overhead.

Fig. 8. Average communication cost of the four schemes.

5.4. Theoretical analysis

This section presents a theoretical analysis that compares DRC with the scheme in [17], referred to as CE, in which an administrator confirms the correct data replica by quantity. In CE, the administrator holds specific permissions, and only administrators can confirm correct data replicas. As soon as the administrator confirms the correct data replica, the replicas on all edge servers are replaced with it. However, two conditions must be met for the confirmation to be correct:

the number of identical false data replicas < the number of identical true data replicas;

the number of identical data replicas $> (n + 1) / 2$, where $n$ is the total number of data replicas.

Consequently, if the number of identical false data replicas exceeds the number of identical true data replicas, the true data will be overwritten by fake data. Worse, all data replicas in the edge environment will then be replaced with fake data replicas. In addition, the method by which administrators confirm the correctness of data replicas is difficult to control: if an administrator acts maliciously or misjudges correctness, the load of the edge computing network will be seriously affected.
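The failure mode described above can be sketched as follows. This is a simplified model, not the CE protocol from [17]: replicas are represented as strings, and the hypothetical `confirm_by_quantity` function simply accepts the most common replica when its count exceeds the $(n + 1) / 2$ threshold.

```python
from collections import Counter


def confirm_by_quantity(replicas):
    """Return the replica accepted by a quantity-based rule, or None.

    Simplified assumption: the most common replica is accepted when its
    count is greater than (n + 1) / 2, where n is the number of replicas.
    """
    n = len(replicas)
    value, count = Counter(replicas).most_common(1)[0]
    return value if count > (n + 1) / 2 else None


# Seven replicas, two of them tampered: the true data is confirmed.
mostly_true = ["true"] * 5 + ["fake"] * 2
# Seven replicas, five tampered identically: the fake data is confirmed
# and would then overwrite the remaining true replicas.
mostly_fake = ["true"] * 2 + ["fake"] * 5

accepted_ok = confirm_by_quantity(mostly_true)
accepted_bad = confirm_by_quantity(mostly_fake)
```

The rule itself cannot distinguish the two cases; it only counts identical replicas, which is why a sufficiently large set of identical fake replicas is confirmed as correct.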

To quantify this risk, assume that there are $n$ data replicas in total, each assigned to a different edge server. Let $P(x)$ be the probability of successfully tampering with the data replica on a server, and let $P(z)$ be the probability that a replica is tampered into data replica $Y$. By the conditional probability formula, the probability that the data replica on any given edge server is tampered into $Y$ is

$$P' = P(z \mid x) = \frac{P(zx)}{P(x)}. \tag{1}$$

The probability that $(n + 1) / 2$ of the $n$ data replicas are tampered into $Y$ is

$$P'' = C_{n}^{\frac{n + 1}{2}} \cdot (P')^{\frac{n + 1}{2}} (1 - P')^{n - \frac{n + 1}{2}}. \tag{2}$$

By analogy, the total probability that more than half of the data replicas are tampered into $Y$ is

$$P(SUM) = \sum_{i = n}^{2n - 1} C_{n}^{\frac{i + 1}{2}} \cdot (P')^{\frac{i + 1}{2}} (1 - P')^{n - \frac{i + 1}{2}}. \tag{3}$$

Substituting Formula (1) gives

$$P(SUM) = \sum_{i = n}^{2n - 1} C_{n}^{\frac{i + 1}{2}} \left[\frac{P(zx)}{P(x)}\right]^{\frac{i + 1}{2}} \cdot \left[1 - \frac{P(zx)}{P(x)}\right]^{n - \frac{i + 1}{2}}. \tag{4}$$

Formula (4) therefore gives the vulnerability index of CE. Once this vulnerability is successfully attacked, the edge servers will spend considerable computing and storage resources serving the wrong data replicas. DRC, by contrast, is based on data right confirmation: although it applies the idea of distributed management, it does not rely on identity management to confirm the correctness of data replicas. Similarly, we assume that in DRC the probability of successfully tampering with the data replica on each edge server is $P(k)$.

When the data replicas on an edge server change, the edge server checks the data. Blockchain verification is mandatory in every case, and the data replicas on the edge server are invalid if the verification fails. Under DRC, if the data replicas on the other edge servers are also invalidated, $SP$ re-provides the correct data replicas. The probability of this case is

$$P''' = P(k)^{n}. \tag{5}$$

Whether or not this case occurs, the key point is that an edge server in DRC never continues to compute on or store incorrect data replicas, which significantly saves the computing and communication resources of edge networks.
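Formulas (3) and (5) can be compared numerically with a short sketch. Several assumptions are made here that the text does not state: $n$ is odd, tampering events on different edge servers are independent, the sum in Formula (3) is read as a binomial tail over $j = (n + 1)/2, \ldots, n$ tampered replicas, and the same hypothetical value is used for $P'$ and $P(k)$. The function names and numbers are illustrative only.

```python
from math import comb


def ce_vulnerability(n: int, p_prime: float) -> float:
    """Probability that at least (n + 1) / 2 of n replicas are tampered into Y.

    Reads Formula (3) as a binomial tail, assuming n is odd and that
    tampering events on different edge servers are independent.
    """
    start = (n + 1) // 2
    return sum(
        comb(n, j) * p_prime**j * (1 - p_prime) ** (n - j)
        for j in range(start, n + 1)
    )


def drc_all_invalid(n: int, p_k: float) -> float:
    """Probability that all n replicas are tampered, as in Formula (5)."""
    return p_k**n


n = 5      # hypothetical number of replicas
p = 0.1    # hypothetical per-server tampering probability
ce = ce_vulnerability(n, p)
drc = drc_all_invalid(n, p)
```

With these illustrative values, `ce` is about 0.00856 and `drc` is about 0.00001, so under the stated assumptions the case in which DRC must fall back to $SP$ is far less likely than a successful majority attack on CE.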

In the P-BLS scheme, when a damaged edge data replica is detected, the service provider of the application usually needs to transfer a complete data replica from the cloud server to the edge server to replace it. This involves the checking calculation on the edge server, the comparison with the cloud server, the feedback and confirmation from the cloud server to the edge server, and the transmission of a complete data replica by the cloud server, all of which cause substantial backhaul communication overhead. The ICL and CE solutions optimize P-BLS by selecting management edge servers to restore data replicas. Although communication with the cloud server is avoided, selecting the management edge server requires some edge servers to participate in the confirmation, so some backhaul communication overhead remains. Moreover, if the data replica of the management edge server is damaged, a new management edge server must be selected, which further increases communication overhead. The DRC scheme mainly obtains information about the data replica by interacting with the blockchain, and each edge server stores the relevant blockchain information. When the data replica on an edge server is damaged, there is no need to search for an edge server that can repair it. Instead, the blockchain is used to find the relevant records, which lead directly to an edge server holding a complete copy of the data, and this better avoids network backhaul.
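The DRC repair path described above can be sketched as a ledger lookup followed by a fetch from a peer edge server. The ledger layout, the use of SHA-256, and all names below are hypothetical stand-ins chosen for illustration; the paper does not specify these details.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a replica (assumed check primitive)."""
    return hashlib.sha256(data).hexdigest()


def repair_from_peer(server_id, data_id, ledger, storage):
    """Repair a damaged replica from a peer edge server, without the cloud.

    `ledger` stands in for the locally stored blockchain information: it
    maps a data id to its reference digest and the servers holding it.
    `storage` maps (server id, data id) to the replica bytes.
    Returns the id of the peer used, or None if no intact peer was found.
    """
    record = ledger[data_id]
    for peer in record["holders"]:
        if peer == server_id:
            continue
        candidate = storage.get((peer, data_id))
        # Only accept a peer copy whose digest matches the ledger record.
        if candidate is not None and digest(candidate) == record["digest"]:
            storage[(server_id, data_id)] = candidate
            return peer
    return None


good = b"video segment 42"
ledger = {"d1": {"digest": digest(good), "holders": ["es1", "es2", "es3"]}}
storage = {
    ("es1", "d1"): b"corrupted",   # damaged replica to be repaired
    ("es2", "d1"): b"also bad",    # a damaged peer is skipped
    ("es3", "d1"): good,
}

used_peer = repair_from_peer("es1", "d1", ledger, storage)
```

A `None` result corresponds to the case in which no intact peer replica exists, where the paper has $SP$ re-provide the correct data.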

Two further points should be noted. First, DRC does not consider the security of blockchain technology itself, which is outside the focus of this paper. Second, DRC does not adopt the public chain mode, in order to improve the accuracy of identifying damaged data copies and the efficiency of repairing them. The basic idea of DRC is to solve the problem of distributed management through the authentic right of data. This strategy not only avoids some of the problems caused by centralization but also addresses the credibility of the data itself, which eliminates false-negative detection results for data replicas at the source.

6. Conclusion

This paper proposed DRC, a lightweight integrity protection scheme based on the authentic right of data. The scheme realizes a two-layer consensus algorithm and establishes a relatively complete mechanism for the authentic right of data. Based on this right, a time polling mechanism and an event trigger mechanism are used for rapid detection, so that damaged data copies can be checked, located, and repaired in time. The experimental results show that DRC not only effectively ensures data integrity in mobile edge computing but also addresses the problems of wasted computing resources, network backhaul, and false-negative detection results for data copies, while exhibiting good performance.

Footnotes

Proof of computation in the Introduction

Assume that each server has $n$ computing tasks, of which $m$ are cooperative computing tasks. If the error rate of a single task is $p$ and all tasks are computed within time $t$, then within time $t$ the error rate of each server's cooperative tasks is $1 - (1 - p)^{m}$, and the error rate of its non-cooperative tasks is $1 - (1 - p)^{n - m}$. In addition, if the time each server spends on recomputation after a failed computing task is $k$, the total time required is

$$\left(2 - \left[(1 - p)^{m} + (1 - p)^{n - m}\right]\right) \cdot k \tag{6}$$

Similarly, if all $n$ computing tasks are non-cooperative, the overall error probability is $1 - (1 - p)^{n}$, and the total time required is

$$\left[1 - (1 - p)^{n}\right] \cdot k \tag{7}$$

The calculation below proves that Formula (6) is greater than Formula (7), that is, the influence of cooperative tasks is greater than that of non-cooperative tasks. When $p$ is high or the number of tasks increases, the cost of this damage will have a significant impact on the edge computing network.

Calculation proof process: To compare $2 - [(1 - p)^{m} + (1 - p)^{n - m}]$ and $1 - (1 - p)^{n}$, we set $x = 1 - p$, where $0 < p < 1$. Then $0 < 1 - p < 1$, i.e., $0 < x < 1$. The two expressions can then be written as $2 - x^{m} - x^{n - m}$ and $1 - x^{n}$, where $n > m > 0$.

Let

$$Y = \frac{2 - x^{m} - x^{n - m}}{1 - x^{n}}. \tag{8}$$

Because $0 < x < 1$ and $n > m > 0$, the denominator $1 - x^{n}$ is positive, and the difference between the numerator and the denominator factors as

$$(2 - x^{m} - x^{n - m}) - (1 - x^{n}) = (1 - x^{m})(1 - x^{n - m}) > 0. \tag{9}$$

The numerator therefore exceeds the denominator, that is, $Y > 1$. Therefore, $2 - (1 - p)^{m} - (1 - p)^{n - m} > 1 - (1 - p)^{n}$.
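The inequality between the bracketed factors of Formulas (6) and (7) can also be spot-checked numerically. The sketch below evaluates both factors over a small, arbitrarily chosen grid of error rates and task counts with $n > m > 0$; the function names and grid values are illustrative assumptions, and a numerical check of this kind supplements the proof rather than replacing it.

```python
def cooperative_factor(p: float, n: int, m: int) -> float:
    """Bracketed factor of Formula (6): 2 - [(1 - p)^m + (1 - p)^(n - m)]."""
    return 2 - ((1 - p) ** m + (1 - p) ** (n - m))


def non_cooperative_factor(p: float, n: int) -> float:
    """Bracketed factor of Formula (7): 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n


# Hypothetical grid of error rates and task counts with n > m > 0.
cases = [
    (p, n, m)
    for p in (0.01, 0.1, 0.5, 0.9)
    for n in range(2, 12)
    for m in range(1, n)
]
holds = all(
    cooperative_factor(p, n, m) > non_cooperative_factor(p, n)
    for p, n, m in cases
)
```

For example, with $p = 0.1$, $n = 4$ and $m = 2$, the cooperative factor is 0.38 and the non-cooperative factor is 0.3439.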

References

1. Ahmed and M.H. Rehmani, Mobile edge computing: Opportunities, solutions, and challenges, 2017.

2. Ali Harchaoui, Younes, El Hibaoui and Bendahmane, Survey and a new taxonomy of proofs of retrievability on the cloud storage, in: Proceedings of the 4th International Conference on Networking, Information Systems & Security, 2021, pp. 1–8.

3. Alladi, Chamola, Sahu and Guizani, Applications of blockchain in unmanned aerial vehicles: A review, Vehicular Communications 23 (2020), 100249. doi:10.1016/j.vehcom.2020.100249.

4. Anthoine, J.-G. Dumas, de Jonghe, Maignan, Pernet, Hanling and D.S. Roche, Dynamic proofs of retrievability with low server storage, in: 30th USENIX Security Symposium (USENIX Security 21), 2021, pp. 537–554.

5. Ch, Srivastava, T.R. Gadekallu, P.K.R. Maddikunta and Bhattacharya, Security and privacy of uav data using blockchain technology, Journal of Information Security and Applications 55 (2020), 102670. doi:10.1016/j.jisa.2020.102670.

6. Cui, He, Li, Xia, Chen, Jin, Xiang and Yang, Efficient verification of edge data integrity in edge computing environment, IEEE Transactions on Services Computing (2021).

7. Deng, Zhao, Fang, Yin, Dustdar and A.Y. Zomaya, Edge intelligence: The confluence of edge computing and artificial intelligence, IEEE Internet of Things Journal 7(8) (2020), 7457–7469. doi:10.1109/JIOT.2020.2984887.

8. Ding, Lv, Pang, Hu, Wang, Yang and Li, Privacy-preserving task allocation for edge computing-based mobile crowdsensing, Computers & Electrical Engineering 97 (2022), 107528. doi:10.1016/j.compeleceng.2021.107528.

9. Fan, Wu and H.-Y. Paik, Dr-bft: A consensus algorithm for blockchain-based multi-layer data integrity framework in dynamic edge computing system, Future Generation Computer Systems 124 (2021), 33–48. doi:10.1016/j.future.2021.04.020.

10. Fu, Li, Yu, Yu and Zhang, Dipor: An ida-based dynamic proof of retrievability scheme for cloud storage systems, Journal of Network and Computer Applications 104 (2018), 97–106. doi:10.1016/j.jnca.2017.12.007.

11. S.S. Gill, A manifesto for modern fog and edge computing: Vision, new paradigms, opportunities, and future directions, in: Operationalizing Multi-Cloud Environments, Springer, 2022, pp. 237–253. doi:10.1007/978-3-030-74402-1_13.

12. Hao, Chen, Hu, M.S. Hossain and Ghoneim, Energy efficient task caching and offloading for mobile edge computing, IEEE Access 6 (2018), 11365–11373. doi:10.1109/ACCESS.2018.2805798.

13. Hassan, K.-L.A. Yau and Wu, Edge computing in 5g: A review, IEEE Access 7 (2019), 127276–127289. doi:10.1109/ACCESS.2019.2938534.

14. Ji, Shao, Chang and Bian, Privacy-preserving certificateless provable data possession scheme for big data storage on cloud, revisited, Applied Mathematics and Computation 386 (2020), 125478. doi:10.1016/j.amc.2020.125478.

15. Kai, Zhou, Yi and Huang, Collaborative cloud-edge-end task offloading in mobile-edge computing networks with limited communication capability, IEEE Transactions on Cognitive Communications and Networking 7(2) (2020), 624–634. doi:10.1109/TCCN.2020.3018159.

16. Kumar, Harjula, Ejaz, Manzoor, Porambage, Ahmad, Liyanage, Braeken and Ylianttila, Blockedge: Blockchain-edge framework for industrial iot networks, IEEE Access 8 (2020), 154166–154185. doi:10.1109/ACCESS.2020.3017891.

17. Li, He, Chen, Dai, Jin, Xiang and Yang, Cooperative assurance of cache data integrity for mobile edge computing, IEEE Transactions on Information Forensics and Security 16 (2021), 4648–4662. doi:10.1109/TIFS.2021.3111747.

18. Li, Ren, Wu, Ji, Yu, Cao and Wang, Blockchain-based mobile edge computing system, Information Sciences 561 (2021), 70–80. doi:10.1016/j.ins.2021.01.050.

19. Li, Yan and Zhang, Efficient identity-based provable multi-copy data possession in multi-cloud storage, IEEE Transactions on Cloud Computing (2019).

20. Li, Gao, Zhao and Shen, Deep reinforcement learning for collaborative edge computing in vehicular networks, IEEE Transactions on Cognitive Communications and Networking 6(4) (2020), 1122–1135. doi:10.1109/TCCN.2020.3003036.

21. Li, Xie, Sun et al., A survey of mobile edge computing, Telecommunications Science 34(1) (2018), 87.

22. W.Y.B. Lim, Xiong, Niyato, Cao, Miao, Sun and Yang, Realizing the metaverse with edge intelligence: A match made in heaven, 2022, arXiv preprint arXiv:2201.01634.

23. Lv and Qiao, Optimization of collaborative resource allocation for mobile edge computing, Computer Communications 161 (2020), 19–27. doi:10.1016/j.comcom.2020.07.022.

24. Mao, You, Zhang, Huang and K.B. Letaief, A survey on mobile edge computing: The communication perspective, IEEE communications surveys & tutorials 19(4) (2017), 2322–2358. doi:10.1109/COMST.2017.2745201.

25. Mehta, Gupta and Tanwar, Blockchain envisioned uav networks: Challenges, solutions, and comparisons, Computer Communications 151 (2020), 518–538. doi:10.1016/j.comcom.2020.01.023.

26. Ni, Zhang, Yu, Lin and X.S. Shen, Providing task allocation and secure deduplication for mobile crowdsensing via fog computing, IEEE Transactions on Dependable and Secure Computing 17(3) (2018), 581–594. doi:10.1109/TDSC.2018.2791432.

27. Safavat, N.N. Sapavath and D.B. Rawat, Recent advances in mobile edge computing and content caching, Digital Communications and Networks 6(2) (2020), 189–194. doi:10.1016/j.dcan.2019.08.004.

28. T.X. Tran, Hajisami, Pandey and Pompili, Collaborative mobile edge computing in 5g networks: New paradigms, scenarios, and challenges, IEEE Communications Magazine 55(4) (2017), 54–61. doi:10.1109/MCOM.2017.1600863.

29. Tuli, Mahmud, Tuli and Buyya, Fogbus: A blockchain-based lightweight framework for edge and fog computing, Journal of Systems and Software 154 (2019), 22–36. doi:10.1016/j.jss.2019.04.050.

30. Wang, Wu, K.-K.R. Choo and He, Blockchain-based anonymous authentication with key management for smart grid edge computing infrastructure, IEEE Transactions on Industrial Informatics 16(3) (2019), 1984–1992. doi:10.1109/TII.2019.2936278.

31. Xu, Hang, Jin and Kim, Distributed secure edge computing architecture based on blockchain for real-time data integrity in iot environments, in: Actuators, Vol. 10, MDPI, 2021, p. 197.

32. Yu, M.H. Au, Ateniese, Huang, Susilo, Dai and Min, Identity-based remote data integrity checking with perfect data privacy preserving for cloud storage, IEEE Transactions on Information Forensics and Security 12(4) (2016), 767–778. doi:10.1109/TIFS.2016.2615853.

33. Zhang, Leng, He, Maharjan and Zhang, Cooperative content caching in 5g networks with mobile edge computing, IEEE Wireless Communications 25(3) (2018), 80–87. doi:10.1109/MWC.2018.1700303.

34. Zhang, Peng, Wang, Jin, Su and Chen, Secure and efficient data storage and sharing scheme for blockchain-based mobile-edge computing, Transactions on Emerging Telecommunications Technologies 32(10) (2021), e4315. doi:10.1002/ett.4315.

35. Zhang and J.-H. Lee, A group signature and authentication scheme for blockchain-based mobile-edge computing, IEEE Internet of Things Journal 7(5) (2019), 4557–4565. doi:10.1109/JIOT.2019.2960027.

36. Zhang and Wang, Hybrid malware detection approach with feedback-directed machine learning, Information Sciences 63(139103) (2020), 1–139103.

37. Zhao, Xu and Chen, A security-enhanced identity-based batch provable data possession scheme for big data storage, KSII Transactions on Internet and Information Systems (TIIS) 12(9) (2018), 4576–4598.

38. Zhao, Ding, Wang, Wang and Liu, A privacy-preserving tpa-aided remote data integrity auditing scheme in clouds, in: International Conference of Pioneering Computer Scientists, Engineers and Educators, Springer, 2019, pp. 334–345.

39. Zhou, A certificate-based provable data possession scheme in the standard model, Security and Communication Networks 2021 (2021).