Abstract
As an important part of digital building, building internet of things (BIoT) plays a positive role in promoting the construction of smart cities. Existing schemes utilize blockchain to achieve trusted data storage in BIoT. However, the full-copy storage mechanism of blockchain and the management requirements of massive data have brought computing and storage challenges to edge nodes with limited resources. Therefore, a data management scheme for BIoT based on blockchain sharding is proposed. The scheme proposes a hybrid storage mechanism, which uses inter-planetary file system (IPFS) to ensure the integrity and availability of data outside the chain, and reduces the storage overhead of edge nodes. Based on the hybrid storage mechanism, the sharding algorithm is designed to divide the blockchain into multiple shards, and the storage overhead and computing overhead are offloaded to each shard, which effectively balances the computing and storage overhead of edge nodes. Finally, comparative analysis was made with existing schemes, and effectiveness of proposed scheme was verified from the perspectives of storage overhead, computation overhead, access delay and throughput. Results show that proposed scheme can effectively reduce storage overhead and computing overhead of edge nodes in BIoT scenario.
Introduction
With the advance of urbanization, the contradiction between people’s demand for high-quality life and current lagging urbanization services is becoming increasingly prominent [1]. Therefore, the concept of “smart city” was introduced. Smart cities use information technology to integrate and manage physical, social and commercial infrastructure, which can achieve the best use of resources and provide better services for urban residents [2].
Building is the basic unit and organic composition of a city. The development of smart city also needs the digital construction of buildings. According to the data provided by Global Alliance for Buildings and Construction, in 2020, the energy consumption and energy related CO2 emissions in construction industry accounted for 36% and 37% of the global total [3]. It can be seen that the digital building project plays a very important role in promoting energy conservation and emission reduction for smart cities, providing residents with high-quality and comfortable environment. As a new paradigm, the Internet of Things (IoT) is accelerating its integration with digital buildings [4]. Building internet of things (BIoT) uses IoT technology to connect different objects, sensors and terminal devices inside the building with the Internet [5], so as to manage the data generated from the building. Assisted by building information modeling (BIM) technology, BIoT provides efficient building management services [6] and smart application services [7] for the residents inside the building. It promotes energy conservation and emission reduction, reduces management costs, and also plays a positive role in promoting intelligent development of cities.
Traditional BIoT system usually adopts the centralized management mode [8], which brings the following challenges:
1) Single-point failure: The failure of the central server will paralyze the whole system [9].
2) Crisis of trust: Data is usually stored in a third-party cloud, which raises the problem of whether the data can be trusted. In addition, the data is vulnerable to security attacks that cause loss or tampering, which leads to poor reliability.
Blockchain is considered as an emerging technology for constructing secure and trusted environment due to its characteristics of multi-party consensus, decentralized storage and trusted traceability [10]. Singh et al. [11] proposed a drug supply chain system based on blockchain and IoT, which realized the trusted tracking of drugs and guaranteed the security of data. Yazdinejad et al. [12] designed a secure and low proof of work protocol, which provided a blockchain-based green and energy efficient solution for IoT scenarios. Similarly, blockchain technology can effectively solve the existing problems in traditional BIoT system. Researchers [13, 14] built the blockchain-based distributed management framework, stored BIoT data in blockchain, realized the trusted storage of data and provided the system with partition tolerance. Moreover, some studies [15, 16] combine blockchain with edge computing to process and store data on the side close to building, so as to further improve data integrity and availability.
However, with the popularity of IoT applications, the number of IoT devices deployed in buildings continues to increase, resulting in an increasing amount of IoT data [17]. The architecture of blockchain and edge computing will inevitably bring computing and storage challenges to resource-constrained nodes deployed close to the building [18]. Take an office building in Qingdao, China as an example. The building generates 140 million IoT data records every year, which occupy about 110 GB of storage space. Considering the full-copy storage mechanism of blockchain, storing the data directly in blockchain will bring great challenges to the storage capacity of resource-constrained nodes, and will also affect the efficiency of non-blockchain tasks.
Sharding technology [19] is a mainstream method to solve the above problems. It divides all nodes of the network into several relatively independent subchains (i.e., shards). Nodes in each shard only handle transactions and store data related to themselves, thereby offloading the computing and storage pressure of the blockchain network to different shards. However, sharding technology has rarely been applied to the BIoT scenario.
Therefore, this paper proposes a data management scheme for BIoT based on blockchain sharding, which includes:
1) Providing a hybrid storage mechanism based on blockchain, which combines on-chain and off-chain storage modes according to the characteristics of the data, uses inter-planetary file system (IPFS) to ensure the integrity and availability of data stored off-chain, and reduces the storage overhead of resource-constrained edge nodes.
2) Designing a sharding algorithm on the basis of the hybrid storage mechanism. According to the distribution of devices and data quantities in each building intelligent subsystem, the blockchain network is divided into multiple shards, which can effectively balance the computing and storage overhead of edge nodes.
The rest of the paper is organized as follows. Section 2 reviews the related works. Section 3 analyzes the application requirements of blockchain and sharding technology in BIoT scenario. Section 4 describes the proposed scheme in detail. Section 5 verifies the effectiveness of proposed scheme through scheme comparisons and simulation experiments. Finally, conclusions and future works are drawn in Section 6.
Digital building is the foundation of smart city [20]. As a basic part of digital building, BIoT plays a very important role in promoting energy conservation for smart city and providing residents with comfortable environment.
In this section, this paper summarizes the related works of BIoT, and points out the performance challenges of storage and computing faced by existing blockchain-based schemes. Finally, this paper analyzes the optimization methods for storage and computing in blockchain.
Building internet of things
System architecture of traditional building internet of things.
As shown in Fig. 1, BIoT system can be regarded as three-layer architecture with end, edge and cloud. As the boundary, the edge layer divides the network into Local Area Network (end-edge) and Wide Area Network (edge-cloud) to improve the security of internal system. 1) The end layer contains various IoT devices in the building, which are usually deployed by device manufacturers in the phase of construction and constitute the intelligent subsystems (e.g., video monitor system, building automation subsystem). Each device integrates a variety of sensors to monitor the external environment or its operating status. 2) The edge layer is mainly composed of several edge gateways deployed inside the building. Each gateway connects with one or more subsystems based on the communication protocol provided by device manufacturers, and is responsible for collecting device data, forwarding data to the cloud and controlling IoT devices deployed in end layer. 3) The cloud layer is mainly responsible for storing data generated by IoT devices, providing visualization management, responding to user’s requests, and sending control instructions to the gateway.
BIoT system can realize the whole process of operation and management (O&M) from data perception, collection, integration, analysis to decision-making, so as to improve O&M efficiency, promote energy conservation and emission reduction, and serve the upper smart applications for users. Bottaccioli et al. [21] designed a building energy model to simulate, predict, and reduce unnecessary energy consumption. Based on IoT real-time data generated in the building, Wang et al. [7] designed a dynamic fire escape path planning method to provide evacuation path in fire scenarios and meet firefighting needs of complex buildings.
However, the deep integration of IoT and construction industry has gradually exposed the shortcomings of BIoT applications: 1) Single-point failure exists in the central storage and management mode. 2) The use of data requires the basis of trusting third-party cloud, which has the problems of data security and trust [9].
Blockchain is a data chain jointly maintained by multiple parties, which utilizes cryptography to ensure that the data cannot be tampered with. With its unique low-cost trust mechanism, blockchain is considered as one of the indispensable technologies for constructing a new trust system in the future [10].
The combination of blockchain and BIoT can realize trusted data storage and solve the problems existing in traditional management mode. Xu et al. [22] stored IoT data generated from the building in blockchain, thereby realizing the distributed and trusted storage for data. Van et al. [23] designed a blockchain-based decentralized architecture which used smart contracts to achieve building energy management and ensure data reliability. Rahman et al. [24] proposed a smart building architecture based on blockchain, which stores the data in blockchain to ensure the security, privacy and transparency of sensors. To further ensure the integrity and availability of data, Wang et al. [16] deployed blockchain in the edge gateway and cloud server at the same time, enabling data to be stored in the edge layer. Meanwhile, this study adopted directed acyclic graph (DAG) to improve the throughput of blockchain network and reduce the latency of consensus.
The above studies utilize blockchain technology to achieve trusted data storage in BIoT. However, due to the full-copy storage mechanism of blockchain and massive data management requirements of BIoT scenario, on-chain storage mode and single chain architecture will bring great challenges to the storage and computing performance of blockchain nodes, especially for resource-constrained nodes in the edge layer.
Researches for storage and computing optimization in blockchain
This subsection summarizes the existing research from two dimensions of storage optimization and computing optimization, and analyzes their applicability in BIoT scenario.
(1) Storage optimization
To solve the problem of high storage overhead, Lan et al. [25] proposed a storage framework combining blockchain and cloud to realize off-chain data storage. They used asymmetric encryption algorithm to ensure the security and reliability of data stored in cloud. Gochhayat et al. [26] proposed an IoT data storage scheme based on blockchain, which stores the original data in cloud and metadata in blockchain to relieve the storage pressure of blockchain. Cloud storage can effectively reduce the storage overhead of blockchain nodes, but cannot guarantee the availability of off-chain data [27]. Therefore, Lu et al. [28] utilized IPFS to realize distributed and reliable off-chain storage for IoT data, and only metadata is stored in blockchain for indexing and validation.
Considering the data characteristics of BIoT:
i) The IoT data generated by devices (deployed in the end layer) can be collected, processed and stored by the edge node periodically.
ii) Fault alarm and device management information (e.g., remote control instructions to devices, repair or maintenance records of devices) is generated by upper application services. It is small in scale, variable in generation frequency, and needs to be processed sequentially (one by one).
Storing type ii) data off-chain hardly reduces the storage overhead of blockchain nodes (the size of a single original data item is close to that of its metadata), but it increases the operation complexity and makes data management inconvenient. Therefore, different management modes need to be designed according to the characteristics of the data.
(2) Computing optimization
Sharding technology is a mainstream method to solve the problem of high computing overhead. The core principle of sharding technology is to divide blockchain network into multiple shards according to certain rules [19]. Each shard only handles transactions and stores data related to it. Since nodes in different shards can handle disjoint transactions in parallel, computing tasks of blockchain can be offloaded to different shards, thereby reducing the computing pressure of nodes [29]. Moreover, fragmented storage of global ledger can also effectively relieve the storage pressure of blockchain nodes [30].
Li et al. [31] proposed an expansion and sharing scheme based on sharding technology to solve the scalability limitation in blockchain. Wang et al. [32] constructed a distributed storage scheme based on blockchain sharding, which randomly divided blockchain nodes into different shards to effectively reduce the computing and storage overhead of nodes. However, existing sharding schemes mainly adopt node-oriented division method, which does not fully match the data-oriented BIoT scenario. The reasons are as follows:
1) Existing schemes focus on the performance challenges caused by blockchain nodes joining the network on a large scale, whereas the BIoT scenario focuses on managing the data and devices in a building. The number of edge nodes (i.e., edge gateways) and cloud service nodes (i.e., cloud servers) that provide smart services for a single building is relatively fixed, and large-scale joining or exiting does not occur.
2) Existing schemes focus on dividing nodes evenly, randomly and fairly into multiple shards. However, in BIoT, each edge node forms a “one to many” relationship with subsystems, and the number of devices and the amount of data in each subsystem vary greatly with the type of building. For example, office buildings have the largest number of air-conditioning and monitoring devices, while residential buildings have the largest number of water meters and electric meters. Existing sharding methods may place multiple edge nodes that manage complex subsystems (with a large number of devices and a large amount of data) in the same shard, resulting in high computing and storage overhead in that shard and bringing great challenges to the edge nodes within it.
To sum up, existing studies provide an important reference for this paper, but the characteristics of BIoT scenario should also be taken into account during the design of scheme.
In order to determine the goals of scheme and carry out reasonable design, this section analyzes the application requirements of blockchain and sharding technology in BIoT scenario.
BIoT involves the joint cooperation of multiple organizations (building operator, property company, residents, etc.) and has admission control for nodes, so it can be classified as an alliance chain (consortium blockchain) scenario. Taking the mainstream platform Hyperledger Fabric as an example of the performance requirements, a device needs at least 300 MB of additional memory to run a single peer node [33]. Therefore, in a BIoT system, the physical installations that can host the blockchain environment are mainly edge gateways and cloud servers.
Existing schemes usually deploy the above installations as blockchain nodes in the single chain to achieve trusted storage of BIoT data. However, due to the full-copy storage mechanism of blockchain, as well as the dynamic growth of transactions and data (caused by the expansion of devices scale), the existing scheme faces the performance challenges of computing and storage for edge nodes.
Sharding technology is an effective method to achieve on-chain scaling and balance the computing and storage load of edge nodes. However, when applying sharding technology to BIoT scenario, the following points should be considered:
1) As an independent and complete functional unit, the intelligent subsystem is indivisible during sharding.
2) For buildings (or parks) with different types and scales, there are great differences in the composition of subsystems and in the scale of devices and data. Consequently, shards need to be configured in combination with the building’s conditions.
Aiming at the storage and computing overhead of edge nodes, this paper provides a data management framework for BIoT based on blockchain and sharding technology. In this section, provided data management architecture is first introduced. After that, hybrid storage mechanism and sharding algorithm of blockchain are elaborated. Finally, the key process of data management is described by taking vehicle access records as an example.
Overview of framework
For BIoT, this paper provides a data management scheme based on blockchain sharding. The framework is shown in Fig. 2, and the specific functions of core roles are as follows:
Data management framework based on blockchain and sharding technology.
Edge gateway: 1) Edge gateways and the cloud servers (cloud service nodes) together form the blockchain network. 2) Each gateway corresponds to one or more subsystems. It is responsible for collecting IoT data (e.g., operating status of devices, external environment information) generated from the subsystem, and stores them in IPFS (off-chain storage) or in the corresponding shard (on-chain storage) according to the hybrid storage mechanism (see Section 4.2).
Cloud server: 1) Because the cloud server has sufficient resources, it is added to all shards as a blockchain node, and constructs an IPFS private network with the other cloud servers. 2) It is responsible for receiving and responding to the requests generated by users.
Blockchain network: 1) The blockchain network is deployed in the edge layer and the cloud layer at the same time. It is divided into multiple shards through the sharding algorithm shown in Section 4.3. 2) Each shard corresponds to one or more intelligent subsystems in the BIoT scenario, and is responsible for storing the original data or metadata generated from the subsystem.
IPFS network: It is deployed in the cloud layer. After the edge gateway collects IoT data periodically, the gateway encrypts the data and stores them in IPFS according to the hybrid storage mechanism. The returned storage address is stored in the corresponding shard as metadata.
Existing blockchain-based schemes for BIoT generally manage data through the on-chain storage mode, which will bring great storage pressure to blockchain nodes, especially to edge nodes. Scheme [34] stored data in central cloud to reduce the requirements of on-chain storage. However, cloud storage mode cannot ensure data integrity and availability.
Storage modes of data
Considering the mismatch between a single storage mode and the diverse needs of data management in the BIoT scenario, this paper designs a hybrid storage mechanism based on the characteristics of BIoT data. As shown in Table 1, for the massive IoT data that can be collected, stored, and accessed in batches, IPFS is used to store the original data, while only the metadata is stored on-chain. O&M data (e.g., device management information, fault alarm information) is small in scale, variable in generation frequency, and processed sequentially. Its original data and metadata are in “one-to-one correspondence”, so storing it off-chain would hardly reduce the storage occupancy of nodes and would increase the complexity of data storage and access. Therefore, O&M data is directly stored on-chain after encryption.
The step of storing data in BIoT scenario is shown in Fig. 3, which includes:
(1) The gateway collects IoT data from devices (or the cloud server generates O&M data by interacting with users), and then encrypts the data.
(2) The gateway (or cloud server) determines the storage mode and selects the shard according to Table 1 and the sharding algorithm (see Section 4.3.1).
(3) If the storage mode is “On-chain storage”, go to step (6); otherwise, perform step (4).
(4) “Off-chain storage” mode is adopted to store the encrypted data into IPFS and receive the returned storage address.
(5) The storage address, hash value of original data, and other metadata are stored in the destination shard. The process ends.
(6) “On-chain storage” mode is adopted to store the encrypted data and hash value of original data in the destination shard. The process ends.
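The storage steps above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the class names FakeIPFS and Shard, the function names store and toy_encrypt, and the type labels "iot" and "om" are hypothetical, the IPFS client is replaced by an in-memory content-addressed dictionary, consensus is reduced to appending a record, and the XOR cipher is only a placeholder for the symmetric encryption.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical type labels; Table 1 of the paper defines the actual mapping.
OFF_CHAIN_TYPES = {"iot"}  # batch-collected IoT data
ON_CHAIN_TYPES = {"om"}    # O&M data (fault alarms, device management)


class FakeIPFS:
    """In-memory stand-in for an IPFS client (assumption, not a real API)."""

    def __init__(self):
        self._blobs = {}

    def add(self, blob: bytes) -> str:
        # A SHA-256 hex digest plays the role of the returned storage address.
        address = hashlib.sha256(blob).hexdigest()
        self._blobs[address] = blob
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


@dataclass
class Shard:
    """Stand-in for a shard ledger; consensus is reduced to an append."""
    records: list = field(default_factory=list)

    def commit(self, record: dict) -> None:
        self.records.append(record)


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR placeholder for symmetric encryption. NOT secure; it is its own inverse."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def store(data: bytes, data_type: str, key: bytes, shard: Shard, ipfs: FakeIPFS) -> dict:
    cipher = toy_encrypt(data, key)              # step (1): encrypt
    digest = hashlib.sha256(data).hexdigest()    # hash of the original data
    if data_type in ON_CHAIN_TYPES:              # steps (2)-(3): choose the mode
        # step (6): cipher text and hash go directly into the shard
        record = {"mode": "on-chain", "cipher": cipher,
                  "hash": digest, "type": data_type}
    elif data_type in OFF_CHAIN_TYPES:
        address = ipfs.add(cipher)               # step (4): store in IPFS
        # step (5): only the address, hash and other metadata go on-chain
        record = {"mode": "off-chain", "address": address,
                  "hash": digest, "type": data_type}
    else:
        raise ValueError(f"unknown data type: {data_type}")
    shard.commit(record)
    return record
```

In this sketch the shard holds the full cipher text only for O&M data, so its size grows with the metadata count rather than with the volume of raw IoT data.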
Process of hybrid storage.
Although storing IoT data in IPFS can alleviate the storage pressure to a certain extent, with the exponential growth of IoT devices and the full-copy storage mechanism of blockchain, storage mode of single-chain will still face the storage challenges caused by resource-constrained edge nodes. Therefore, this paper designs a sharding algorithm of blockchain based on the characteristics of BIoT scenario.
Division of sharding
Existing schemes generally focus on dividing nodes evenly, randomly and fairly into multiple shards [31, 32]. The random sharding method evenly divides the blockchain network into several shards to improve the scalability of blockchain. However, this node-oriented sharding method does not match the data-oriented management requirements of a BIoT system, and it can hardly balance the computing and storage overhead of each shard. Therefore, this paper designs a data-oriented sharding algorithm, which takes the distribution of devices and data quantities generated from each subsystem as the criteria for sharding. The specific sharding steps are as follows:
Initialize parameters: the number of intelligent subsystems (marked as Normalize Calculate the weighted sum of each subsystem: Calculate the theoretical optimal value of shard: Continue to partition other subsystems to minimize the sum of variance: Output the divided set that records the relationship between each shard and subsystem.
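The idea of data-oriented sharding can be sketched in Python as follows. This is an illustrative approximation rather than Algorithm 1 itself: the function name shard_subsystems, the weights alpha and beta, and the normalization by totals are assumptions, and the variance-minimizing partition is replaced by a simple greedy heuristic (assign the heaviest remaining subsystem to the currently lightest shard), which does not guarantee an optimal split.

```python
def shard_subsystems(devices, data, k, alpha=0.5, beta=0.5):
    """Assign whole subsystems to k shards so that shard loads are balanced.

    devices, data: dicts mapping subsystem name -> device count / daily data count.
    alpha, beta:   hypothetical weights of the two criteria.
    Returns (shards, loads, variance).
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Normalize each criterion by its total so both lie in [0, 1].
    total_devices = sum(devices.values())
    total_data = sum(data.values())

    # Weighted sum of the two normalized criteria for each subsystem.
    weight = {
        s: alpha * devices[s] / total_devices + beta * data[s] / total_data
        for s in devices
    }

    # Theoretical optimum: every shard carries the same share of the load.
    target = sum(weight.values()) / k

    shards = [[] for _ in range(k)]
    loads = [0.0] * k
    # Greedy heuristic: heaviest subsystem first, always into the lightest shard.
    # Subsystems are never split, so they stay indivisible units.
    for s in sorted(weight, key=weight.get, reverse=True):
        i = loads.index(min(loads))
        shards[i].append(s)
        loads[i] += weight[s]

    # Variance of the shard loads around the theoretical optimum.
    variance = sum((load - target) ** 2 for load in loads) / k
    return shards, loads, variance
```

With made-up inputs such as devices = {"a": 400, "b": 300, "c": 200, "d": 100} and proportional data counts, two shards receive ["a", "d"] and ["b", "c"], each carrying half of the total weight.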
From Algorithm 1, it can be seen that the parameters affecting the performance of designed algorithm mainly include
After the subsystem is divided (i.e., data division), blockchain nodes need to be divided. The divided blockchain network is shown in Fig. 4. Considering that cloud service nodes have sufficient resources, and need to obtain various types of data when interacting with users and gateways, each cloud server is added to all shards as the blockchain node (Assuming there are
Sharding architecture of blockchain network.
Since the subsystem is relatively independent and can be operated independently, there is no direct interaction between the shards. The operation of multiple shards caused by subsystem linkage can be realized through cloud server.
Since IoT devices in end layer do not have the ability to run blockchain node, blockchain services are typically built in edge layer and cloud layer. The installations in edge layer and cloud layer are relatively fixed after deployment. There is no phenomenon of large-scale joining or exiting after the operation of shards. The phenomenon only exists in: 1) the replacement of edge gateway caused by a failure; 2) new edge gateway is introduced to alleviate computing pressure as the number of devices increases; 3) cloud node joins or exits for backup, data transfer, or other purposes.
The process of node joining and exiting is shown in Fig. 5. Since BIoT belongs to the scenario of alliance chain, the node (marked as Node
The process of node exiting is similar, but the exit request
Node joining and exiting.
The process of data management includes two phases: data storage and data access. This section takes vehicle access records as an example to introduce the process in proposed scheme.
Assuming that: 1) sharding algorithm has been executed, and parking subsystem is classified into shard 2 (i.e., vehicle access records can be stored and accessed from shard 2); 2) configuration information of shard and storage mode of data are recorded in gateway and cloud server; 3) symmetric key is generated by gateway and passed to cloud server through a secure channel; 4) user’s key pair is generated and the public key is exposed to public.
As shown in Fig. 6, the steps of data management are as follows:
Phase 1: Data storage
1) The edge gateway connects with the parking management system based on TCP/IP or other communication protocols, collects vehicle access record (marked as data) generated by one vehicle identification device.
2) The gateway uses symmetric key to encrypt data and get the cipher text (marked as
3) According to the type of data, “Off-chain storage” mode is adopted. The gateway stores
4) Metadata (marked as meta) is constructed and sent to shard 2 on the basis of shard configuration. After consensus, meta is stored in shard 2.
The definition of meta is as follows:
The contents represent storage mode, cipher text, storage address, data type, index, hash value of original data, and storage time respectively. index is json-formatted data with dictionary type, it provides index for data access. For example, the value of index can be a combination of year, month and day, which can be used to obtain vehicle access records for a specified time range.
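As an illustration of the seven fields listed above, the following Python sketch models meta as a record and shows how a date-based index could serve a time-range query. The field names, the Meta class and the helper select_by_date_range are hypothetical; only the meaning of each field follows the description above, and leaving the cipher text empty for off-chain records is an assumption.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Meta:
    mode: str        # storage mode: "on-chain" or "off-chain"
    cipher: str      # cipher text (assumed empty when stored off-chain)
    address: str     # storage address returned by IPFS (empty when on-chain)
    data_type: str   # data type, e.g. vehicle access record
    index: dict      # JSON-style dictionary used as the access index
    data_hash: str   # hash value of the original data
    timestamp: int   # storage time, here as a Unix timestamp

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


def index_key(meta: Meta) -> tuple:
    """Turn a year/month/day index into a comparable tuple."""
    return (meta.index["year"], meta.index["month"], meta.index["day"])


def select_by_date_range(metas, start, end):
    """Return the metadata whose index date lies in [start, end] (inclusive)."""
    return [m for m in metas if start <= index_key(m) <= end]
```

A request for the records between two dates then reduces to filtering the metadata of the shard by index before any original data is fetched.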
Phase 2: Data access
5) Property manager U1 intends to obtain vehicle access records for analyzing the trend of vehicle flow and total charges in order to assist in the formulation of next parking charges. U1 can initiate a data request
The definition of
Where user is the identity of U1; sign is the signature of the first 5 fields of
6) The server receives
7) meta is parsed, and
8) The server decrypts
9) U1 receives
The process of on-chain storage is similar, while the difference is only in the location of storing and accessing data. Therefore, this paper will not repeat the steps of on-chain storage.
Key processes of data management.
This section verifies the validity of proposed scheme in this paper. Section 5.1 compares and analyzes proposed scheme and existing schemes. Section 5.2 describes the simulation environment. Section 5.3 tests the performance of proposed scheme from the dimensions of storage overhead, computing overhead, access latency, and throughput. Finally, Section 5.4 analyzes the security.
Scheme comparison
As shown in Table 2, scheme [16] focuses on improving consensus performance (i.e., throughput, consensus latency), storing data directly on a single blockchain, resulting in higher storage and computing overhead at edge nodes. Scheme [34] stores the original data in the off-chain central database while the metadata is stored in blockchain, thereby reducing the storage pressure of blockchain nodes. However, centralized off-chain storage will reduce the availability of data.
In view of the above problems, this paper and scheme [28] use IPFS to realize the trusted off-chain storage and ensure the integrity and availability of data. Different from the scheme [28], this paper provides a hybrid storage mechanism based on the characteristics of BIoT data, and also designs a sharding algorithm to divide the blockchain network, thereby effectively reducing the computing and storage overhead of edge nodes.
Although scheme [32] uses sharding technology, its random sharding may cause high computing and storage pressure on the certain shard, which is difficult to effectively reduce the computing overhead of edge nodes. In contrast, this paper takes the scale of devices and data in each intelligent subsystem as the standard for sharding. The algorithm is more in line with the characteristics of BIoT, which can effectively reduce the computing overhead of edge nodes. In addition, scheme [32] also adopts the on-chain storage mode for data management, which requires high storage performance of blockchain nodes.
Simulation environment
Take an office building in Qingdao, China as an example. There are 1030 IoT devices, 3 edge gateways and 4 cloud servers deployed in its physical environment. Table 3 shows the frequency of sampling and the scale of devices and daily data in each intelligent subsystem. The building generates an average of 391314 data every day, and the O&M project of this building has been running for 2 years. It is estimated to generate approximately 12000 fault alarm data and device O&M data every day. The above information will be used as a reference for simulation evaluation in Section 5.3.
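As a quick consistency check of these figures, the short Python calculation below extrapolates the daily count to one year and derives an average record size from the roughly 110 GB per year quoted in the introduction. The 365-day year and the binary interpretation of GB (2^30 bytes) are assumptions.

```python
DAILY_RECORDS = 391_314   # average IoT data records per day (from the text)
ANNUAL_STORAGE_GB = 110   # approximate yearly storage quoted in the introduction

# Yearly record count, assuming 365 days of operation.
annual_records = DAILY_RECORDS * 365

# Average size of one record, assuming 1 GB = 2**30 bytes.
bytes_per_record = ANNUAL_STORAGE_GB * 2**30 / annual_records

print(f"records per year: {annual_records:,}")
print(f"average record size: {bytes_per_record / 1024:.2f} KB")
```

The yearly total is about 143 million records, in line with the 140 million stated earlier, and the average record size comes out at roughly 0.8 KB.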
Scheme comparison and analysis
The scale of devices and daily storage in a certain office building
Environment configuration of software
Considering the high economic cost of actual deployment, this paper uses 3 Raspberry Pi 4B (4-core CPU, 2.0GHz, 2GB memory) and 4 personal computers (4-core CPU, 2.8GHz, 32GB memory) in the laboratory to simulate edge gateways and cloud servers in the above building. The Raspberry Pi is used to generate the same amount of data as shown in Table 3 to simulate the physical environment.
Raspberry Pi 4B has the following characteristics:
1) Powerful CPU and processing capability, which can be used as a central processor.
2) Multiple complex software fragments can be run in parallel, and the communication function is complete, which can be used to connect external sensors and devices.
The performance and functions of the Raspberry Pi 4B are similar to those of the above-mentioned edge gateway [35] in the office building. Therefore, the behavior of the office building deployment can be reasonably simulated in the above environment. The software configuration of the Raspberry Pi and the personal computer (PC) is shown in Table 4.
In the simulation experiment, the number of shards is set to 3, and parameters
Sharding result of blockchain network
This subsection simulates and analyzes proposed scheme in this paper from four dimensions: storage overhead, computational overhead, access latency and throughput. Storage and computing overhead are used to verify the effectiveness of designed sharding algorithm. Access latency and throughput are used to evaluate the availability of proposed scheme in the actual BIoT scenario.
Analysis of storage overhead
Assuming that: 1) the number of shards is
Therefore, the size of IoT data generated by the subsystem every day is
The daily storage overhead of each scheme is shown in Table 6. The scheme [16] adopts the single-chain architecture that stores all data in blockchain, so the storage overhead of edge nodes and cloud servers is
Comparatively, this paper adopts a new sharding algorithm and hybrid storage mode to reduce the storage overhead of edge nodes to
Subsequently, five days of data from the office building are generated and stored in order to test the storage overhead of the proposed scheme and compare it with existing schemes.
Comparison of daily storage overhead (theoretical analysis)
Comparison of daily storage overhead (ignore the redundancy).
Storage overhead.
As shown in Fig. 7, the storage overhead of on-chain data in this paper is 170.7 MB, which is significantly lower than that of scheme [16] and scheme [32], but slightly higher than that of scheme [28]. This is because, for the same amount of stored data, the sharding method adopted in this paper generates more signatures, making the storage overhead of on-chain data slightly higher than that of scheme [28]. Moreover, the storage of metadata also makes the total occupancy of this paper and scheme [28] higher than that of scheme [16] and scheme [32]. However, focusing on the edge node, it can be seen from Fig. 8 that the storage overhead in this paper is significantly lower than that in the other schemes, which shows that the proposed scheme can effectively reduce the storage overhead of edge nodes.
The theoretical analysis of computing overhead is similar to storage overhead. Assuming that: 1)
The daily computing overhead of each scheme is shown in Table 7. In scheme [28], the overhead of edge node is
As shown in Table 7, the computing overhead of the cloud service node in this paper is significantly lower than that in scheme [16] and scheme [28], but slightly higher than that in scheme [32]. This is because the cloud service node in this paper needs to process both metadata and original data, whereas the cloud service node in scheme [32] only needs to process the original data stored in shards.
As shown in Fig. 9, the computing overhead of a node includes processing IoT data and the corresponding metadata. The amount of metadata can be extrapolated from the sampling frequency in Table 3. It can be seen that the computing overhead of each edge node decreases significantly after sharding. Since the building automation subsystem generates more data in the office building, the overhead of Edge_Node_3 is higher than that of the other edge nodes. In addition, for convenience of data management, this paper does not further divide devices or data within a single subsystem.
Comparison of daily computing overhead (theoretical analysis)
Analysis of computing overhead in proposed scheme.
Combining blockchain with IPFS can effectively reduce the storage overhead of the blockchain network. However, off-chain storage may cause excessive latency in data access [27]. Therefore, this paper tests and compares the access latency of off-chain storage and on-chain storage in the proposed scheme.
Firstly, on the basis of Fig. 6, compositions of access latency with on-chain and off-chain storage are analyzed respectively. As shown in Eqs (1) and (2),
For data of the same scale, off-chain storage incurs additional time in the data access phase compared with on-chain storage. This additional time is shown in Eq. (3),
Subsequently, 1000, 2000, 3000, 4000, 5000 and 6000 data items are stored in on-chain and off-chain storage modes respectively (each encrypted data item occupies about 0.8 KB). Autocannon [36] is used to simulate a user initiating 1000 data access requests, and the average access latency of the two modes is measured. Test results are shown in Fig. 10. It should be added that in order to exclude the influence of network fluctuations (or other factors), the test of average latency is not included
As shown in Fig. 10, with the increase of the amount of accessed data, the growth trend of access latency in off-chain storage is significantly faster than that in on-chain storage. In addition, according to Eq. (3), since the size of metadata is irrelevant to original data (i.e.,
Average latency of accessing data.
Throughput of proposed scheme.
To store the time series data (i.e., IoT data) and non-time series data (i.e., device O&M data) generated by the BIoT system, the throughput of the blockchain network must exceed the data generation rate. Therefore, the transaction throughput of the proposed scheme is tested here. The test environment is set as follows: data storage transactions are sent to each shard at rates of 60, 120, 180, 240, 300 and 360 transactions per second (TPS).
As shown in Fig. 11, the transaction throughput of the proposed scheme peaks at 214.6 TPS and then decreases due to the excessive data volume and high memory occupancy.
Combining the data in Table 3 with Fig. 11, it can be seen that the generation rate of BIoT data is much lower than the maximum throughput of the proposed scheme, indicating that the scheme can meet the data processing requirements.
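The comparison between data generation rate and peak throughput can be illustrated with a minimal sketch. Only the 214.6 TPS peak comes from the text above. The subsystem names, device counts and sampling periods below are hypothetical placeholders and are not the values of Table 3. The sketch also assumes that each sampled record becomes exactly one storage transaction, which may not match the actual batching used in the scheme.

```python
from dataclasses import dataclass

# Peak throughput reported in the text (Fig. 11).
MAX_THROUGHPUT_TPS = 214.6


@dataclass
class DeviceGroup:
    """Hypothetical group of devices that share one sampling period."""
    name: str
    device_count: int
    sampling_period_s: float  # one record per device every this many seconds

    def records_per_second(self) -> float:
        return self.device_count / self.sampling_period_s


def generation_rate(groups: list) -> float:
    """Aggregate record generation rate, assuming one transaction per record."""
    return sum(group.records_per_second() for group in groups)


# Placeholder values for illustration only (NOT taken from Table 3).
groups = [
    DeviceGroup("lighting", device_count=120, sampling_period_s=60.0),
    DeviceGroup("building_automation", device_count=500, sampling_period_s=10.0),
    DeviceGroup("security", device_count=90, sampling_period_s=30.0),
]

rate = generation_rate(groups)
meets_requirement = rate < MAX_THROUGHPUT_TPS
print(f"generation rate: {rate:.1f} records/s, sufficient: {meets_requirement}")
```

With the real device counts and sampling frequencies substituted, the same check indicates how much headroom remains before the saturation point observed in the throughput test.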
Analysis of security
In this section, the triad of information security [37] is used to discuss the security of the proposed scheme from the perspectives of confidentiality, integrity and availability.
(1) Confidentiality
In the data management framework proposed in this paper, users do not directly join the blockchain network, and data is encrypted before being stored in the blockchain or IPFS, which avoids direct exposure of data to users. In addition, every request initiated by a user is processed uniformly by the cloud server. The required data is returned only after the identity of the user is judged to be legitimate, which prevents illegal users from obtaining data.
(2) Integrity
In the proposed scheme, blockchain and IPFS are integrated into the management of the BIoT system. The characteristics of blockchain (e.g., distributed storage, multi-consensus) are used to ensure the integrity of on-chain data. Because the IPFS network addresses data by the hash value of its content, tampering with data on one node changes the data and hash value on the attacked node only, and does not affect the data stored on other IPFS nodes. Using the original hash value (i.e., the storage address of the original data), the required data can still be retrieved from other IPFS nodes.
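The integrity argument above can be sketched in a few lines. This is a toy model and not the IPFS API: `ToyNode`, `content_address` and `fetch_verified` are hypothetical names introduced for illustration, and a plain SHA-256 hex digest stands in for a real IPFS content identifier (actual CIDs use a multihash encoding).

```python
import hashlib
from typing import Dict, List, Optional


def content_address(data: bytes) -> str:
    """SHA-256 hex digest, used here as a stand-in for a content address."""
    return hashlib.sha256(data).hexdigest()


class ToyNode:
    """Hypothetical in-memory node mapping content address -> stored bytes."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self.blocks[address] = data
        return address

    def get(self, address: str) -> Optional[bytes]:
        return self.blocks.get(address)


def fetch_verified(address: str, nodes: List[ToyNode]) -> Optional[bytes]:
    """Return the first copy whose recomputed hash matches the address."""
    for node in nodes:
        data = node.get(address)
        if data is not None and content_address(data) == address:
            return data
    return None


record = b"encrypted BIoT reading"
nodes = [ToyNode(), ToyNode(), ToyNode()]
address = ""
for node in nodes:
    address = node.put(record)  # identical content yields the same address

# Simulate tampering on one node: its bytes no longer hash to the address.
nodes[0].blocks[address] = b"tampered reading"

recovered = fetch_verified(address, nodes)
print(recovered == record)
```

Because the address is derived from the content, the tampered copy fails the hash check and the request falls through to an intact replica on another node.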
(3) Availability
Traditional centralized data management suffers from single points of failure, which leads to poor availability of the BIoT system. This paper combines blockchain technology and IPFS and stores the data generated by the BIoT system in a distributed on-chain or off-chain environment, which provides partition tolerance for the system. If one server fails, users can still request services from other nodes, which improves the robustness of the system.
Conclusion
The digital building is the basic unit of the smart city, and BIoT is an important part of the digital building, playing a positive role in promoting energy conservation for cities and providing residents with a comfortable environment. Existing studies apply blockchain to the BIoT scenario to solve the trust and security problems of the traditional centralized management model. However, the full-copy storage mechanism of blockchain and the management requirements of massive data have brought computing and storage challenges to edge nodes with limited resources. Meanwhile, existing optimization solutions for blockchain and IoT are difficult to fit to the requirements of the BIoT scenario.
Therefore, this paper proposes a data management scheme for BIoT based on blockchain sharding. Firstly, a hybrid storage mechanism based on blockchain and IPFS is provided, which adopts on-chain or off-chain storage according to the characteristics of the data itself, thereby reducing the storage overhead of resource-constrained nodes. Secondly, a data-oriented sharding algorithm is designed to offload the computing workload to multiple shards. Finally, simulation tests are conducted based on Hyperledger Fabric. Results show that the proposed scheme can effectively balance the computing and storage overhead of edge nodes.
This study is not free from limitations:
(1) The high latency problem in data access caused by the off-chain storage mode [27] has not been studied in depth. An efficient query model under off-chain storage will be explored in the future.
(2) The issue of security degradation: after sharding, the number of nodes in a single shard decreases, resulting in a degradation of the security level. How to improve intra-shard security will be one of the future studies.
Acknowledgments
This research was financially supported by the National Natural Science Foundation of China (No. 62001262), the National Key R&D Program of China (No. 2020YFB1711903), and the Key Research and Development Program of Shandong Province (No. 2019GGX101017).
