Abstract
Healthcare data is growing rapidly, and sharing it among hospitals is becoming increasingly important. However, traditional sharing methods are prone to data leakage and threaten patient privacy. To secure the data while enabling sharing and access control, this paper combines blockchain technology, ciphertext-policy attribute-based encryption (CP-ABE), and the AES symmetric encryption algorithm to propose a hierarchical attribute-based encryption algorithm based on a master-sidechain (MSC-CP-ABE) and a collaborative on-chain and off-chain access control and sharing model (AC-CSS). The algorithm encrypts data hierarchically according to sensitivity; the ciphertext is stored in the InterPlanetary File System (IPFS), and the ciphertext address and key are stored on the sidechains, which the main chain indexes. A vectorized access policy provides policy hiding, and combining on-chain and off-chain storage saves blockchain space, achieves hierarchical access control, and improves security and efficiency. Simulation experiments show that the master-sidechain model proposed in this paper has better performance in terms of transaction delay and throughput. Compared with existing models, the proposed algorithm has certain advantages in encryption and decryption time and in key generation time.
Introduction
As electronic healthcare data [1, 2] becomes more commonplace, hospitals tend to centralize the management of this data in private databases in order to protect the security of patient information and business interests. However, this centralized approach to data management limits the ability to analyze complex diseases and increases the risk of a single point of attack on the system. In addition, this approach restricts patients’ access to their own medical data, making it difficult to provide a complete medical history at the time of referral and potentially compromising treatment outcomes. Insurance companies and research organizations also face difficulties in accessing and using medical data without violating privacy.
Access control [3, 4] is key to protecting medical privacy, but centralized management is prone to systematic failures. Traditional access control that relies on trusted third parties is both expensive and potentially unfair. Designing secure and fair systems is an urgent challenge.
Satoshi Nakamoto’s 2008 white paper on Bitcoin [5] ignited interest in blockchain, a decentralized and secure distributed ledger technology [6]. Blockchain has since expanded into several domains, including the Internet of Things [7], healthcare [8], smart cities [9], and the metaverse [10]. Ripple Labs [11] introduced a cross-ledger protocol in 2012, facilitating interconnections between payment systems. In 2014, Blockstream introduced pegged sidechain technology [12], enhancing the efficiency and security of asset transfers. In conjunction with access control and cross-chain technology, blockchain enables a secure distributed access control system that ensures the safe exchange of data and prevents leakage.
CP-ABE (Ciphertext-Policy Attribute-Based Encryption) [13] is an attribute-based encryption scheme that allows fine-grained access control: the data owner sets policies that determine who can decrypt the data. IPFS (InterPlanetary File System) [14] is a distributed file system that retrieves files by content ID rather than by location, ensuring high availability and content immutability. To address the problems in healthcare data sharing, such as dependence on third parties, centralized storage, inefficiency, and privacy issues, this paper proposes a model combining master-sidechain technology and hierarchical attribute-based encryption (MSC-CP-ABE), aiming to improve the access efficiency and security of healthcare data sharing. The main research work of this paper is as follows:
1. An on-chain and off-chain collaborative hierarchical access control and sharing model (AC-CSS) for medical data is proposed to address fine-grained access and attribute encryption in medical data sharing.
2. In the proposed MSC-CP-ABE algorithm, medical data are graded according to sensitivity: high- and medium-sensitivity data are encrypted with AES combined with attribute-based encryption, and low-sensitivity data are encrypted directly with attribute-based encryption, to achieve graded access control.
3. On-chain and off-chain storage are combined to solve blockchain storage problems. The master-sidechain includes one main chain and three sidechains, which store ciphertext addresses and keys according to data sensitivity. Homomorphic RSA protects attribute privacy, and the P-TL smart contract verifies the timeliness and correctness of transactions on the sidechains.
Related work
Blockchain related applications
In recent years, blockchain’s tamper-proof nature has led to its widespread use in healthcare data sharing. Combined with artificial intelligence [15], cloud computing [16], and big data [17], blockchain has revolutionized the construction of scalable IT systems and facilitated healthcare data sharing.
Liao et al. [18] developed a blockchain-based medical image segmentation framework that enhances model generalization. Shen [19] proposed a decentralized blockchain federated learning approach to accurately predict traffic flow. Jiang [20] used blockchain to issue e-vouchers that help small and medium-sized enterprises (SMEs) raise funds, but the scheme lacked access control. Bhaskar [26] used blockchain to connect electric vehicles with sensors to ensure secure energy transactions. Literature [21] combines cloud mining and blockchain technology to optimize computing resources. Literature [22] constructed a blockchain-based cloud payment framework. Xiong et al. [23] shifted intensive computation to cloud storage for efficiency, but at the expense of decentralization. Literature [24] designed a privacy protection mechanism that improves efficiency but carries a risk of leakage. Cheng et al. [25] proposed a cross-chain consensus mechanism that is low-cost and scalable but makes data security difficult to verify.
The above mentioned techniques in the literature such as federated learning, cloud computing and cross-chain modeling improve the efficiency of data sharing, but access control and encryption measures have not yet been fully implemented, posing the risk of poisoning attacks, single point of failure and privacy breaches that threaten the security of healthcare data.
Symbols in this paper
Access control is one of the essential methods of privacy protection, which can avoid the leakage of medical information; in this section, some related work in the area of access control is presented.
Access control is crucial in the field of healthcare big data. Jiang et al. [27] proposed a spectral clustering and risk-based access control model (SC-RBAC) for healthcare big data scenarios, as well as a risk quantification and usage control-based access control model (RQ-UCON) [28], to enhance the privacy protection of healthcare data, but a risk of information leakage remains. A smart contract is a self-verifying and self-executing transaction protocol [29]. Xie et al. [30] proposed a traceable access control mechanism based on blockchain. Lin et al. [31] proposed a secure mutual recognition authorization system based on blockchain, and Wang et al. [32] proposed a new pairing-free and certificateless scheme. In recent years, researchers have used blockchain to establish attribute-based access control systems: Li et al. [33] proposed an access control model for IoT, and Gao et al. [34] implemented secure access control through blockchain and proposed blockchain-based access control schemes with secure cryptographic policies and attribute hiding. Li et al. [35] proposed a scheme based on smart contracts and a homomorphic cryptosystem that realizes dynamic permission changes. He et al. [36] designed a cross-chain electronic medical record system, and an intuitionistic fuzzy trust access control model was proposed in [37] to realize adaptive dynamic access control.
Although the adoption of blockchain, attribute encryption, and smart contracts can enhance the security of data sharing, these solutions still face privacy breach risks, low intelligence, and scalability issues. In addition, healthcare data on-chain is subject to storage constraints, which affects storage and sharing efficiency and limits scalability.
Prerequisite knowledge
Bilinear mapping
1. Bilinear:
2. Non-degeneracy:
3. Computability:
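For reference, the three properties are conventionally stated as below. This is the standard textbook form rather than a recovery of the paper's own formulas; the symbols $G$, $G_T$, $p$, $g$, $u$, $v$, $a$, $b$ are assumed notation.

```latex
Let $G$ and $G_T$ be multiplicative cyclic groups of prime order $p$,
let $g$ be a generator of $G$, and let $e : G \times G \to G_T$ be a map.
\begin{align*}
\text{Bilinearity:}     \quad & e(u^{a}, v^{b}) = e(u, v)^{ab}
                                \quad \forall\, u, v \in G,\ a, b \in Z_p, \\
\text{Non-degeneracy:}  \quad & e(g, g) \neq 1, \\
\text{Computability:}   \quad & e(u, v) \text{ can be computed efficiently for all } u, v \in G.
\end{align*}
```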
RSA cryptosystems with homomorphic encryption properties
The RSA [39] cryptosystem is a public-key cryptosystem with multiplicative homomorphic properties, described as follows.
1. Key generation phase: Choose two distinct prime numbers $p$ and $q$ at random and compute $n = p \times q$ and $\varphi(n) = (p - 1) \times (q - 1)$. Choose an integer $c$ such that $\gcd(\varphi(n), c) = 1$ and $1 < c < \varphi(n)$. Compute $d$ such that $d \cdot c \equiv 1 \pmod{\varphi(n)}$. The public key is $k_{pub} = \{n, c\}$ and the private key is $k_{msk} = \{n, d\}$.
2. Encryption phase: The plaintext is first split into groups of bits such that the decimal value of each group is less than $n$, and each group $m$ is then encrypted in turn. The sequence formed by the ciphertexts of all groups is the encryption of the original message. For each $m$ with $0 \le m < n$, the ciphertext $CT_R$ is $CT_R = R(m) \equiv m^{c} \pmod{n}$, where $0 \le CT_R < n$.
3. Decryption phase: for $CT_R$, the decryption algorithm is $m = D(CT_R) \equiv CT_R^{\,d} \pmod{n}$. In fact, $R(m)$ is homomorphic with respect to multiplication, i.e., $R(m_1) \cdot R(m_2) \equiv (m_1 m_2)^{c} \equiv R(m_1 m_2) \pmod{n}$.
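The multiplicative property can be checked numerically. The following is a minimal sketch of textbook RSA in the notation above (public exponent $c$, private exponent $d$); the function names and the toy primes 61 and 53 are illustrative assumptions, and unpadded RSA with such small parameters is not secure in practice.

```python
from math import gcd


def rsa_keygen(p: int, q: int, c: int):
    """Textbook RSA key generation: public key (n, c), private key (n, d)
    with d * c ≡ 1 (mod phi(n))."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < c < phi and gcd(c, phi) == 1):
        raise ValueError("c must satisfy 1 < c < phi(n) and gcd(phi(n), c) = 1")
    d = pow(c, -1, phi)  # modular inverse, available since Python 3.8
    return (n, c), (n, d)


def rsa_encrypt(m: int, pub) -> int:
    """R(m) = m^c mod n for a block 0 <= m < n."""
    n, c = pub
    if not 0 <= m < n:
        raise ValueError("plaintext block must satisfy 0 <= m < n")
    return pow(m, c, n)


def rsa_decrypt(ct: int, priv) -> int:
    """D(CT_R) = CT_R^d mod n."""
    n, d = priv
    return pow(ct, d, n)


# Toy parameters, chosen only so the arithmetic is easy to follow.
pub, priv = rsa_keygen(p=61, q=53, c=17)
m1, m2 = 42, 55
product_ct = (rsa_encrypt(m1, pub) * rsa_encrypt(m2, pub)) % pub[0]
# Decrypting the product of two ciphertexts gives the product of the plaintexts mod n.
assert rsa_decrypt(product_ct, priv) == (m1 * m2) % pub[0]
```

Note that the property covers products only; sums of plaintexts cannot be obtained from RSA ciphertexts in this way.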
This section describes the AC-CSS architecture, interaction flow, healthcare data sharing construction, master-side blockchain interaction and P-TL smart contract.
AC-CSS system model
As shown in Fig. 1, the AC-CSS model proposed in this paper consists of five entities: the attribute authorization center (AA), the data owner (DO), the data user (DU), the InterPlanetary File System (IPFS), and the master-sidechain (MSC).

CP-ABE Access Control Tree Structure.
1.
2.
3.
4.
5.
In Fig. 2, the AC-CSS model transaction flow is divided into four phases:

AC-CSS transaction flow.
The Proof is an identifier that ensures that the hidden access policy matches the attribute and is used to validate the access rights. The DU locally matches the policy by checking for
Medical data are categorized into high, medium and low levels according to the Medical Data Security Guidelines. High sensitivity data such as name and phone number have a high impact after leakage; medium sensitivity such as age and region still have medical value after blurring; and low sensitivity is other medical data. In this paper, access control classifies data based on user identity, specialty and trust level.
This paper defines three levels each for identity, specialty, and trust, i.e., n = 3, with a higher level encompassing the access rights of the lower levels. As an example, the high-, medium-, and low-sensitivity data $M_{High}$, $M_{Medium}$, $M_{Low}$ are associated with the corresponding encryption policy attributes:
The low-sensitive access policy for:
All the identity levels of

Schematic diagram of access control tree transformation.
The medium-sensitive access policy for:
That is to say, all the people with identity levels
The highly-sensitive access policy for:
That is to say, only the personnel whose identity level is
The MSC-CP-ABE access control algorithm consists of four phases: initialization phase, encrypted storage phase, on-chain phase, and decryption phase, which includes four algorithms:
1.
Generate the global public key: $PK_{global} = (G, g)$.
The attribute authorization center randomly selects $\beta, \gamma \in Z_N$ to generate the master key and public key: $MSK = (g^{\beta}, \gamma)$,
Finally, the public key of the system is obtained as:
2.
Generate access control tree: same as
Recursively compute the access control tree: for each node $x$, choose a polynomial $q_x$ whose degree $d_x$ is one less than the threshold $k_x$ of that node, i.e., $d_x = k_x - 1$. Starting from the root node $x_{High}$, select a random number $s_{High} \in Z_N$ and generate the root polynomial $q_{High}$ with $q_{High}(0) = s_{High}$. For the left child node $x_{Medium}$ of the root, generate the polynomial $q_{Medium}$ with $q_{Medium}(0) = s_{Medium} = q_{High}(\mathrm{index}(x_{Medium}))$. For the left child node $x_{Low}$ of $x_{Medium}$, generate the polynomial $q_{Low}$ with $q_{Low}(0) = s_{Low} = q_{Medium}(\mathrm{index}(x_{Low}))$. For every other node $x$, select the polynomial $q_x$ such that $q_x(0) = q_{parent(x)}(\mathrm{index}(x))$, where $parent(x)$ is the parent node of $x$, and randomly select $d_x$ further points to define $q_x$ completely.
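The recursion above is the usual threshold secret sharing over a tree, and can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it works in a prime field modulo the Mersenne prime $2^{127} - 1$ instead of the pairing group order, children are indexed 1, 2, ... from left to right, and the `Node` class, function names and attribute labels are hypothetical.

```python
import secrets

P = 2**127 - 1  # toy prime field modulus (assumption; stands in for the group order)


class Node:
    def __init__(self, threshold, children=None, attribute=None):
        self.threshold = threshold      # k_x
        self.children = children or []  # the i-th child has index(x) = i (1-based)
        self.attribute = attribute      # set on leaf nodes only
        self.share = None               # q_x(0)


def eval_poly(coeffs, x):
    """Evaluate the polynomial with coefficients [a0, a1, ...] at x mod P (Horner)."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % P
    return result


def distribute(node, secret):
    """Set q_x(0) = secret and hand each child the value q_x(index(child))."""
    node.share = secret
    degree = node.threshold - 1  # d_x = k_x - 1
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(degree)]
    for i, child in enumerate(node.children, start=1):
        distribute(child, eval_poly(coeffs, i))


def interpolate_at_zero(points):
    """Recover q(0) by Lagrange interpolation from (index, value) pairs."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


# Example: a 2-of-3 root whose first child is itself a 1-of-2 gate.
inner = Node(1, [Node(1, attribute="doctor"), Node(1, attribute="nurse")])
root = Node(2, [inner, Node(1, attribute="cardiology"), Node(1, attribute="trusted")])
s_root = secrets.randbelow(P)
distribute(root, s_root)
# Any two children of the root are enough to recover the root secret.
points = [(2, root.children[1].share), (3, root.children[2].share)]
assert interpolate_at_zero(points) == s_root
```

Decryption runs the same interpolation bottom-up: a user whose attributes satisfy at least $k_x$ children of a node can recover that node's value, and so on up to the root.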
AES symmetric encryption: randomly select
To reduce the storage overhead of the blockchain, the ciphertext C is stored in IPFS, and then IPFS returns the ciphertext download link u.
To realize policy hiding embed the access structure δ vectorized representation of
Compute the ciphertext: let the set of all leaf nodes in the access structure δ be Y; then the ciphertexts of the plaintexts $M_{High}$, $M_{Medium}$, $M_{Low}$ under the access structure δ are:
3.
The storage transaction $Tx_{sto}$ contains the identifier S, the ciphertext address u, the symmetric key m, the integrity check code checkCode, and the signature sign. The checkCode is the hash value of the ciphertext C and is used to verify integrity. The message digest MD is computed as H(S, u, m, checkCode), and sign is the digital signature of MD under the DO private key $BSK_{DO}$. Any DU can check the integrity of the ciphertext with checkCode and verify the origin of the transaction with sign.
After $Tx_{sto}$ is generated, it is broadcast to the other nodes of the MSC, which verify the validity of the transaction through signature verification and verify the ciphertext and symmetric decryption key through checkCode. The details are shown in Algorithm 1.
The validity of $Tx_{sto}$ is verified by comparing the recomputed message digest MD′ with the MD obtained by decrypting the signature with $BPK_{DO}$. If MD′ = MD, the transaction is considered valid. DO sends only the ciphertext address u and the symmetric key m to the MSC to save storage space. To ensure the integrity of the ciphertext, checkCode′ is checked for equality with the checkCode in $Tx_{sto}$. After verification, the transaction is packed into a block for PBFT consensus.
To obtain the key from DO and ensure the privacy of the attributes, DU generates
Where T represents the time of Proof generation, $BPK_{DU}$ denotes the public key registered by DU in AA, u denotes the ciphertext storage location to be accessed by DU,
1: MD′ = H (S, u, m, checkCode)//Calculate the message digest of the transaction
2: MD = Compute BPK DO (sign)//Signature verification with BPK DO public key
3:
4: Obtain the CT according to the u i
5: checkCode′ = H (CT)// Calculate the message digest of the CT
6:
7:
8: end
9: end
10:
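The two checks described for $Tx_{sto}$ (signature, then ciphertext integrity) can be sketched as below. This is an illustration under assumptions rather than the paper's Algorithm 1: SHA-256 stands in for H, a toy textbook-RSA key built from two Mersenne primes stands in for the DO key pair ($BSK_{DO}$, $BPK_{DO}$), and `fetch_ciphertext` is a hypothetical callback that returns the ciphertext stored at address u.

```python
import hashlib
from math import gcd

# Toy RSA signing key from two Mersenne primes (illustration only, not secure).
_P, _Q, _E = 2**61 - 1, 2**89 - 1, 65537
_N = _P * _Q
_PHI = (_P - 1) * (_Q - 1)
assert gcd(_E, _PHI) == 1
_D = pow(_E, -1, _PHI)  # private exponent, plays the role of BSK_DO


def digest(*fields: bytes) -> int:
    """MD = H(S, u, m, checkCode): SHA-256 over length-prefixed fields, reduced mod N."""
    h = hashlib.sha256()
    for f in fields:
        h.update(len(f).to_bytes(4, "big") + f)
    return int.from_bytes(h.digest(), "big") % _N


def make_storage_tx(S: bytes, u: bytes, m: bytes, ciphertext: bytes) -> dict:
    """Build Tx_sto = (S, u, m, checkCode, sign) on the DO side."""
    check_code = hashlib.sha256(ciphertext).digest()
    md = digest(S, u, m, check_code)
    return {"S": S, "u": u, "m": m, "checkCode": check_code,
            "sign": pow(md, _D, _N)}


def verify_storage_tx(tx: dict, fetch_ciphertext) -> bool:
    """Check the signature first, then the integrity of the stored ciphertext."""
    md_prime = digest(tx["S"], tx["u"], tx["m"], tx["checkCode"])
    md = pow(tx["sign"], _E, _N)  # recover MD with the public exponent (BPK_DO)
    if md_prime != md:
        return False
    ct = fetch_ciphertext(tx["u"])  # obtain CT from off-chain storage
    return hashlib.sha256(ct).digest() == tx["checkCode"]


# Usage: an in-memory dict plays the role of off-chain storage.
storage = {b"addr-1": b"encrypted medical record"}
tx = make_storage_tx(b"S1", b"addr-1", b"wrapped-key", storage[b"addr-1"])
assert verify_storage_tx(tx, storage.__getitem__)
```

A node that passes both checks would then forward the transaction to consensus; a failure of either check rejects it.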
After verifying the validity of the Proof′s signature, an access transaction can be generated for a DU that satisfies the hidden access policy defined by the DO. In addition, to ensure that any DU with a valid signature proof does have access rights, it broadcasts the proof to other nodes in the MSC and triggers each smart contract to verify whether the DU has access rights. The verification process and its smart contract setup are described in the next section.
After that, DO generates an access transaction for a valid Proof. DO calculates the message digest of the transaction, $MD = H(A, BPK_{DU}, u, T_{stemp})$, using the current time $T_{stemp}$, and signs it as $sign = Sign_{BSK_{DO}}(MD)$; the smart contract then verifies the access and generates the access transaction.
Where A is used to identify the access transaction, $T_{stemp}$ denotes the time when $Tx_{acc}$ was generated, and sign is used to prove that $Tx_{acc}$ was indeed sent by the DO. Note that the publisher of $Tx_{acc}$ must be the owner of the data stored at the address u recorded in $Tx_{acc}$.
4.
DO chooses a random number $x \in Z_N$, computes $g^{x}$, and sends $(S, K = g^{x})$ to the attribute authorization center.
The attribute authorization center selects a random $r \in Z_N$, computes $D' = K^{(\beta+r)/\gamma} = g^{x(\beta+r)/\gamma}$, and sends $D'$ to the user. For each attribute $j \in S$, it picks a random $r_j \in Z_N$. It then computes the key $SK_1$ and returns $SK_1$ to the user.
The user calculates $D = (D')^{1/x} = g^{(\beta+r)/\gamma}$:
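The blinding and unblinding steps can be verified with ordinary modular arithmetic. The sketch below is an assumption-laden illustration: it replaces the pairing group with the order-1019 subgroup of $Z_{2039}^{*}$ generated by 4, computes exponents modulo that order, and uses hypothetical function names. It shows only that $(K^{(\beta+r)/\gamma})^{1/x} = g^{(\beta+r)/\gamma}$, not the full key generation.

```python
import secrets

P = 2039  # safe prime, P = 2 * Q + 1 (toy size)
Q = 1019  # prime order of the subgroup generated by G
G = 4     # a quadratic residue mod P, hence an element of order Q


def rand_exp() -> int:
    """Random non-zero exponent in Z_Q (every such value is invertible mod Q)."""
    return 1 + secrets.randbelow(Q - 1)


def blind(x: int) -> int:
    """Requester side: K = g^x."""
    return pow(G, x, P)


def authority_issue(K: int, beta: int, gamma: int, r: int) -> int:
    """Authority side: D' = K^((beta + r) / gamma), division taken in Z_Q."""
    exponent = (beta + r) * pow(gamma, -1, Q) % Q
    return pow(K, exponent, P)


def unblind(D_prime: int, x: int) -> int:
    """Requester side: D = (D')^(1/x)."""
    return pow(D_prime, pow(x, -1, Q), P)


beta, gamma = rand_exp(), rand_exp()  # master secrets held by the authority
x, r = rand_exp(), rand_exp()         # blinding factor and per-user randomness
D = unblind(authority_issue(blind(x), beta, gamma, r), x)
# The unblinded value equals g^((beta + r) / gamma), independent of x.
assert D == pow(G, (beta + r) * pow(gamma, -1, Q) % Q, P)
```

Because the authority only ever sees $K = g^{x}$, it computes on a blinded value and the final $D$ is known only to the requester.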
Recursive computation of access control tree: same as
Compute plaintext:
Compute highly-sensitive plaintexts:
Similarly, compute medium-sensitive plaintexts:
Compute low-sensitive plaintexts:
As shown in Fig. 5, the traditional single-chain blockchain model faces problems such as insufficient storage space, long transaction confirmation times, and reduced efficiency. For this reason, this paper proposes the master-sidechain blockchain model, which consists of one Ethereum main chain and three Ethereum sidechains to enhance storage space and performance. The main chain stores only basic, non-private information about the sidechains, such as location and name. The sidechains, in turn, store the ciphertext addresses and symmetric keys for data of different sensitivity. P-TL contracts are deployed on the three sidechains. When a main chain node broadcasts the Proof of a DU to a sidechain, the Proof is automatically used as an input to the contract, and the contract rules are enforced to verify the DU's access rights.
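The division of responsibilities between the main chain and the three sidechains can be pictured with a small in-memory model. Everything here is a hypothetical illustration: the class, the `sc://` location strings and the method names are assumptions, and plain dictionaries stand in for on-chain state.

```python
class MasterSideChain:
    """Toy model: the main chain indexes side chains; side chains hold (u, m) pairs."""

    LEVELS = ("high", "medium", "low")

    def __init__(self):
        # Main chain: only non-private metadata about each side chain.
        self.main_chain = {
            level: {"name": f"sidechain-{level}", "location": f"sc://{level}"}
            for level in self.LEVELS
        }
        # One side chain per sensitivity level: ciphertext address u -> symmetric key m.
        self.side_chains = {level: {} for level in self.LEVELS}

    def locate(self, level: str) -> dict:
        """Look up on the main chain which side chain serves a sensitivity level."""
        return self.main_chain[level]

    def store(self, level: str, u: str, m: bytes) -> None:
        """Record a ciphertext address and its key on the matching side chain."""
        self.side_chains[level][u] = m

    def fetch_key(self, level: str, u: str):
        """Return the key stored for u on that side chain, or None if absent."""
        return self.side_chains[level].get(u)


msc = MasterSideChain()
msc.store("high", "addr-example", b"wrapped-aes-key")
assert msc.locate("high")["name"] == "sidechain-high"
assert msc.fetch_key("high", "addr-example") == b"wrapped-aes-key"
assert msc.fetch_key("low", "addr-example") is None
```

In this picture a lookup touches the main chain only for routing, which is why no addresses or keys need to be stored there.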

Master-side chain interaction.

P-TL smart contract composition.
The P-TL contract consists of PVC and real-time monitoring. PVC limits the policy lifecycle and ensures that keys are obtained only through a Proof during the permitted time, while real-time monitoring blocks malicious behavior. PVC restricts visitors by categorizing the access time and maintains the transaction logs; real-time monitoring maintains both the logs and the violation list. Figure 5 shows the P-TL structure.
Example of ciphertext key-related information
Example of a log list
1:
2:
3: R (x i ) = m i (mod n)// The RSA cryptosystem encrypts each element
4: end
5:
6:
7:
8: result = R (u1) • R (x1) + R (u2) • R (x2) + ⋯ + R (u n ) • R (x n )// Multiply the encrypted vectors
9:
10: T, u, m ← getinfo (Proof )// Get partial elements in Proof
11:
12:
13:
14:
15: end
16:
17:
18:
19: end
20: end
21: end
22:
1: APR ← makeRecord (Proof)// Generate access logs APR
2: APR = T ∥ BPK DU ∥ u ∥ m //" ∥" denotes the bit connective,A ∥ B denotes AB
3: write APR into Log list table
4:
5: T ← getinfo (Proof)// Get Proof generation time
6:
7: Determine the sensitivity of the data
8:
9: mark Proof∗//∗ Representing a signal of violation
10: VIPR ← makeVIRecode (Proof)// Generate a record of violations VIPR
11: VIPR = T ∥ BPK DU ∥ u
12: write VIPR into Violation list table
13:
14:
15:
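The logging behaviour described for the P-TL contract (every Proof is logged as APR = T ∥ BPK_DU ∥ u ∥ m, and a violating Proof is additionally recorded as VIPR = T ∥ BPK_DU ∥ u) can be sketched as follows. The violation rule is an assumption, since the conditions of the original listing are not available: here a Proof is treated as violating when it is older than a validity window that depends on data sensitivity, and the window lengths, the `sensitivity` field and the class names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical validity windows in seconds per sensitivity level (assumption).
MAX_AGE = {"high": 60, "medium": 300, "low": 900}


@dataclass
class Proof:
    T: int             # Proof generation time (Unix seconds)
    bpk_du: bytes      # public key registered by the DU
    u: bytes           # ciphertext storage location
    m: bytes           # key-related element carried in the Proof
    sensitivity: str   # "high", "medium" or "low"


class PTLMonitor:
    def __init__(self):
        self.log_list = []        # access records (APR)
        self.violation_list = []  # violation records (VIPR)

    def process(self, proof: Proof, now: int) -> bool:
        """Log the access, then flag the Proof if it falls outside its window."""
        t = str(proof.T).encode()
        self.log_list.append(t + proof.bpk_du + proof.u + proof.m)   # APR = T || BPK_DU || u || m
        age = now - proof.T
        if age < 0 or age > MAX_AGE[proof.sensitivity]:
            self.violation_list.append(t + proof.bpk_du + proof.u)  # VIPR = T || BPK_DU || u
            return False
        return True


monitor = PTLMonitor()
fresh = Proof(T=1000, bpk_du=b"pk", u=b"addr", m=b"key", sensitivity="high")
assert monitor.process(fresh, now=1030) is True    # within the 60 s window
assert monitor.process(fresh, now=2000) is False   # expired, recorded as a violation
assert len(monitor.log_list) == 2 and len(monitor.violation_list) == 1
```

Keeping the key element m out of the violation record mirrors the text, where VIPR omits m while APR includes it.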
This section analyzes the security of our system and the security of the MSC-CP-ABE algorithm.
System security
The system proposed in this paper ensures that healthcare data is secure at every stage of the lifecycle,
1.
2.
3.
4.
In conclusion, this paper has verified the security of the system and recognizes that data security can be ensured at all stages including
MSC-CP-ABE algorithm security
Bethencourt et al. [13] ensured the security of CP-ABE under the DBDH assumption. The security of the AES symmetric encryption standard adopted by the US federal government is widely recognized. In this paper, the CP-ABE-based MSC-CP-ABE algorithm incorporates AES to provide layered encryption for data of different sensitivities, with security no weaker than that of CP-ABE and AES; the proof is given below.
1.
2.
3.
4.
The MSC-CP-ABE algorithm is at least as secure as CP-ABE in all four steps. Because it encrypts and decrypts plaintext with AES symmetric keys, it is also no less secure than AES. The vectorized access policy further enhances the security of MSC-CP-ABE.
Performance analysis and simulation experiment analysis
In this section, we will analyze the model functionality, test and analyze the sidechain interaction performance, and compare the performance of the proposed algorithms and systems.
Functional analysis
Table 5 compares this paper's scheme with the schemes in the literature [13, 41] in terms of hierarchical access control. The schemes in [13, 40] support only a single policy, which increases computation and storage costs; the scheme in [35] does not realize flexible hierarchical access control; and the scheme in [41] lacks policy hiding, which is a security risk. In contrast, this paper fuses blockchain with access proofs while providing hierarchical access control and policy hiding.
Function Comparison
In this paper, the proposed MSC-CP-ABE access control algorithm is compared with the literature [13, 41]. Literature [13] provides the basic CP-ABE algorithm, literature [35] proposes the OHP-CP-ABE optimization scheme, while literature [41] introduces the CP-ABE encryption scheme with multiple authorization centers. Table 6 shows the results of comparing the ciphertext lengths for high, medium and low sensitivity data.
Comparison of storage overhead
Literature [13] and [35] do not support multi-level access control, so ciphertexts of different sensitivities must be stored separately, which increases the total ciphertext length; in [41], the multiple attribute authorization centers make the system public key and user key larger. In contrast, this paper adopts a single attribute authorization center, which keeps the public key and user key lengths stable, and achieves the shortest ciphertext, system public key, and user key.
The experimental data come from cooperating institutions, namely a Grade III Class A hospital in Kunming and a county people's hospital in Yunnan Province, and contain a total of 1,200 GB of text, image, and imaging data, involving 1,360 data tables and 2,139,373 records. Doctor access logs are used in the experiment to simulate doctor access behavior.
The experiments were performed using the Java implementation of the JPBC library on a machine running the Ubuntu Server 20.04.5 LTS operating system with an Intel(R) Core(TM) i5-10400F CPU and 16 GB of RAM. All concurrency tests were performed on Hyperledger Fabric 2.3.1, a consortium blockchain with a PBFT consensus mechanism. This paper analyzes the master-sidechain performance, examines the key generation and encryption/decryption times of different attribute-based algorithms, and evaluates them with comparative experiments.
Master-side chain performance testing and analysis
This paper uses JMeter to test the performance of Inquire and Transaction transactions, as well as the sidechain performance of the healthcare data sharing system. Inquire is used for data querying, while Transaction is used for data transactions. A Transaction requires more nodes to participate in transaction ordering and packing, so at the same level of concurrency its latency is higher and its throughput is lower than those of Inquire.
As shown in Figs. 6 and 7, the Inquire and Transaction transactions were tested in concurrency experiments with 100-500 and 50-250 runs, respectively. The Inquire transaction had a maximum average latency of 2.9 s and a maximum throughput of 215 transactions per second; the Transaction transaction had a maximum average latency of 8.1 s and a maximum throughput of 120 transactions per second. As shown in Fig. 8, the sidechain meets acceptable standards in concurrency experiments (50-450 runs), with a maximum average latency of 0.8 s and a maximum throughput of 276 transactions per second.

Latency and throughput of Inquire transactions

Latency and throughput of Transaction transactions.

Transaction latency and throughput of side chains.
The effect of the number of attributes on key generation and encryption/decryption time is analyzed in comparison with the schemes in [13] and [41]. The numbers of attributes included in the selected access policies are 2, 4, 6, 8, 10, 12, 14, and 16. The experiments show that the key generation and encryption/decryption times grow as the number of user attributes increases, as shown in Fig. 9. The key generation time of this paper's scheme lies between those of the two schemes from the literature, and its decryption time is the shortest for a large number of attributes. Since data are often encrypted once and decrypted many times, this makes the proposed algorithm preferable.

Effect of number of attributes on key generation time.

Effect of number of attributes on encryption and decryption time.
To solve the problem of medical data sharing, this paper proposes AC-CSS, an access control model based on hierarchical attribute encryption on a master-sidechain, and MSC-CP-ABE, which combines the CP-ABE and AES algorithms to realize multilevel access control. Fine-grained access control and policy hiding are realized through ciphertext policy hiding and sensitivity classification. RSA homomorphic encryption is used to protect user attribute privacy, and the P-TL smart contract verifies the Proof on the sidechains. Experiments verify the feasibility and security of the model. Nevertheless, problems remain, such as long key generation time, difficulty in identifying requests under high concurrency, and the single point of failure that may arise when a third party generates the master key; these will be the focus of future research.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 72471206, 71972165, 61763048, 72164037), Key Projects of Basic Research for Science and Technology Foundation of Yunnan Province (No. 202001AS070031), the Central Government’s Special Program for Guiding Local Science and Technology Development (No. 202307AB110009), Science and Technology Foundation of Yunnan Province Education Department (No. 2023J0657).
