Abstract
The distribution of personal health records (PHRs) via a cloud server is a promising platform as it reduces the cost of data maintenance. Nevertheless, the cloud server is semi-trusted and can expose the patients’ PHRs to unauthorized third parties for financial gains or compromise the query result. Therefore, ensuring the integrity of the query results and privacy of PHRs as well as realizing fine-grained access control are critical key issues when PHRs are shared via cloud computing. Hence, we propose new personal health records sharing scheme with verifiable data integrity based on B+ tree data structure and attribute-based signcryption scheme to achieve data privacy, query result integrity, unforgeability, blind keyword search, and fine-grained access control.
Keywords
Introduction
With the recent advancement in computing technology, patients are increasingly getting personal health records (PHRs) via wearable body sensors, diagnosis of diseases, treatments, etc. PHRs are the collection of health information that helps a patient to monitor and track his or her health status [44]. PHRs may consist of information about allergies, immunizations, illnesses, surgeries, body vital signs such as glucose level, heartbeat rhythm, and results of physical exams and screenings. The PHRs can save the patient’s time and money by avoiding unnecessary repetition of medical tests and providing an up-to-date, accurate, and full medical history for efficient patient diagnosis and treatment. The PHRs are also valuable resources that can positively transform the health sector, but if they are not shared with health research organizations, the full positive impact of them cannot be realized. Shared PHRs can boost the health sector through their usage in training machine learning algorithms, early detection of disease outbreaks, disease prevention, etc.

Health data sharing scenario.
For example, even a fairly healthy person (patient) may require the services of a primary care physician for overall health, an optometrist for eyeglasses, a cardiologist for a minor heart murmur, a podiatrist for a fractured arm condition, and a dentist for dental care. All other specialists require basic information collected by the primary care physician, such as vital signs. The podiatrist and dentist would also like to know about the medical prescription of the cardiologist since heart diseases could cause arm or dental health issues. So, patients who share their personal health records (PHRs) with their physicians can greatly benefit in terms of disease diagnosis and medication prescriptions. As shown in Fig. 1, a patient (data owner Ensuring that only authorized users have access to sensitive PHR data. Providing fine-grained access control to the PHRs data among multi-users with a different scope of access privileges. Preventing the forgery of PHRs data saved on the cloud server. Performing a secure search on the encrypted PHRs data without leaking search keywords. Performing remote verifiable PHRs database updates. Guaranteeing the integrity of the query result (i.e., completeness, freshness, and correctness).
The leading cause of the issues mentioned above is that the data owner loses control of his/her data stored on the cloud. So, if a PHR falls into the wrong hands, it can be misused, falsified, and publicly disclosed, thereby putting the data owner’s identity at risk. A study in [36] shows high exploitation of PHRs for many reasons, including identity thefts and financial data crimes. Therefore, ensuring a secure querying on encrypted data, integrity of the query results and privacy of PHRs as well as realizing fine-grained access control are critical key issues when PHRs are shared via cloud computing.
Recent research works [23,39,43,49] on cloud-based PHRs sharing focus on fine-grained access control, data searchability, authenticity, and privacy; but do not consider query results integrity and remote verifiable database update. In practice, it is prudent to design a scheme that simultaneously tackles the issues to . Nevertheless, to the best of the authors’ knowledge, there are no PHRs schemes that concurrently address all the issues outlined above. Consequently, we proposed a new PHRs sharing scheme based on attribute-based signcryption and B+ tree to simultaneously addresses all the issues listed above. Our proposed scheme provides unforgeability of ciphertext, fine-grained access control, the integrity of query results, encrypted data retrieval, and verifiable remote database modification. In summary, our contributions are:
We construct an attribute-based signcryption scheme to facilitate the secure exchange of PHRs through a cloud server. Our scheme ensures fine-grained access control and public verifiability of the integrity of the PHRs.
We adopt an encrypted B+ tree with embedded Merkel-like digest in each node of the B+ tree to efficiently retrieve data from the cloud server, perform remote verifiable PHRs database update and verify the query result integrity respective to the correctness, freshness, and completeness.
We design a search mechanism to enable blind search on the B+ tree without revealing the content of the search key and the query key (keyword). The proposed search mechanism supports exact query and range query on the B+ tree.
We also compared our proposed construction with the existing ABSC schemes concerning storage and computation cost. The results from our analysis demonstrate that our scheme has better performance than others.
Section 2 provides a review of related works on PHRs data security, searching on encrypted data, and verification of the integrity of query the result. The important concepts and theories applied in our work are presented in Section 3. In Section 4, the detailed design of the proposed scheme is presented. Section 5 presents the security proof of the ABSC scheme. Section 6 shows the performance analysis of the proposed PHRs sharing scheme. Finally, Section 7 presents the conclusion of our work.
Related works
This section briefly reviews the relevant research on PHRs data security, searching on encrypted data, and verifying the integrity of query results.
PHRs data security
To prevent unauthorized access and exposure of PHR data, the PHRs must first be encrypted offline before being outsourced to the cloud. While traditional encryption schemes [11,16,27,40] or identity based encryption (IBE) schemes [7–9,25] can ensure confidentiality of PHRs, they are limited to “one-to-one” mapping, which implies that only a single user can access the data. The flexibility and efficiency are limited. Additionally, these schemes are not desirable in the context of sharing PHRs as they require the
Since the introduction of the concept of ABE by Sahai and Waters [37], many ABE schemes [17,29,46,51] have been designed for cloud-based sharing of data. ABE scheme comes with two flavors: key-policy ABE (KP-ABE) and ciphertext-policy ABE (CP-ABE). In KP-ABE, access policies are associated with private keys, and ciphertexts are associated with attribute sets. A CP-ABE is a dual variant of KP-ABE for which an attribute set is assigned to a private key while an access policy is associated with a ciphertext. However, the ABE schemes only ensure data confidentiality and cannot prevent the forgery of a ciphertext.
Digital signature and encryption can provide data authenticity and confidentiality, respectively. To obtain the advantages of using these two mechanisms at the same time, the traditional method is “encrypt-then-sign” or “sign-then-encrypt”. To minimize the total computation overhead and communication cost associated with these two approaches, Zheng [57] presented a new cryptographic primitive called signcryption. Signcryption provides a mechanism to simultaneously sign and encrypt a message in parallel with a minimal computation overhead and communication cost. For instance, Zheng [57] applied signcryption approach on “Schnorr signature + ElGamal encryption” to saves two modular exponentiations, which yields a significant reduction in computation cost. Subsequently, Gagńe et al. [13] presented the first attribute-based signcryption (ABSC). The ABSC scheme simultaneously combines attribute-based encryption and attribute-based signature schemes, providing many useful security properties such as unforgeability, confidentiality, signer’s privacy, and fine-grained access control of the encrypted message. It also has less computational overhead and storage cost when compared with the traditional encryption-after-sign scheme [54].
Since the creation of ABSC, several sharing schemes based on cloud computing have been proposed. Liu et al. [21] proposed a ciphertext-policy ABSC (CP-ABSC) scheme to secure medical data and claimed that their scheme guarantees data confidentiality, authenticity, unforgeability, and anonymity. Unfortunately, Rao [33] demonstrated that [21] does not attain data confidentiality and public verification of the authenticity of a ciphertext. Au et al. [3] presented a general scheme for secure sharing of PHR, and the scheme ensures that a doctor can provide their patients’ health records to specialists for research purposes. Liu et al. [22] proposed a secure online/offline attribute-based signature scheme for sharing Mobile Health Records. Though many attribute-based signcryption schemes [23,34,35,48] have been proposed, they have high computation and communication cost. Particularly, ciphertext size and computation cost associated with message unsigncryption depend linearly on the number of attributes under which a message was signcrypted. This will limit the application of CP-ABSC in cloud computing if the number of attributes is too large and the length of ciphertext is too long. Hence, we construct a new CP-ABSC scheme with a constant ciphertext size and a constant value of bilinear pairing operation that reduces the computation cost and communication overhead.
Searching on encrypted data
Encryption is an efficient technique for securing sensitive data. But, the performance of querying on a database degrades when the data is stored as ciphertext. Searchable encryption is a technique that enables a user to publish a symmetrically encrypted dataset to an untrusted server and later perform a keyword search on the encrypted dataset without exposing the content of the encrypted dataset and search information. Song et al. [41] presented the first searchable symmetric encryption scheme that enables the
The searchable encryption scheme, which supports range queries with operators like “=, <, >, ⩽, ⩾, ≠” is powerful and allows the result to be filtered precisely according to the condition of a user’s query keyword, e.g. Select * from record R where Age > 18. This work develops a searchable mechanism based on a B+ tree to replicate both exact and range queries search capabilities of plaintexts on encrypted messages. The mechanism enables blind search on the encrypted B+ tree without revealing the content of the B+ tree and the query keywords to the
Verification of integrity of the query result
Another critical issue is how to verify the integrity (i.e. completeness, freshness, and correctness) of the query result returned by the
To address the query results’ integrity issue, several verifiable query processing techniques have been proposed (e.g., [19,31]). There are mainly two approaches to validating query results’ integrity: authenticated data structure (ADS) and circuit-based verifiable computation (VC). The VC-based approach (e.g., SNARKs [32]) can support arbitrary computing but has very high and often impracticable overhead costs. Also, it requires a costly pre-processing step as both the proving key and the verification key have to be hardcoded into the database software. To tackle this problem, the authors in [5] proposed another variant of SNARKs in which the pre-processing step is based on only the upper-bound size of the database and query program. Zhang et al. [56] presented a vSQL system that allows verifiable SQL queries via an interactive protocol. Nevertheless, it is restricted to relational databases with a fixed schema. In contrast, the ADS-based solution is generally more efficient as it tailors to a specific query. Therefore, our proposed solution is based on this approach. Digital signature and MHT are two types of structures widely used as ADS. Digital signatures authenticate a digital message’s content based on asymmetric cryptography. To allow verifiable queries, all data records must be signed and hence unable to scale up to large datasets [31]. On the other hand, MHT is constructed on a hierarchical tree [26]. A hash digest of a data record is assigned to each entry in a leaf node and a digest from the children’s nodes is also assigned to each entry in the internal node. To validate any data subset, the
Preliminaries
(Lagrange Interpolation).
Let S be a set with size
(Predicate).
Let U denotes a universal attribute. A predicate (over U) is a monotone boolean function whose inputs are associated with an attribute set
Bilinear maps
Let Bilinearity: For all Non-degeneracy: Computability: There is an efficient algorithm to compute
Note that the map e is symmetric since
Hardness assumption
The security of our construction is based on the q-Diffie-Hellman Exponentiation (q-DHE) assumption [14] and Decisional q-Diffie-Hellman Bilinear Exponentiation (q-DBDHE) assumption [7]. We say the Let As presented in [57], CP-ABSC consists of four basic polynomial time algorithms wherein We say that CP-ABSC construction is correct if for all messages (q-DHE Assumption).
(q-DBDHE Assumption).
(Ciphertext policy attribute based signcryption (CP-ABSC)).

The security definitions of message confidentiality and ciphertext unforgeability for CP-ABSC are defined as follows.
Message confidentiality
The security notion of message confidentiality is defined on indistinguishability of ciphertexts under adaptive chosen ciphertext attack in the selective encryption predicate model (IND-ABSC-sCCA) through an interactive game between an adversary
sExtract oracle
dExtract oracle
Signcrypt oracle
Unsigncrypt oracle
A CP-ABSC scheme is said to be
This security notion is defined on existential unforgeability under adaptive chosen message attack in the selective predicate model (EUF-ABSC-sCMA), through following game between a challenger
sExtract oracle
dExtract oracle
Signcrypt oracle
Unsigncrypt oracle
The advantage of A CP-ABSC scheme is said to be
The B+ tree is a tree-like index structure in which every node is a disk block. The B+ tree remains balance: all paths from the root to the leaf node have the same depth [55]. Every node in B+ tree has between

System architecture.
In this part, the system architecture of our scheme, the function of each entity in the scheme, design objectives, and the construction of a cloud-based PHRs sharing scheme are presented.
The scheme entities
To share PHRs data among untrusted parties via the cloud, we design an access control scheme. The system architecture is shown in Fig. 2.
Private key generator
The private key generator generates a secret key and public keys for the system users based on their attributes.
Cloud server
Data generated by the patients are indexed, encrypted, and outsourced to the
Data owner
The patient (Data owner) usually encrypts PHRs with other details and uploads them to the
Data users
These are the individuals who have been authorized by the PKG to access the patient’s medical data from the cloud for usage.
Design objectives
The attribute based signcryption scheme CP-ABSC [21,30,54] provides effective mechanism to signcrypt (encrypt-then-sign) and unsigncrypt (decrypt-then verify). Thus, it addresses the problems of privacy leak, forgery and ensures fine grained access control and public verification of the authenticity of the plaintext. The application of the CP-ABSC scheme introduces another set of challenges. Firstly, there can be a large set of PHRs dataset such as x-ray scan which has to be encrypted and outsource to the cloud server. But all asymmetric encryption schemes have high computation overhead which includes our signcryption scheme. To tackle this problem, the encryptor can use symmetric encryption such as AES [27] to encrypt the PHRs data and then encrypts the secret key with the CP-ABSC scheme.
Secondly, searching on the encrypted data in the
In addition, the
Finally, to modify (delete and update operations) the outsourced database on the cloud server, the
Construction of cloud-based PHRs sharing scheme
To solve the problem of ciphertext size that scales linearly with the number of times an attribute is used in general CP-ABSC and to minimize computation cost we propose a CP-ABSC scheme that supports a flexible threshold predicate for signing and encrypting messages. As shown in Fig. 3, the proposed scheme is categorized into four main phases. The first phase is registration which consists of step to step . The second phase is data outsourcing and consists of step to step . The third phase is data obtainment, consisting of step to step . The last phase is provable database modification and it consists of step to step .

The process flow of the system architecture.
To have access to PHR’s system, users must go through the steps of registering their credential with the PKG. To initialize the system, the PKG runs the

Generation of signature key. When a new data owner

Generation of decryption key. To have access to PHR’s system, data users

To simultaneously achieve unforgeability of ciphertext, fine-grained access control, the integrity of query results, encrypted data retrieval, and provable remote database modification, the data owner will perform the following steps before uploading PHRs to the cloud server.
Step 1 (Encryption of PHR data). The complexity of the CP-ABSC algorithm makes it not suitable to encrypt large PHR data. Therefore, the

Example of encrypted B+ index structure.
Step 2 (Construction of authenticated data structure (ADS)). To reduce the number of disk access operations when processing a query over encrypted data, the

Example of authenticated B+ data structure.
Step 3 (Construction of search mechanism). To preserve the privacy of the query keyword while providing the ability of the

Storing of comparison matrix value in hash table

Comparison matrix privacy operation
Step 4 (Creation of manifest file). The
Finally, the

PHRs data request. First, a

PHRs data retrieval. When the

Query over encrypted B+ tree [exact-match queries]

Query over encrypted B+ tree [range queries]
Data integrity verification with usage. Since the

Verification of the query results
Data insertion. Let’s suppose that the

Data insertion
Data deletion. Let’s suppose that the
To guarantee that the

Data delete
Confidentiality
Our CP-ABSC scheme is
Assuming that there is an adversary
Sample
Set
For each
For each attribute
The challenger
sExtract oracle For every attribute dExtract oracle For each For each
The most crucial part is to simulate the Signcrypt oracle Unsigncrypt oracle If
Assuming that the signing attribute universe
Suppose there is an adversary
Due to page restrictions, the details of the security proof are omitted. Suffice to say that, unforgeability results for our scheme can easily be proved using the techniques of [25,34].
In this section we evaluate the efficiency of the proposed schemes in terms of communication overhead, computation cost, memory requirement, and make a comparison with other schemes.
Complexity analysis
Table 1 provides a complexity analysis of operations in our proposed with the relevant schemes [21,23,33] on PHRs sharing. As shown in the table, the size of the ciphertext of the schemes [21,33] and [23] grows linear with the number of attributes used in signcryption of a message and so is the number of pairing computations, which makes the scheme less practical. The ciphertext in [21,33] and [23] consists of
As the pairing computation is regarded as the most expensive operation, the practicality of an ABSC construction becomes a concern if the number of pairing operations required to unsigncrypt a ciphertext is high. However, our construction requires 4 pairings to recover the plaintext message and only 3 number of pairings for ciphertext verification. To sum up, 7 constant number of bilinear pairings are needed to unsigncrypt ciphertext which is not dependent on the length of decryption and signing attributes. Hence our unsigncryption algorithm is efficient from a computation point of view. On the contrary, the number of pairings in [21,23,33] are linear to the sum of underlying decryption and signing attributes (see Table 1 for comparison).
The comparison of computational complexity
The comparison of computational complexity
Legends: We omit the message size from the ciphertext size in all the schemes listed in the table.
In this sub-section, we will focus on the simulation and performance analysis of our PHRs cloud based sharing scheme. All the simulations are executed on Intel i7 personal laptop with a 2.2 GHz CPU and 8 GB RAM running on macOS High Sierra 10.13.6. We used SS512 elliptic curve with a 512-bit base field in the python cryptographic library known as charm-crypto 0.43 [2]. In the simulation, the message m was chosen from
Key size
In this sub-section, we analyze the relationship between the number of attributes with the public key and the private keys size. In the algorithm
Ciphertext size
As stated previously, the information stored on the cloud server comprises the encrypted B+ tree along with the encrypted PHRs, signcrypted manifest file, and the hashmap. We mainly consider the size of the ciphertext of the manifest file. For simplicity, here the manifest file is denoted as m. The length of the ciphertext is
The computation time
We evaluated the computation time of running the algorithms: Setup, sExtract (here, we omit dExtract algorithm since it has the same computational complexity as sExtact), Signcryption, and Unsigncryption. The execution times are based on the average of 100 trials. The experimental results for Setup, sExtract, Signcryption and Unsigncryption in Fig. 7a, Fig. 7b, Fig. 7c, and Fig. 7d respectively show that the execution time and the size of attribute n have a linear correlation. For attribute size

Evaluation of communication cost.

Evaluation of computation time.

Cost of computing and storing comparison matrix value in hashmap.
For the first experiment shown in Fig. 8, we consider the computation cost of constructing hashmap for the B+ tree. The size of the B+ search keys is changed from 100, 200, 300, 400. The result shows the computation time increases with the increment of the size of keys. For the next set of experiments, we use a synthetic database that contains one table with 1000 tuples of records. Each tuple contains multiple attributes, a primary key A and is 500 bytes long. For simplicity, we assume that an index is built on primary key A with page size equal to 5 kilobytes. The experiment results presented here are the average case of 100 trials. We consider the workload of 100, 200, 300, and 400 range queries generated uniformly at random over the domain of A. We consider two set of B+ tree: encrypted B+ tree with embedded digest (EB+) and normal B+ tree (NB+). Figure 9a shows query performance between the two B+ trees. The result shows that there is a reduction in query performance when the B+ tree is encrypted. This happens because the algorithm has to compute the hash value of the query keyword (key) and search key to obtain the correct query result. It also includes the cost of constructing a verification object.
There are two types of update operations: insertion and deletion. Here, we conducted the test on updating the B+ tree that involves either insertions or deletions to reveal the actions of each type of update. Ones that are generated uniformly at random and ones that exhibit a certain degree of locality. Due to lack of space, we present the results for only uniform insertions, deletions work similarly. The result shows that there is a slight difference in the performance of update on EB+ and NB+. In the next set of experiments, we measure the
In summary, our simulated experimental results demonstrate that our PHRs scheme can supports secure sharing of data via cloud computing efficiently and effectively.

Object’s computational cost and
Comparison between our proposed scheme and other related schemes
In this section, we compare our work with some existing relevant schemes on cloud based PHRs sharing. The comparison is shown in Table 2.
Data confidentiality: PHRs are very sensitive data and should not be accessible without correct decrypting keys. In our scheme, the provable database modification: The digest computed by the Fine grained access control: The signcryption is based on attributes and therefore enables the encryptor to define a predicate over attributes and encrypts the data according to the predicate. This allows flexible access to the plaintext dependent on the scope of the access privileges of the decryptor. Unforgeability of ciphertext: The proposed ABSC scheme provides proof of the origin of data via appending of signature of the data encryptor. Blind search: The proposed scheme enables the cloud server to compute on ciphertexts to retrieve the appropriate query result without the knowledge of the plaintext of the search keywords and the ciphertext. Integrity of query results: The integrity of query results (freshness, completeness, and correctness) is achieved by comparing the root digest of the B+ tree originally computed by the Exact match and range queries: Our scheme stores the encrypted PHRs data on the encrypted B+ tree which supports efficient exact-match query and range query.
Conclusion
In this paper, we have proposed a new personal health sharing scheme. The new scheme uses attribute-based signcryption and B+ data structure to achieve privacy protection, data integrity, unforgeability, blindly keyword search, and fine-grained access control. In particular, the new scheme relies on authenticated data structure based on a B+ tree to authenticate the query results. For future works, we will consider a searching mechanism that will enable query on an encrypted database without the cloud server learning the search pattern.
Footnotes
Acknowledgments
This work was partially supported by Sichuan Science and Technology Program (2018HH0102, 2019YFH0014, 2020YFH0030, 2020YFSY0061).
Correctness of unsigncryption algorithm
Suppose
When a legitimate user obtains the message
