Abstract
The transition from paper-based information to Electronic Health Records (EHRs) has driven various advancements in the modern healthcare industry. In many cases, patients need to share their EHRs with healthcare professionals. Given the sensitive and security-critical nature of EHRs, it is essential to consider the security and privacy issues of storing and sharing them. However, existing security solutions excessively encrypt the whole database, thus requiring the entire database to be decrypted for each access request, which is time-consuming. On the other hand, the use of EHRs for medical research (e.g., development of precision medicine and diagnostic techniques) and the optimisation of practices in healthcare organisations require EHRs to be analysed. To that end, EHRs should be easily accessible without compromising patients’ privacy. In this paper, we propose an efficient technique called E-Tenon that not only securely keeps all EHRs publicly accessible but also provides the desired security features. To the best of our knowledge, this is the first work in which an open database is used for protecting EHRs. The proposed E-Tenon empowers patients to securely share their EHRs under their own multi-level, fine-grained access policies. Analyses show that our system outperforms existing solutions in terms of computational complexity.
Introduction
With the rapid development of Health Information Technology (HIT) and cloud services, many healthcare organisations are accelerating the implementation of Electronic Health Record (EHR) based systems. These systems enhance their services and core competencies since EHRs can address many limitations of traditional paper-based medical records, such as scalability, accessibility, and persistence. EHRs are often shared across doctors and healthcare providers with patients’ consent. They typically include sensitive and private information such as patients’ identity codes, health histories, medical diagnoses and treatment plans. Leakage of these data can cause embarrassment or even result in life-threatening consequences for patients. Indeed, despite record levels of security spending by hospitals, there is still a wide range of malicious cyberattacks intended to penetrate databases and connected systems, because cybercriminals find EHRs highly profitable. Therefore, designing a system that preserves patient privacy robustly and efficiently is imperative.
A naive solution to the above requirements is to use Attribute-Based Encryption (ABE), which provides confidentiality and fine-grained access control. There are two general types of ABE: Ciphertext-Policy Attribute-Based Encryption (CP-ABE) [6] and Key-Policy Attribute-Based Encryption (KP-ABE) [13]. In CP-ABE, data is encrypted with a user-defined access structure, and a user with the relevant attributes can decrypt it [3]. In contrast, KP-ABE encrypts data with a set of descriptive attributes, and a user with a key embedded with an appropriate access structure can decrypt the data [3]. In this paper, we focus on CP-ABE. As an example, suppose a data owner (patient) wants to share part of their EHR with a healthcare professional who has a specific role and responsibility, according to a CP-ABE access policy defined by the patient. One doctor is assigned a specific set of attributes (e.g., {“doctor”, “temporary”, “oncology”, “top secret”}) while another doctor may be assigned different attributes (e.g., {“doctor”, “in charge”, “dentistry”, “confidential”}).
A doctor is authorised to access a patient’s EHR data if his/her attribute set satisfies the access policy for that part of the data. Readers may have noticed that naive CP-ABE does not fully support multi-level access control, meaning that data owners have to individually set different access policies for different parts of EHRs, depending on their type and sensitivity. As a result, these access policies introduce many duplicate attributes, and the number of duplicate attributes grows with the number of access policies. That is, as more access policies are registered in the system, there will be more duplicate attributes, which increases the storage and communication overheads. Worse still, given that most EHR data is sensitive, encrypting and decrypting large volumes of sensitive EHR data using naive CP-ABE can be prohibitively expensive (i.e., it suffers from linear decryption cost [29]), especially for resource-constrained devices. This raises the following questions: (1) Can we retain the benefits of CP-ABE for fine-grained access control while avoiding duplication of attributes when implementing multi-level access control in EHR systems? (2) Can we protect the confidentiality of EHR data without relying on extensive encryption (i.e., securely keep most of the EHR data open/in plaintext)?
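The attribute-matching rule and the attribute duplication described above can be illustrated with a minimal, non-cryptographic sketch. It assumes policies are simple AND/OR trees over attribute names; the names `Attr`, `Gate`, `satisfies` and `leaf_names`, and the two example policies, are hypothetical and only model the access decision, not the encryption itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple, Union


@dataclass(frozen=True)
class Attr:
    """Leaf of a policy tree: a single required attribute (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class Gate:
    """Inner node of a policy tree; op is either "AND" or "OR"."""
    op: str
    children: Tuple["Policy", ...]


Policy = Union[Attr, Gate]


def satisfies(attrs: FrozenSet[str], policy: Policy) -> bool:
    """Return True if the attribute set satisfies the policy tree."""
    if isinstance(policy, Attr):
        return policy.name in attrs
    results = (satisfies(attrs, child) for child in policy.children)
    return all(results) if policy.op == "AND" else any(results)


def leaf_names(policy: Policy) -> List[str]:
    """List every attribute occurrence in a policy (duplicates included)."""
    if isinstance(policy, Attr):
        return [policy.name]
    return [name for child in policy.children for name in leaf_names(child)]


# Two separate policies for two parts of an EHR (illustrative only).
oncology_policy = Gate("AND", (Attr("doctor"), Attr("oncology"), Attr("top secret")))
dental_policy = Gate(
    "AND",
    (Attr("doctor"), Attr("dentistry"),
     Gate("OR", (Attr("confidential"), Attr("top secret")))),
)

doctor_a = frozenset({"doctor", "temporary", "oncology", "top secret"})
doctor_b = frozenset({"doctor", "in charge", "dentistry", "confidential"})

# Attributes repeated across independently defined policies.
all_leaves = leaf_names(oncology_policy) + leaf_names(dental_policy)
duplicates = len(all_leaves) - len(set(all_leaves))
```

In this toy setting, the two independently defined policies already repeat “doctor” and “top secret”, which is the kind of duplication that a multi-level scheme aims to avoid.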
Another concern relates to the integrity and authenticity of EHRs. Apart from the information provided by medical staff, more and more health data are nowadays collected from connected sensors and wearable medical devices [12] over (insecure) networks. All of this is patient-centred data, over which the patient has primary control. However: (1) What if the patient takes advantage of their primary control to share/upload false data to the EHR system? (2) What if a medical device exploits its automated nature to upload false data to the EHR system without the patient’s consent? (3) What if a man-in-the-middle intercepts and tampers with the data?
Our contributions
Indeed, strict security requirements appear to be diametrically opposed to the goal of keeping data open. This paper makes a first attempt to reconcile these seemingly contradictory requirements. It proposes a novel E-Tenon system where data are stored in an open database while maintaining all privacy and security properties.

Fig. 1. Overview of the proposed Tenon database. A conventional table is segmented into a series of sub-tables in which the relationship between rows is hidden. This relationship can be revealed partially or fully, depending on the data user’s attributes (access rights).
One of the core components of the proposed system is the Tenon database (TDB), whose overview is presented in Fig. 1. Unlike conventional databases, the TDB is an open database consisting of a series of public tables and one secret table. Its main advantage lies in the fact that data protection does not depend on heavy encryption and decryption. Instead, the protection of EHRs is achieved through data preprocessing, maintenance of secret relationships between EHR blocks, and shuffling techniques. Notably, EHRs will be classified into identifiable information and Non Personally Identifiable Information (Non-PII), the latter of which will be tokenised into EHR blocks and can be securely made public. In addition, EHRs in the TDB are constantly shuffled, which makes it extremely difficult for attackers to exploit the open data. The main contributions of this paper are summarised as follows:
We design an efficient open database where the majority of the data is open, minimising encryption operations.
We propose a novel mechanism that does not rely on the suppression and generalisation techniques used in existing schemes (such as k-Anonymity), where suppression may lead to data loss and reduced usability, while generalisation may overlook some details of the data.
We present data preprocessing and shuffling methods used in conjunction with the proposed E-Tenon system to store and share EHRs securely in an open database setting.
We show how to ensure that a medical device and a data owner sign the same content, even if the EHRs have been preprocessed. This guarantees the authenticity and integrity of EHRs.
Our work addresses the shortcomings of previous solutions since E-Tenon not only efficiently guarantees multi-level, fine-grained EHR-data sharing but also protects the integrity and authenticity of the EHR, most importantly, under an open database setting. It takes only 2.34 milliseconds for signing and verifying the signature, and 0.14 and 0.76 seconds for encryption and decryption of the secret relationship, respectively. To the best of our knowledge, E-Tenon is the first open database-based scheme to provide such a wide range of security and privacy properties. Note that while this work focuses on EHR, the concept of E-Tenon would also be applicable in other scenarios requiring low-latency access to user data, such as in mobile edge computing environments.
Since medical data security has become a growing public concern, a considerable number of schemes have been published for secure medical data sharing and privacy preservation [2,4,19,21,23,29,31,33,34,38,40–42]. For instance, most research on protecting medical data has emphasised the use of cryptographic methods such as CP-ABE and KP-ABE [4,14,38,40–42]. The system architecture proposed in [31] is based on a successor of CP-ABE and Role-Based Access Control (RBAC) to protect EHRs stored in the hybrid cloud with direct and indirect access. In Li et al.’s KP-ABE-based model [21], the data owner needs to trust the key issuer because the owner only attaches a set of descriptive attributes to the data using KP-ABE and does not know who will be accessing the data [6]. Xu et al. [41] presented a practical dual-policy ABE scheme for EHR systems that combines the advantages of CP-ABE and KP-ABE with support for user revocation. In addition, Belguith et al. [4] proposed a multi-authority CP-ABE scheme that delegates expensive computing tasks to cloud servers, and their scheme also prevents collusion between the authorities.
Nevertheless, no existing solution in the literature is designed for open database settings that can ensure EHR security and patient privacy while keeping data in plaintext form. When every second matters during an emergency, the time-consuming encryption and decryption operations in a healthcare information system may delay access to patient information (such as medical history) during the golden hour in which a patient’s life can be saved. Likewise, as argued in [11], excessive security may obstruct sensible data use by healthcare providers and patients. Most approaches have failed to properly weigh the patients’ right to privacy against the legitimate sharing of data.
We also analysed the feasibility of applying anonymisation techniques such as t-closeness and Attribute-Based Credentials (ABCs) [8,22,25] under open database environments. T-closeness is a privacy-preserving technique for anonymising datasets so that specific individuals cannot be identified from certain attributes (such as age, income, and medical conditions). It does so by ensuring that the distribution of a sensitive attribute in a group of records is similar to its distribution in the population as a whole. ABCs, on the other hand, are a digital identity verification technique that allows users to prove their identities to third parties without revealing too much personal identity information. However, our proposed scheme aims to achieve efficient EHR sharing in a multi-party healthcare setting, and these two techniques are not directly applicable to it. Specifically, t-closeness provides a means of anonymising data to prevent the disclosure of sensitive information, whereas ABCs provide user authentication; neither technique provides fine-grained access control or multi-party data sharing, which are key features of our proposed solution.
Moreover, although existing anonymisation techniques such as k-anonymity [35] and l-diversity [27] have been extensively studied theoretically and empirically, these widely-adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. On the other hand, although the t-closeness principle has been proposed to address this problem, it can only support sensitive numerical attributes. In addition, most state-of-the-art suppression or generalisation-based anonymisation techniques intentionally remove some parts of the attributes from the database in order to make a particular attribute private, but this is prohibited in our system as we do not remove or modify any patient data. Considering the goals and characteristics of our proposed scheme, we believe these anonymisation solutions are not the optimal choices when compared with ABE techniques.
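To make the attribute-disclosure concern concrete, the following minimal sketch measures k-anonymity and (distinct) l-diversity on a toy table. The records, field names and helper functions are hypothetical and are not taken from any of the cited schemes; the point is only that a table can be 2-anonymous while one group still reveals its sensitive value.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Record = Dict[str, str]


def group_by_quasi_ids(
    records: Sequence[Record], quasi_ids: Sequence[str]
) -> Dict[Tuple[str, ...], List[Record]]:
    """Group records that share the same quasi-identifier values."""
    groups: Dict[Tuple[str, ...], List[Record]] = defaultdict(list)
    for record in records:
        groups[tuple(record[q] for q in quasi_ids)].append(record)
    return groups


def k_anonymity(records: Sequence[Record], quasi_ids: Sequence[str]) -> int:
    """Size of the smallest equivalence class (the achieved k)."""
    return min(len(g) for g in group_by_quasi_ids(records, quasi_ids).values())


def l_diversity(
    records: Sequence[Record], quasi_ids: Sequence[str], sensitive: str
) -> int:
    """Smallest number of distinct sensitive values in any class (distinct l)."""
    diversity: List[int] = []
    for group in group_by_quasi_ids(records, quasi_ids).values():
        values: Set[str] = {record[sensitive] for record in group}
        diversity.append(len(values))
    return min(diversity)


# Toy, already-generalised table (illustrative values only).
table = [
    {"age": "30-39", "postcode": "NE1*", "condition": "flu"},
    {"age": "30-39", "postcode": "NE1*", "condition": "flu"},
    {"age": "40-49", "postcode": "NE1*", "condition": "asthma"},
    {"age": "40-49", "postcode": "NE1*", "condition": "flu"},
]

k = k_anonymity(table, ["age", "postcode"])
l = l_diversity(table, ["age", "postcode"], "condition")
```

Here the table is 2-anonymous, yet anyone known to be in the 30-39 group is revealed to have the flu, since that group has only one distinct sensitive value.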
Although several of the works mentioned above have used ABE to protect EHRs, which is promising for flexible and fine-grained EHR sharing, they are computationally intensive when applied to encrypt the entire database. In addition, most solutions cannot support searching over encrypted data directly. Consequently, to search for relevant patient data in an encrypted database, the system first needs to decrypt the data on the application back-end. Such a burdensome process wastes valuable computing resources. Furthermore, many schemes fail to use digital signatures properly to ensure data integrity and authenticity. For example, [42] allows only one entity to sign the EHR, which grants that entity too much power. Although some schemes [39] allow multiple entities to sign the data, they cannot guarantee that the same content is being signed honestly by all participants.
To our knowledge, no state-of-the-art work on sharing and protecting EHRs has considered using a secure open database to save the avoidable overhead of encryption and decryption. That said, as the current solutions are built on private databases by default, we are unable to find related work that fully meets our expectations.
Organisation
The rest of the paper is organised as follows. Section 2 introduces and recapitulates the required mathematical notations, security assumptions and related schemes. Section 3 presents the system model and the corresponding adversarial model. This is followed by the construction of E-Tenon, given in detail in Section 4. Next, we prove the security and practicality of the proposed scheme by conducting security and performance analysis in Sections 5 and 6, respectively. Section 7 of the paper concludes our work in light of all that has been mentioned.
Preliminaries
This section introduces and recapitulates several prerequisites, including definitions of some mathematical notations, a multi-level ABE scheme, and a multi-signature scheme.
Notations
We use
Building blocks
More formal definitions are provided below. Bilinear maps are a helpful tool for pairing-based cryptography because they conveniently establish relationships between cryptographic groups. As cyclic groups are used in the bilinear map, we first introduce the definition of a cyclic group.
(Cyclic Group of Prime Order [5,32]).
Let
(Prime Order Bilinear Group [6]).
Let
There are three properties of an efficiently-computable e that are worth noting:
Bilinearity: Non-degeneracy: Computability: for all Let
(Discrete Logarithm Assumption [5]).
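For readers who want the standard textbook statements behind the definitions named above, the following block gives them in one common notation. The symbols are assumptions made for this sketch and may differ from the notation of the original definitions: $\mathbb{G}$ and $\mathbb{G}_T$ are cyclic groups of prime order $p$, $g$ is a generator of $\mathbb{G}$, $e : \mathbb{G} \times \mathbb{G} \rightarrow \mathbb{G}_T$ is the map, and $\mathsf{negl}$ is a negligible function of the security parameter $\lambda$.

```latex
% Standard properties of a bilinear map (notation assumed, see lead-in).
\begin{align*}
&\text{Bilinearity:}    && e(u^{a}, v^{b}) = e(u, v)^{ab}
                           \quad \text{for all } u, v \in \mathbb{G},\ a, b \in \mathbb{Z}_p, \\
&\text{Non-degeneracy:} && e(g, g) \neq 1_{\mathbb{G}_T}, \\
&\text{Computability:}  && e(u, v) \text{ can be computed efficiently for all } u, v \in \mathbb{G}.
\end{align*}

% Discrete logarithm assumption in \mathbb{G}, for every PPT adversary \mathcal{A}.
\[
\Pr\left[\, \mathcal{A}(g, g^{x}) = x \;:\; x \xleftarrow{\$} \mathbb{Z}_p \,\right]
\le \mathsf{negl}(\lambda)
\]
```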
Multi-level CP-ABE
In recent years, ABE has been an outstanding example of a flexible and scalable encryption mechanism for multiple users. It enables contextualised decision-making thanks to the introduction of the concept of attributes. In KP-ABE [3,13], data owners have a set of attributes closely linked to themselves that can be selectively applied to encrypt their data. Any other user who intends to decrypt the ciphertext first needs to be issued a key bundled with a suitable access structure by the trusted key issuer. In contrast, in CP-ABE the data owner gains more control over who can access his/her data, through the design of an access policy embedded in the ciphertext [6]. The only users who can access and decrypt this data are those with the appropriate attributes. Therefore, CP-ABE is probably better suited for data outsourcing, especially when it is used to preserve patients’ privacy, even in emergencies, due to its flexible nature. However, in the challenging field of E-health, standard CP-ABE is not fully compatible with the reality of the intertwined doctor-patient relationships among different healthcare organisations, because each distinct part of an EHR file may need to be accessed with completely different access rights depending on the purpose of the data user. Fortunately, as one of the successors to CP-ABE, ML-ABE fills in the gaps described above. ML-ABE consists of four algorithms (setup, encrypt, keygen, decrypt):
Let (ML-ABE [17]).
(Access Structure [17]).
Multi-signature
A Multi-Signature (MS) solution allows a group of signers to co-sign on a shared document in a compact manner [5]. To provide a real-life example, publishing a report/document often requires the cooperation of multiple colleagues. In order to guarantee the authenticity of the report, each participant must sign the file. Therefore, Multi-Signature technology is used to fulfil this type of requirement in the electronic world. Besides, the ABE approach described in the previous section has already reduced the cost of key management by providing one-to-many encrypted access control [20]. Thus, we prefer to use a Multi-Signature scheme that is not based on comparatively more burdensome requirements of PKI (e.g., knowledge of secret key hypothesis [7]) to enhance the practicality of the proposed E-Tenon system further. Bellare and Neven’s MS-BN [5] defined below fits well with our concept.
(MS-BN [5]).
MS-BN is a scheme consisting of four randomised algorithms (Pg, Kg, Sign, Vf):
Here we stress two essential facts about MS-BN. First, the security of this scheme is guaranteed on the assumption that at least one of the signers is honest [5, Section 4]. Second, the Kg algorithm of MS-BN is run independently by each signer to generate the key pair. Such an assumption leads to a security breach when all the signers are honest-but-curious or dishonest. Given the increasing sophistication of cyber attacks, any end-user can no longer be undoubtedly trusted. Hence, our model will strengthen MS-BN to accommodate cases where no particular signer is fully trusted. To achieve that, we do not allow the non-trusted signer to perform the Kg algorithm without the support of a trusted entity. In other words, the secret keys required for the user to operate the ABE and Multi-Signature related algorithms will be issued by an Attribute Authority (AA) at once where necessary.
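The following toy sketch illustrates the flavour of a Schnorr-based multi-signature with the commit, reveal and respond rounds commonly associated with MS-BN. It is a sketch under stated assumptions rather than the scheme as specified in [5] or as deployed in E-Tenon: the group parameters (`P`, `Q`, `G`) are deliberately tiny and insecure, all signers are simulated inside one function, and `keygen` is run locally, whereas in E-Tenon the keys would be issued by the AA. All function names are hypothetical.

```python
import hashlib
import secrets
from typing import List, Tuple

# Toy Schnorr group (assumption, far too small for real use):
# P = 2 * Q + 1 with P and Q prime, and G generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def h(tag: bytes, *parts: object) -> int:
    """Hash arbitrary values to an integer, domain-separated by a tag."""
    data = tag + b"|" + b"|".join(str(part).encode() for part in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def keygen() -> Tuple[int, int]:
    """Return a (secret key, public key) pair."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def multi_sign(
    secret_keys: List[int], public_keys: List[int], message: str
) -> Tuple[int, int]:
    """Simulate all signers locally and return the joint signature (R, s)."""
    # Round 1: each signer picks a nonce and commits to its share R_i.
    nonces = [secrets.randbelow(Q - 1) + 1 for _ in secret_keys]
    shares = [pow(G, r, P) for r in nonces]
    commitments = [h(b"H0", share) for share in shares]
    # Round 2: shares are revealed and checked against the commitments.
    for share, commitment in zip(shares, commitments):
        if h(b"H0", share) != commitment:
            raise ValueError("commitment mismatch")
    big_r = 1
    for share in shares:
        big_r = big_r * share % P
    # Round 3: each signer responds with s_i = x_i * c_i + r_i mod Q.
    s = 0
    for x, pub, r in zip(secret_keys, public_keys, nonces):
        c = h(b"H1", pub, big_r, public_keys, message) % Q
        s = (s + x * c + r) % Q
    return big_r, s


def verify(public_keys: List[int], message: str, signature: Tuple[int, int]) -> bool:
    """Check G^s == R * prod(X_i^{c_i}) mod P."""
    big_r, s = signature
    rhs = big_r
    for pub in public_keys:
        c = h(b"H1", pub, big_r, public_keys, message) % Q
        rhs = rhs * pow(pub, c, P) % P
    return pow(G, s, P) == rhs


do_sk, do_pk = keygen()  # data owner
sp_sk, sp_pk = keygen()  # service provider
sig = multi_sign([do_sk, sp_sk], [do_pk, sp_pk], "ehr-block-digest")
```

The resulting signature is a single pair regardless of the number of signers, which is what makes the approach compact.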

Fig. 2. System model of the proposed scheme.
We propose the Electronic Tenon System (E-Tenon), depicted in Fig. 2, which integrates Multi-level CP-ABE and Multi-Signature techniques. Our extensions enable these existing technologies to function within open database environments. To our knowledge, current ABE-based privacy-preserving systems incur significant encryption and decryption overheads. In contrast, our solution allows EHRs to be securely opened in the database after special preprocessing. Specifically, EHR blocks stored within the database can only be mapped into meaningful information by deciphering the relevant secret pointers. Data shuffling techniques are employed to constantly change the position and order of EHR blocks, ensuring that open data is presented in a random order each time the database is accessed. Furthermore, the original MS-BN scheme requires a trusted signing entity to be involved in the signing process, but we cannot assume that this is feasible in safety-critical applications. Therefore, our solution does not require a fully trusted signer to ensure multi-signature unforgeability, making our E-Tenon system more flexible and practical. Finally, we present steps that guarantee that the service providers and data owners consistently sign the same message. These features allow us to manage EHRs efficiently, flexibly and at a fine granularity while preserving privacy and security.
System and adversarial model
In this section, we provide a high-level overview of the proposed system model with respect to entities involved in E-Tenon. Afterwards, we analyse security considerations along with an adversarial model.
System model
To establish the system model, we first introduce an efficient open database, then we merge and extend a Multi-Signature scheme MS-BN [5] with an encryption scheme ML-ABE [17]. Our system (as depicted in Fig. 2) ends up with three distinct phases: SETUP, ACCUMULATION and RETRIEVAL, along with seven secure algorithms. In addition, there are six crucial entities: Central Trusted Authority (CTA), Attribute Authority (AA), Data Owner (DO), Service Provider (SP), Data User (DU), and Tenon Database (TDB). Besides, we allow for the option of a seventh participant: Distributed Data Consistency Monitor (DDCM).

Fig. 3. An example of mortise and tenon joints.
E-Tenon is intended to be used by patients and a wide range of healthcare institutions. The novelty lies in the fact that most of the EHRs in the TDB are publicly accessible. Besides, we do not restrict EHRs to be transferred only within private networks such as the corporate Local Area Network. Accordingly, the vast majority of EHRs can be transmitted through untrusted public networks such as the Internet. While these considerations significantly increase the applicability and efficiency of the model, they also expose system interactions and EHRs in transit to various malicious cyber attackers. Therefore, our system must defend against the following threats:
Security assumptions
Some of the key assumptions are summarised as follows:
DOs and DUs are expected to be educated about privacy rights and obligations. Thus, they will not actively disclose any confidential information to unaffiliated and unauthorised third parties.
DOs can apply appropriate access policies to different categories of EHRs according to a layman-friendly guidebook provided by the administrator.
The semi-trusted TDB and unauthorised DUs cannot infer the data type of EHRs when each data category contains at least κ different data types.
Security games
Based on the system and adversarial models, we consider the following security games to define the security notion of our E-Tenon system.
1) To prove that E-Tenon is secure against confidentiality and privacy threats, we define an IND-CCA-1 security game between a challenger
ML-ABE is CCA-1 secure against confidentiality and privacy threats if, for all PPT adversaries, the advantage in winning the security game defined above is bounded by a negligible function, such that
MS-BN is MU-UF-CMA secure against integrity and authenticity threats if, for all PPT adversaries, the advantage in winning the security game defined above is bounded by a negligible function, such that
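As a reading aid, security notions of this kind are usually expressed through advantage bounds of the following generic shape. The notation is assumed for this sketch and is not necessarily that of the original definitions: $\mathcal{A}$ is a PPT adversary, $b$ is the challenger's hidden bit and $b'$ the adversary's guess in the indistinguishability game, $(L^{*}, m^{*}, \sigma^{*})$ is the forgery output in the unforgeability game, and $\mathsf{negl}$ is a negligible function of the security parameter $\lambda$.

```latex
% Generic advantage bounds (notation assumed, see lead-in).
\begin{align*}
\mathsf{Adv}^{\mathrm{IND\text{-}CCA1}}_{\mathcal{A}}(\lambda)
  &= \left| \Pr[\, b' = b \,] - \tfrac{1}{2} \right| \le \mathsf{negl}(\lambda), \\
\mathsf{Adv}^{\mathrm{MU\text{-}UF\text{-}CMA}}_{\mathcal{A}}(\lambda)
  &= \Pr\left[\, \mathcal{A} \text{ outputs a valid forgery } (L^{*}, m^{*}, \sigma^{*}) \,\right]
  \le \mathsf{negl}(\lambda).
\end{align*}
```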
Our system incorporates three important phases and seven secure algorithms. We describe the construction details of each phase separately, with further specifications in the following subsections.

Fig. 4. Workflow of the proposed scheme.
The workflow of our E-Tenon system is presented in Fig. 4, where green entities are fully trusted, red entities can be malicious, and blue entities are honest-but-curious. During the SETUP phase, the CTA and AA generate and issue the public parameters, attributes and keys required by all system users. The next phase, ACCUMULATION, uses a total of four fundamental algorithms. Before the secret relationships between EHR blocks are established, EHR data are classified into two main categories, identifiable data and Non-PII data, based on the attributes (such as social security number and medical record number) listed in the Health Insurance Portability and Accountability Act (HIPAA). The EHR preprocessing algorithm detects any HIPAA attributes; these can be tokenised into smaller chunks at a specific level to render them unidentifiable, or minor encryption can be applied to the detected HIPAA identifiers if requested by the data owner. Then, the HIPAA identifiers and Non-PII data are made open after preprocessing, as they cannot be used to trace a patient’s identity without the ability to read and understand the secret relationships between EHR blocks. Note that when encryption is performed with a patient-defined access policy, it is equivalent to the patient giving consent to those users who satisfy the access policy. After the DO and SP multi-sign the data, the TDB may refuse to store the data if the signature is invalid or forged. Likewise, signers may refuse to sign if they believe the data have been illegally modified. In the final RETRIEVAL phase, the DUs also have the option to verify the data’s signature. Once they are satisfied that the signature is legitimate, they can decrypt the pointers at different security levels according to their attributes. The decrypted pointers can then be used to find and combine the relevant EHR blocks in the proper order to recover the correct information.
Notations and cryptographic functions
Table 1. Notations and cryptographic functions
Table 1 lists some essential notations and cryptographic functions we used. Let λ be the implicit security parameter that denotes the size of the cryptographic groups, and let
ACCUMULATION phase
To understand what must be encrypted and what can be left open, we need to consider how data may be combined. For instance, the National Insurance Number (NINO) together with a medical condition is an insecure combination, since it reveals the patient’s identity. In contrast, blood pressure and symptoms can be seen as a safe combination. Note, however, that although knowledge of a single symptom is not helpful in revealing a patient’s identity (e.g., almost everyone may have a cough), detailed symptom information can be useful in inferring a patient’s identity (e.g., it may be rare for a person to have a nosebleed, cough, fever and heart pain at the same time).
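The classification and tokenisation ideas can be sketched as follows. This is a minimal illustration under stated assumptions and not the dataPreprocessing algorithm itself: the identifier list is a small hypothetical subset, the field names and token length are invented for the example, and the combination check only looks for explicit identifiers, so it does not capture the rare-symptom-combination case discussed above.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical subset of directly identifying fields (assumption for this sketch).
IDENTIFIER_FIELDS = {"nino", "social_security_number", "medical_record_number"}


def classify(record: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a record into (identifiable fields, Non-PII fields)."""
    identifiable = {k: v for k, v in record.items() if k in IDENTIFIER_FIELDS}
    non_pii = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
    return identifiable, non_pii


def is_safe_combination(fields: Iterable[str]) -> bool:
    """Treat a set of fields as unsafe to publish together if any is an identifier.

    Simplification: combinations of Non-PII fields are always accepted here.
    """
    return not any(field in IDENTIFIER_FIELDS for field in fields)


def tokenise(value: str, size: int) -> List[str]:
    """Cut a value into consecutive blocks of at most `size` characters."""
    if size < 1:
        raise ValueError("size must be positive")
    return [value[i:i + size] for i in range(0, len(value), size)]


record = {"nino": "QQ123456C", "blood_pressure": "120/80", "symptom": "cough"}
identifiable, non_pii = classify(record)
nino_blocks = tokenise(identifiable["nino"], 3)
```

Each block on its own carries little meaning; the order in which blocks belong together is the secret relationship that the rest of the system protects.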

dataPreprocessing(Φ,
To give a more intuitive example, we note that the length of each token

Fig. 5. Rounds of communication in the multiSign algorithm (DO stands for data owner and SP stands for service provider).
Once the DU confirms that the accompanying multi-signature is not a forgery, he/she can call the following algorithm to decrypt the ciphertext hierarchically. Please note that the higher the access rights represented by the DU’s attributes, the larger the number of pointers that can be revealed.
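The relationship between access rights and the number of revealed pointers can be modelled without any cryptography. In the sketch below, each level carries a set of required attributes and the pointers it protects; `Level`, `reveal` and the example levels are hypothetical, and the subset test merely stands in for a successful ML-ABE decryption of that level.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Level:
    """One security level: attributes required to open it and its pointers."""
    required: FrozenSet[str]
    pointers: Tuple[str, ...]


def reveal(levels: Sequence[Level], attrs: FrozenSet[str]) -> List[str]:
    """Return the pointers of every level whose requirements the user meets.

    Stand-in for hierarchical decryption: no real ciphertext is involved.
    """
    revealed: List[str] = []
    for level in levels:
        if level.required <= attrs:
            revealed.extend(level.pointers)
    return revealed


levels = [
    Level(frozenset({"doctor"}), ("p1", "p2")),
    Level(frozenset({"doctor", "oncology"}), ("p3",)),
    Level(frozenset({"doctor", "oncology", "top secret"}), ("p4", "p5")),
]

visitor = reveal(levels, frozenset({"receptionist"}))
junior = reveal(levels, frozenset({"doctor"}))
senior = reveal(levels, frozenset({"doctor", "oncology", "top secret"}))
```

A user holding more of the required attributes opens more levels and therefore recovers more pointers, while a user with no matching attributes recovers none.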

Fig. 6. Example of the working principle of the Tenon database.
In this subsection, we explain the working principle of the TDB, which forms one of the key components of the proposed E-Tenon system. As shown in the left part of Fig. 6, the TDB comprises several open tables and one secret table. Each row of an open table has three columns: pointer, EHR block and multi-signature. It is worth noting that all encrypted data are kept separate from the open tables. This is because we have adopted a multi-level ABE that produces a ciphertext containing multiple encrypted pointers. To reconstruct the data in the open tables, the authorised DU first decrypts the outer layer of the ciphertext. If successful, they are presented with a series of encrypted pointers, and the number of pointers that can be decrypted depends on the DU’s attributes. Consequently, no row in an open table should contain an encrypted pointer, because this would compromise data confidentiality once a low-privileged DU decrypts the outer ciphertext. That is, an adversary could simply use the encrypted pointers to locate the rows containing them in the misconfigured open tables and combine those rows directly, without having to decrypt the secret pointers according to his/her attributes. Therefore, we collectively store all secret pointers, accompanied by their multi-signature, in a protected table isolated from the public tables. A legitimate DU can only read the entries that he/she has been granted access to. As a result, a malicious outsider will not be able to see the encrypted pointers, and a malicious insider who can decrypt the outer layer of the ciphertext will not be able to exploit the inner encrypted pointers to infer any information in the TDB.
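The table layout described above can be sketched with two small record types. This is a minimal illustration under stated assumptions: `OpenRow`, `SecretRow` and `reconstruct` are hypothetical names, the signature and ciphertext fields are opaque placeholders, and the pointer list passed to `reconstruct` represents pointers that an authorised DU has already decrypted.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class OpenRow:
    """Row of a public table: plaintext pointer, EHR block, multi-signature."""
    pointer: str
    block: str
    multi_signature: str  # placeholder, not a real signature


@dataclass(frozen=True)
class SecretRow:
    """Row of the protected table: encrypted pointers plus their signature."""
    encrypted_pointers: bytes  # placeholder for a multi-level ABE ciphertext
    multi_signature: str


def reconstruct(open_table: Iterable[OpenRow], pointers: List[str]) -> str:
    """Join the blocks named by already-decrypted pointers, in pointer order."""
    index = {row.pointer: row.block for row in open_table}
    return "".join(index[pointer] for pointer in pointers)


# Rows are stored in an arbitrary order; only the pointer sequence gives meaning.
open_table = [
    OpenRow("c7", "234", "sig"),
    OpenRow("a1", "QQ1", "sig"),
    OpenRow("f3", "56C", "sig"),
]
decrypted_pointers = ["a1", "c7", "f3"]
value = reconstruct(open_table, decrypted_pointers)
```

Because reconstruction depends only on the pointer sequence, the physical order of the open rows can change freely without affecting authorised users.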
In addition, we propose a complementary shuffling mechanism to further reduce the risk of any entity learning information from the open data stored in the TDB. As demonstrated in the right part of Fig. 6, the TDB constantly shuffles the data so that the order of the data is different each time a user accesses the TDB. Nevertheless, it is possible for the order of the data to remain unchanged after a shuffle. To detect this corner case, the TDB runs a deterministic algorithm that compares the hash of the current data order with the hash of the previous data order. The algorithm returns ⊥ when the shuffled order happens to be the same as the original order, in which case the TDB automatically re-shuffles the data. This further enhances the security of the TDB and leaves attackers with no pattern to follow.
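The shuffle-and-check loop can be sketched as follows. This is a minimal illustration assuming rows are represented as strings without NUL bytes; `order_digest`, `check_order` and `shuffle_rows` are hypothetical names, and Python's `None` plays the role of ⊥.

```python
import hashlib
import random
from typing import List, Optional, Sequence


def order_digest(rows: Sequence[str]) -> str:
    """Hash the rows in their current order (NUL-separated encoding)."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(row.encode())
        digest.update(b"\x00")
    return digest.hexdigest()


def check_order(previous: Sequence[str], current: Sequence[str]) -> Optional[List[str]]:
    """Deterministic check: return None (⊥) if the order did not change."""
    if order_digest(previous) == order_digest(current):
        return None
    return list(current)


def shuffle_rows(rows: Sequence[str], rng: Optional[random.Random] = None) -> List[str]:
    """Shuffle until the order differs from the previous one."""
    if len(set(rows)) < 2:
        return list(rows)  # no different order exists, so avoid looping forever
    rng = rng or random.SystemRandom()
    while True:
        candidate = list(rows)
        rng.shuffle(candidate)
        checked = check_order(rows, candidate)
        if checked is not None:
            return checked


rows = ["r1", "r2", "r3", "r4"]
shuffled = shuffle_rows(rows)
```

The guard for inputs with fewer than two distinct rows is a design choice of this sketch, since no re-shuffle can produce a different order in that case.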
Signing process
We use multi-signature to place constraints between the SP and the DO. This allows the DO to confirm that the EHR obtained from the SP is valid. On the other side, the SP can ensure that the DO has not attempted to alter the original EHRs they provided. It is therefore possible to guarantee the integrity and authenticity of the EHR if they have agreed to sign together on the same message.
The following describes two issues we need to address when signing. Firstly, imagine a signature obtained by encrypting the hash of a message generated via a one-way hash function. This signature is said to be valid if the hash value generated by the verifier using the same hash function on the accompanying message is equivalent to the hash obtained by decrypting the signature provided by the signer. Such a signing and verification process establishes the integrity of the message but does not maintain its confidentiality since the message used to generate the hash is in its original form [1]. The second issue is how the SP and DO sign the same content when there are inconsistencies between the data held by the SP and DO after preprocessing the EHRs. To address these issues, we propose the following steps for signers to securely multi-sign the same content. A visualisation of the process is provided (see Fig. 7).
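One simple way to picture the "same content" requirement is a digest comparison before signing. The sketch below is an assumption-laden illustration and not the paper's signing steps: it assumes that both parties hold the preprocessed EHR blocks as a pointer-to-block mapping, that a canonical JSON encoding is agreed in advance, and that only the digest (rather than the plaintext) is passed on to the signing step. The names `canonical_digest` and `agree_before_signing` are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, Optional


def canonical_digest(blocks: Dict[str, str]) -> str:
    """SHA-256 over a canonical (key-sorted, compact) JSON encoding."""
    payload = json.dumps(blocks, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


def agree_before_signing(
    do_view: Dict[str, str], sp_view: Dict[str, str]
) -> Optional[str]:
    """Return the common digest to be signed, or None if the views differ."""
    do_digest = canonical_digest(do_view)
    sp_digest = canonical_digest(sp_view)
    if hmac.compare_digest(do_digest, sp_digest):
        return do_digest
    return None


# Same content held in a different insertion order by the two parties.
do_view = {"a1": "QQ1", "c7": "234", "f3": "56C"}
sp_view = {"f3": "56C", "a1": "QQ1", "c7": "234"}
message_to_sign = agree_before_signing(do_view, sp_view)

# A modified block on either side blocks the joint signature.
tampered_view = {**sp_view, "c7": "999"}
rejected = agree_before_signing(do_view, tampered_view)
```

If either party has altered a block, the digests differ and the honest party can refuse to take part in the multi-signature.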

Fig. 7. How data owners and service providers can regulate each other to ensure the accuracy and integrity of EHRs.
In this section, we formally analyse and prove the security of our proposed scheme against the adversarial model described in Section 3. To ensure that E-Tenon is secure and resilient to a range of possible attacks, ML-ABE (a variant of CP-ABE) and MS-BN (a variant of the Schnorr signature) are selected and integrated for reliability and validity. First, we note that ML-ABE is proven CCA-1 secure, where CCA-1 refers to non-adaptive chosen-ciphertext attacks. Second, MS-BN is proven secure under the notion of multi-user unforgeability against chosen message attacks (MU-UF-CMA). Our E-Tenon scheme should naturally inherit the security properties of these two building blocks.
Assume that the ML-ABE scheme in [17] is selectively CCA-1 secure. Then, the E-Tenon system preserves confidentiality and is selectively CCA-1 secure with respect to the CCA-1 security game and Definition 7.
To prove the security of the E-Tenon system with respect to Definition 7, we suppose that there exist two polynomial-time adversaries
In order to determine the adversary’s advantage at this stage, some basic observations need to be made. It is noted that the element
Assume that the ML-ABE scheme in [17] is private against both malicious and honest-but-curious adversaries. Then, the proposed E-Tenon system preserves privacy against both malicious DU and honest-but-curious TDB.
In this proof, we consider attacks from a malicious DU and an honest-but-curious TDB, respectively. First of all, it is worth noting that the malicious adversary DU will have the same advantage as in
In another scenario, let us assume that the honest-but-curious TDB complies with its obligations but tries to reveal which DO uploaded the EHR or which DU requested to retrieve it. This would clearly compromise the privacy property. We show, however, that the TDB cannot distinguish requesters by their attributes. Suppose
Assume that the MS-BN scheme in [5] is MU-UF-CMA secure. Then the proposed E-Tenon system is MU-UF-CMA secure with respect to the MU-UF-CMA security game and Definition 8.
Let
As proved in [5], breaking the MS-BN model is considered to be at least as hard as the discrete logarithm problem (DLP) for an adversary
In this section, we discuss the performance of the proposed model. We first compare our scheme with other competitive solutions in terms of security properties. We then evaluate the relevant computation cost of the E-Tenon in different tasks. Subsequently, we discuss the communication and storage costs of E-Tenon.
Security properties and functionalities comparison with related works
✓: Fully Satisfied; ✗: Not Satisfied; ◑: Partially Satisfied; -: N/A.
SP1: Open Database; SP2: Secure-channel Free; SP3: Data Confidentiality; SP4: Data Integrity; SP5: Non-Repudiation; SP6: User Privacy; SP7: Collusion Resistance; SP8: Multi-level Access Control; SP9: Fine-grained Access Control; SP10: Process Transparency.
To compare security properties and functionalities, we selected several state-of-the-art schemes ([10,14,15,28,34,42]) for protecting EHRs and compared them along various dimensions. A summarised comparison of the security properties and characteristics of the schemes is presented in Table 2. Although these solutions are interesting and wide-ranging, they still suffer from different shortcomings and do not work efficiently where open databases are concerned. The scheme proposed by Sun et al. [34] employs attribute-based techniques, but it weakens the patient’s involvement in the encryption and signing of the data. In [34], the patient does not have the right to specify the access policy of their own data. Further, the doctor is responsible for encrypting and signing the data, meaning that the doctor, rather than the patient, has direct control over the data. Such a design increases the advantage of malicious insiders and makes the system less trustworthy for patients. In contrast, our E-Tenon system inherently gives more control to patients since they are the actual owners of the EHR. They can set different levels of access policies for different types of data on their own, and they can take part in the multi-signature process.
Green et al. [14] attempted to reduce the user’s computational overhead by outsourcing decryption to an untrusted cloud service provider (CSP). In their system, the CSP transforms an ABE ciphertext into a simple El Gamal-style ciphertext using a transformation key provided by the data user. Although the converted ciphertext requires less computation than the original when recovering the plaintext, the user cannot verify that the CSP has performed the transformation honestly. In [7], the author(s) proposed three different signature schemes that can ensure confidentiality, integrity and non-repudiation. Similarly, the scheme presented in [10] ensures unlinkability of the stored data by converting identifying attributes into non-sensitive pseudonyms. However, this process is not transparent, meaning the data owner cannot audit their data flow. By comparison, the data pre-processing algorithm in our system runs on the data owner’s side, and no other central entity needs to perform any secondary processing of the uploaded EHRs. Besides, instead of using a basic form of digital signature, we utilise multi-signature technology, which allows a group of participants to co-sign the same message effectively. This naturally enables the patient (DO) and the service provider (SP) to restrain each other’s dishonest behaviour, which further enhances integrity, authenticity and non-repudiation. In this regard, we emphasise that multi-signature is more promising than the standard digital signature or other techniques that involve many signatures, because in the absence of multiple entities constraining each other, an entity accessing the EHR later can replace the EHR provided by a previous entity and sign the EHR it supplied itself. Therefore, the conventional signature approach does not guarantee the authenticity of the EHR in a collaborative environment.
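The co-signing idea can be illustrated with a minimal sketch of a Schnorr-style multi-signature in the spirit of MS-BN. Everything here is an assumption made for illustration: the group (p = 2039, q = 1019, g = 4) is far too small for real use, both signers are simulated in one process, and the function names are hypothetical rather than part of E-Tenon. The sketch follows the commit, reveal and respond rounds of a Bellare-Neven style protocol, where each signer contributes a single exponentiation.

```python
import hashlib
import secrets

# Toy Schnorr group (assumption): P = 2*Q + 1, and G generates the subgroup
# of order Q. These sizes are for illustration only.
P, Q, G = 2039, 1019, 4
assert pow(G, Q, P) == 1


def h(*parts) -> int:
    """Hash arbitrary values to an exponent modulo Q."""
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen():
    x = secrets.randbelow(Q - 1) + 1      # private key
    return x, pow(G, x, P)                # (private, public)


def commit(R: int) -> str:
    return hashlib.sha256(str(R).encode()).hexdigest()


def multi_sign(message: str, keys):
    pubs = [X for _, X in keys]
    # Round 1: every signer picks a nonce and publishes a commitment to R_i.
    nonces = [secrets.randbelow(Q - 1) + 1 for _ in keys]
    Rs = [pow(G, r, P) for r in nonces]
    commitments = [commit(R) for R in Rs]
    # Round 2: signers reveal R_i and check each other's commitments.
    assert all(commit(R) == t for R, t in zip(Rs, commitments))
    R = 1
    for R_i in Rs:
        R = R * R_i % P
    # Round 3: each signer answers its own challenge; the answers are summed.
    s = 0
    for (x, X), r in zip(keys, nonces):
        c = h(X, R, pubs, message)
        s = (s + r + c * x) % Q
    return R, s


def multi_verify(message: str, pubs, signature) -> bool:
    R, s = signature
    rhs = R
    for X in pubs:
        rhs = rhs * pow(X, h(X, R, pubs, message), P) % P
    return pow(G, s, P) == rhs


do_key, sp_key = keygen(), keygen()       # data owner and service provider
pubs = [do_key[1], sp_key[1]]
sig = multi_sign("hash-of-preprocessed-EHR", [do_key, sp_key])
print(multi_verify("hash-of-preprocessed-EHR", pubs, sig))
```

Because the single signature `(R, s)` only verifies against both public keys together, neither party can later swap in different content without the other's contribution.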
Furthermore, Huang et al.’s solution [15] focuses specifically on EHR confidentiality. However, their solution is not secure-channel free, and we found no discussion of how they ensure EHR integrity, which makes it fall short in our comparison. In contrast, the system proposed by Zhang et al. [42] (SSH) and the system proposed by Maffei et al. [28] (GORAM) satisfy most of the security properties. GORAM allows data owners to selectively share their data stored in the cloud, and the storing entity is not permitted to inspect any data. Nevertheless, their robust security comes at the cost of increased ciphertext size and slower encryption and decryption.
Finally, we observe that none of the above-mentioned schemes can be applied to public databases where most of the data is stored in plaintext, and none of the encryption methods used in these schemes can efficiently implement multi-level access control. In contrast, thanks to the novel concept of E-Tenon, our data is securely stored in an open database (TDB), which means the computational overhead of encryption and decryption is minimal compared to solutions based on heavy encryption.
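One way to picture why an open database keeps encryption cost low is the loose sketch below. It is not E-Tenon's actual pre-processing algorithm: the functions `publish` and `retrieve` are hypothetical, only a single access level is shown, and a SHA-256 keystream XOR stands in for ML-ABE purely so that the example runs with the standard library. The point it illustrates is that the data blocks stay in plaintext, while only the small pointer table that links them is encrypted.

```python
import hashlib
import json
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher (assumption): XOR with a SHA-256 counter keystream.
    This is NOT ML-ABE and is not safe for real use."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


def publish(record: dict, open_db: dict, key: bytes) -> bytes:
    """Store each field as a plaintext block under a random identifier and
    return the encrypted pointer table that links the blocks together."""
    pointers = {}
    for field, value in record.items():
        block_id = secrets.token_hex(8)
        open_db[block_id] = value          # plaintext, publicly readable
        pointers[field] = block_id
    return keystream_xor(key, json.dumps(pointers).encode())


def retrieve(encrypted_pointers: bytes, open_db: dict, key: bytes) -> dict:
    """Decrypt the pointer table, then join the open blocks."""
    pointers = json.loads(keystream_xor(key, encrypted_pointers))
    return {field: open_db[block_id] for field, block_id in pointers.items()}


open_db = {}
record = {"name": "Alice", "diagnosis": "asthma"}
key = secrets.token_bytes(16)
locked = publish(record, open_db, key)
print(sorted(open_db.values()))            # blocks are open, but unlinked
print(retrieve(locked, open_db, key))      # key holder can re-join them
```

Only the pointer table passes through the cipher, so the amount of encrypted data grows with the number of protected relationships rather than with the size of the database.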
Computation cost
We use a virtual machine (Ubuntu 12.04) with an Intel Core i5-4200M dual-core 2.50 GHz CPU to simulate the core operations based on three main libraries: JPBC library Pbc-05.14 [26], JCE library [30] and Apache Commons IO library [36]. We test modular exponentiation, multiplication and bilinear pairing 2,000 times and take the average CPU time in milliseconds. Regarding the dataset, we used the MIMIC-III v1.3 dataset [16], a freely accessible healthcare database offered by the Massachusetts Institute of Technology. This dataset contains de-identified electronic health records (EHRs) of over 30,000 unique patients. Because the dataset does not contain any sensitive or personally identifiable information such as names, addresses, phone numbers, or emails, we treated a number of randomly selected columns as sensitive attributes/identifiers for our experiments. This approach allowed us to achieve our experimental objectives without modifying the original dataset. We also analyse the performance of the proposed scheme on another common dataset: the eICU Collaborative Research Database, which consists of 31 tables, each containing 8–10 columns. Since our proposed scheme does not encrypt the whole dataset but instead hides the relationships among the data, its cost depends on the number of sensitive attributes present in each table. Our analysis shows that the performance on the eICU dataset is quite similar to that on the MIMIC-III v1.3 dataset (as shown in Fig. 9).
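The averaging procedure described above can be sketched as follows. This is an assumed re-creation in Python rather than the JPBC/JCE setup used in the experiments: the modulus (the Mersenne prime 2^521 - 1) and the helper `average_ms` are hypothetical choices, and bilinear pairing is omitted because the standard library offers no pairing implementation.

```python
import secrets
import time

MODULUS = 2**521 - 1   # a Mersenne prime, chosen here only for convenience
TRIALS = 2000          # same number of repetitions as in the text


def average_ms(operation, trials: int = TRIALS) -> float:
    """Run `operation` `trials` times and return the mean time in ms."""
    start = time.perf_counter()
    for _ in range(trials):
        operation()
    return (time.perf_counter() - start) * 1000 / trials


base = secrets.randbelow(MODULUS - 2) + 2
exponent = secrets.randbelow(MODULUS - 2) + 2
other = secrets.randbelow(MODULUS - 2) + 2

exp_ms = average_ms(lambda: pow(base, exponent, MODULUS))
mul_ms = average_ms(lambda: base * other % MODULUS)

print(f"modular exponentiation: {exp_ms:.4f} ms on average")
print(f"modular multiplication: {mul_ms:.4f} ms on average")
```

Absolute numbers from such a harness depend on the machine and the library, so they are not comparable with the figures reported in Table 3.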
In conducting the comparison, we found it difficult to find schemes with similar security properties and performance metrics, especially in an open database environment, for a fully fair comparison. We acknowledge that the system proposed by Zhang et al. [42] (SSH) and the system proposed by Maffei et al. [28] (GORAM) have similar security properties to ours and could be considered candidates for comparison. However, we would like to point out that [42] mainly relies on aggregate signature and an anonymous CP-ABE technique, while [28] (GORAM) does not use similar attribute-based techniques and signature schemes but instead uses batched zero-knowledge proofs of shuffle and an accountability technique based on chameleon signatures. Our scheme, on the other hand, is based on multi-signature and multi-level CP-ABE. Therefore, a focused comparison with [42] would be fair since our scheme and [42] share similar techniques, although the details of our schemes differ significantly.
Performance benchmarking in terms of computation, communication and storage cost at DO, DU, SP, TDB’s side
Table 3 shows the cost at data owner, data user, service provider and database side for signing, verification, and encryption and decryption. Firstly, the signing and verification algorithms adapted in our model outperform other relevant algorithms in the state-of-the-art schemes [7,24,42]. This is because only one exponentiation operation is required when an entity signs/verifies the message (the average CPU time for 2000 trials is approximately equal to only 2.34

Performance comparison based on computation, communication and storage cost of MIMIC-III dataset (using simulation parameters specified in Table 3). Note that in figure (a): signing and figure (e): signature, there is some overlapping between two different results due to the same operation time.

Performance comparison based on computation, communication and storage cost of eICU dataset (using simulation parameters specified in Table 3). Note that in figure (a): signing and figure (e): signature, there is some overlapping between two different results due to the same operation time.
Finally, we analyse the communication and storage costs of the proposed protocol. As mentioned above, the access structure used by E-Tenon is designed in an aggregated manner, and the cost of our scheme in terms of communication and storage is optimised by eliminating duplicate attributes. This implies that the ciphertext size in the E-Tenon system is shorter than other schemes with a series of separate access structures. However, our protocol requires an extra round of communication during the signing process compared to other schemes, a trade-off for supporting concurrent signing in the multi-user environment, as pointed out in MS-BN [5]. That being said, the size of our signature is only
Conclusion and future work
This paper proposed an efficient privacy-preserving open data-sharing scheme for a secure EHR system. The idea of keeping most of the data open without compromising security and privacy is considered a novel attempt in this field. Moreover, we presented in detail the effective integration of two promising technologies in our E-Tenon system: ML-ABE and Multi-Signature, to protect the security of EHR and patient privacy. Our solution exploits the advantages of ABE for key management and multiple signatures for protecting the authenticity and integrity of EHR. The multi-level security supported by ML-ABE allows us to protect the relationships between EHR blocks independently with different levels of security, where only legitimate DU with appropriate attributes can decrypt a certain number of pointers and sensibly join the open data. These not only improve the security of EHR but also grant patients the ability to share EHRs efficiently. In addition, with the formal security analysis, our solutions have been proven to be capable of preventing a range of possible security attacks. Finally, we have analysed the costs and performance of the E-Tenon system in various aspects. The simulation results show that our E-Tenon system does not compromise security properties while maintaining promising efficiency and flexibility.
