Abstract
The Gated Recurrent Unit (GRU) is widely applied in sentiment analysis, speech recognition, and other sequential data processing tasks. For efficient prediction, a growing number of model owners deploy their trained GRU models through machine-learning-as-a-service (MLaaS). However, deploying a GRU model in the cloud raises privacy issues for both model owners and prediction clients. This paper presents the architecture of PrivGRU and designs privacy-preserving protocols for secure inference. The protocols comprise base protocols and principal protocols. Base protocols define basic linear and non-linear computations, while principal protocols construct the gating mechanisms of the GRU. The main benefit of PrivGRU is that it addresses privacy problems while retaining the efficiency and convenience of MLaaS. The entire secure inference is performed on shares and satisfies two security properties: correctness and privacy. To prove security, this work adopts the Universal Composability (UC) framework with the honest-but-curious corruption model. Because each protocol is proved to UC-realize its ideal functionality, the protocols can be composed arbitrarily. This strong security feature makes PrivGRU more flexible and practical for future implementation.
Introduction
Machine learning algorithms of many types have been widely implemented in fields such as finance, healthcare, and security [2]. With the emergence of deep learning, the Deep Neural Network (DNN) has achieved promising results in various applications. DNNs include three main types: the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Multi-Layer Perceptron (MLP). CNNs have been very successful in image-related tasks but have shortcomings when processing sequential data. In natural language processing tasks, a CNN has difficulty learning contextual information precisely because natural language has variable length [4]. An RNN can handle text sequences of variable length in natural language processing tasks and is equipped with an internal memory mechanism.
In 2014, the GRU, one of the improved versions of the RNN, was introduced in [12] to reduce the memory consumption and improve the efficiency of long short-term memory (LSTM). In addition to natural language processing (NLP) tasks, other applications also adopt the GRU instead of the CNN. Many works have shown that the GRU performs better than other neural networks in specific fields, such as sentiment analysis [4], spam detection [30], traffic flow prediction [17], and even malware classification [1].
To bring trained models into practical use, machine-learning-as-a-service (MLaaS) has gained considerable attention in recent years. Its low deployment cost and high computational performance appeal to model owners. However, while model owners benefit from efficient computation through MLaaS, they risk leaking their intellectual property, such as valuable model weights. On the other hand, although it is convenient for clients to upload their data and obtain prediction results, the data may contain confidential or private information that individuals or companies do not want to reveal. In malware classification, for example, the malware data may contain confidential information about a security breach or hacker campaign that the affected individuals or companies wish to keep secret. Therefore, the main purpose of this paper is to eliminate data privacy concerns when deploying a GRU model in the cloud.
This paper presents PrivGRU, a system implementing privacy-preserving inference for the GRU. To ensure the privacy of data and model weights, the real values are additively shared between parties so that no individual party holds the private data. PrivGRU extends the applicability of SecureNN [35], which has been shown empirically to be the most efficient state-of-the-art work on privacy-preserving neural networks. The protocols in SecureNN [35] were originally designed mainly for the CNN. Because the GRU is composed of different gates with various linear and non-linear computations [41], this paper designs base protocols for basic computations and principal protocols for the GRU gating mechanisms. Base protocols include
In summary, the main contributions are as follows: (1) we present the PrivGRU architecture to resolve the tension between data privacy and efficiency when deploying a trained GRU model in the cloud; (2) we design base protocols and principal protocols for secure inference of a GRU model using additive secret sharing; (3) we show that our protocols UC-realize the corresponding ideal functionalities and infer that an arbitrarily composed GRU model is also UC-secure.
The remainder of this paper is organized as follows. Section 2 reviews previous work on privacy-preserving machine learning methods. Section 3 describes preliminaries on the GRU, additive secret sharing, and the UC framework. The architecture of PrivGRU is presented in Section 4, and the privacy-preserving protocols are explained in Section 5. Section 6 presents the security analysis. Finally, the last section concludes the paper and suggests ideas for future work.
Related works
Privacy-preserving machine learning techniques
With the development of cloud computing, protecting sensitive data has become an important issue [9, 38]. Privacy-preserving computation in particular has been studied extensively, using either perturbation methods or cryptographic protocols. Differential privacy [16] is one of the main perturbation methods for privacy-preserving machine learning and can be further categorized into output perturbation and objective perturbation mechanisms [21]. The former add noise to data before publishing, while the latter use a noisy function as the machine learning model. However, both methods may decrease the accuracy of data mining or machine learning [37].
Cryptographic protocols can preserve the original model accuracy. State-of-the-art research tends to mix different cryptographic primitives to support linear and non-linear computations. Commonly adopted techniques fall into three categories: homomorphic encryption, secret sharing [5, 34], and garbled circuits [39]. Garbled circuits are widely used for securely computing non-linear functions in machine learning algorithms [22, 32]. Although garbled circuits achieve secure non-linear function evaluation, their heavy computation makes them inefficient. SecureNN [35] improves the efficiency of non-linear computations by replacing garbled circuits with secret sharing. In most of these papers, the authors implement homomorphic encryption [22, 25] or secret sharing [25, 35] for secure linear computations.
State-of-the-art privacy-preserving neural networks
Many state-of-the-art privacy-preserving neural networks implement the operations of the CNN and perform well on MNIST data for hand-written digit recognition [25, 35]. However, for text representation learning scenarios such as NLP, the RNN is often adopted. In [40], the authors conduct a systematic comparison of the CNN and RNN on a wide range of NLP tasks and conclude that the more important the semantics of the sentence are, the more suitable the RNN is [19, 33]. In addition, RNN algorithms are reported to be the backend techniques of Apple Siri [8] and Google voice transcription [11].
In [13], the authors present a framework for privacy-preserving detection of hate speech in text messages with secure multi-party computation. However, they mainly design the protocols for Logistic Regression (LR) or AdaBoost models rather than the GRU. Other papers focusing on text representation learning evaluate the GRU on differentially private data to ensure privacy [3, 23]. With ε-differential privacy, the parameter ε must be chosen to balance utility against privacy protection. To the best of our knowledge, there is still little research on designing cryptographic protocols for a privacy-preserving GRU.
Because efficiency and accuracy are important factors for future implementation, this paper presents PrivGRU using additive secret-sharing methods. The protocols in SecureNN [35] are mainly designed for three layers of the CNN (convolutional layer, pooling layer, and fully-connected layer) and the ReLU activation function. The GRU has different requirements for secure protocols, which are not defined in SecureNN [35]. Therefore, our main purpose is to design suitable building blocks for the GRU that preserve privacy in an MLaaS setting.
Preliminaries
Gated recurrent unit (GRU)
The GRU is one of the variants of the RNN [12]. Its designers propose a new type of hidden unit controlled by an update gate and a reset gate. The hidden unit remembers context information in sequential data, and the GRU architecture addresses the vanishing gradient problem. A complete GRU layer is composed of a series of units, and Eq. 1 to Eq. 4 describe the basic components within the j-th unit. Initially, the current feature and the previous hidden state serve as the unit input for calculating the current hidden state. The update gate
The equation of the reset gate is similar in form to that of the update gate, but its usage and weights differ. The reset gate is used in calculating the current memory content
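The data flow through one unit can be sketched in plaintext as follows. This is a minimal illustration that assumes the formulation of [12] without bias terms, in which the new hidden state is the combination z * h_prev + (1 - z) * h_tilde; some formulations swap the roles of z and 1 - z. The function names and the parameter keys (W_z, U_z, W_r, U_r, W_h, U_h) are hypothetical and are not taken from this paper's notation.

```python
import math

def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def gru_unit(x, h_prev, p):
    """One plaintext GRU step (assumed formulation, no bias terms).

    x      : current feature vector
    h_prev : previous hidden state
    p      : dict of weight matrices with illustrative key names
    """
    # Update gate: how much of the previous state is carried over.
    z = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h_prev))]
    # Reset gate: how much of the previous state enters the memory content.
    r = [sigmoid(a + b)
         for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h_prev))]
    # Current memory content: Tanh over the input and the reset-scaled state.
    rh = [ri * hi for ri, hi in zip(r, h_prev)]
    h_tilde = [math.tanh(a + b)
               for a, b in zip(matvec(p["W_h"], x), matvec(p["U_h"], rh))]
    # Activation of the current unit: element-wise blend of old and new.
    return [zi * hi + (1.0 - zi) * ci
            for zi, hi, ci in zip(z, h_prev, h_tilde)]
```

The sketch makes the split between linear steps (matrix-vector products, additions) and non-linear steps (Sigmoid, Tanh, element-wise products) visible, which is the split the secure protocols have to handle.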
Additive secret sharing
Secret sharing was introduced by Shamir [34] and Blakley [5] in 1979. It has been used in many scenarios, such as bitcoin signatures [36], digital rights management [26], and privacy-preserving machine learning [25, 35]. The computations of the GRU can be divided into linear and non-linear computations. For linear computations, addition and subtraction between secrets, or between a secret and a plaintext value, can be performed locally by the share owners, whereas multiplication or division between shares requires secure computation protocols [35]. For non-linear computations, neural networks require different activation functions. Based on the gating mechanisms of the GRU, this paper designs privacy-preserving protocols for Sigmoid and Tanh activation using 2-out-of-2 additive secret sharing.
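The local operations on 2-out-of-2 additive shares can be illustrated with a short sketch. The 64-bit ring size and all function names are assumptions made for this example only; the sketch is not the paper's protocol specification.

```python
import secrets

L = 64            # ring bit-length; an assumption for this sketch
MOD = 1 << L      # arithmetic is done modulo 2^L

def share(x):
    """Split x into two additive shares whose sum is x modulo 2^L."""
    s0 = secrets.randbelow(MOD)
    s1 = (x - s0) % MOD
    return s0, s1

def reconstruct(sh):
    """Recombine the two shares into the secret."""
    return (sh[0] + sh[1]) % MOD

def add_shares(a, b):
    """Secret + secret: each party adds the shares it holds, locally."""
    return ((a[0] + b[0]) % MOD, (a[1] + b[1]) % MOD)

def sub_shares(a, b):
    """Secret - secret: each party subtracts the shares it holds, locally."""
    return ((a[0] - b[0]) % MOD, (a[1] - b[1]) % MOD)

def add_public(a, c):
    """Secret + plaintext constant: only one party (party 0 here) adds c."""
    return ((a[0] + c) % MOD, a[1])
```

None of these operations requires communication between the share owners, which is why only multiplication, division, and the activation functions need dedicated protocols.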
Universal composability (UC) framework
The UC framework was proposed by Canetti et al. [6, 7] and gives very strong definitions of security that support the modularity of protocols. The main difference between the UC framework and a stand-alone model such as the simulation paradigm lies in the concurrent interaction during protocol execution. The UC framework considers threats coming from the execution environment, any malicious requests from the adversary, and concurrent execution with other protocols. An additional entity, the environment, is added as a distinguisher to determine whether the protocols are UC-secure. To prove a protocol
Protocols that UC-realize their corresponding functionalities can be securely combined with other UC-secure protocols regardless of the environment or other concurrently executing processes. The formal description of the universal composition theorem is presented in Definition 3 [7]. Typically, the universal composition operation can be viewed as "subroutine substitution" [7]. If
PrivGRU architecture
This paper builds on the techniques in [35] and designs protocols specifically for the GRU. The architecture of PrivGRU is shown in Fig. 1, which takes sentiment analysis as an example to explain the secure inference process. There are three main roles: the model owner, the prediction client, and the secure inference servers in the cloud. Most of the computations in PrivGRU are large matrix computations, and each element of the matrices is represented over the ring

Fig. 1. High-level architecture of PrivGRU.
A basic GRU model may contain the embedding layer, GRU layer, and fully-connected layer shown in Fig. 2. The trained weights from these layers are generally denoted as

Fig. 2. Atomic operations of GRU secure inference.
When prediction clients need a model prediction, they convert the documents to a mathematical representation and generate the shares
Secure inference servers in cloud
The secure inference is based on three independent servers denoted as
Privacy-preserving protocols
In this section, four privacy-preserving protocols are introduced as base protocols:
Base protocols
Hadamard product
This paper designs two protocols for securely evaluating element-wise matrix multiplication. The difference lies in the inputs of protocol. In
Comparison of Π_HP1 and Π_HP2.
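As an illustration of how an element-wise product can be evaluated on additive shares, the following sketch uses Beaver multiplication triples supplied by an assisting party. This is a standard technique for secret-shared multiplication and is shown here under stated assumptions; it is not claimed to be the exact definition of Π_HP1 or Π_HP2. The 64-bit ring and all function names are hypothetical.

```python
import secrets

MOD = 1 << 64     # ring size; an assumption for this sketch

def share_vec(v):
    """Additively share every element of a vector between two parties."""
    s0 = [secrets.randbelow(MOD) for _ in v]
    s1 = [(x - r) % MOD for x, r in zip(v, s0)]
    return s0, s1

def reconstruct_vec(s0, s1):
    """Open a shared vector by adding the two share vectors."""
    return [(a + b) % MOD for a, b in zip(s0, s1)]

def sub_vec(u, v):
    """Element-wise subtraction modulo the ring size."""
    return [(a - b) % MOD for a, b in zip(u, v)]

def helper_triples(n):
    """Assisting party: sample a and b, set c = a * b element-wise, share all."""
    a = [secrets.randbelow(MOD) for _ in range(n)]
    b = [secrets.randbelow(MOD) for _ in range(n)]
    c = [(x * y) % MOD for x, y in zip(a, b)]
    return share_vec(a), share_vec(b), share_vec(c)

def hadamard_shared(x_sh, y_sh):
    """Return shares of the element-wise product of two shared vectors."""
    n = len(x_sh[0])
    a_sh, b_sh, c_sh = helper_triples(n)
    # Each party masks its shares; only the masked values e and f are opened.
    e = reconstruct_vec(sub_vec(x_sh[0], a_sh[0]), sub_vec(x_sh[1], a_sh[1]))
    f = reconstruct_vec(sub_vec(y_sh[0], b_sh[0]), sub_vec(y_sh[1], b_sh[1]))
    z_sh = []
    for j in (0, 1):
        # Party j computes its output share locally from public e, f.
        z_sh.append([
            (-j * ei * fi + xj * fi + ei * yj + cj) % MOD
            for ei, fi, xj, yj, cj in zip(e, f, x_sh[j], y_sh[j], c_sh[j])
        ])
    return z_sh[0], z_sh[1]
```

Summing the two output shares gives -e*f + x*f + e*y + a*b, which simplifies to x*y once e = x - a and f = y - b are substituted. The opened values e and f are masked by the uniformly random a and b, so they reveal nothing about x or y on their own.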
Sigmoid activation function
The Sigmoid activation function is one of the non-linear functions in the GRU layer. Both the update gate and the reset gate apply the Sigmoid activation function to determine how much of the previous data is kept or discarded. It transforms input values into the interval (0, 1).
Tanh activation function
The computation flow of the Tanh activation function is similar to that of the Sigmoid activation function. The difference is that the Tanh activation function transforms values into the interval (-1, 1) instead of (0, 1).
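The close relationship between the two activations can be made concrete with the standard identity tanh(x) = 2 * sigmoid(2x) - 1. The sketch below checks this identity in plaintext. Whether the Tanh protocol in this paper is built on the identity is not stated in the text, so the sketch should be read only as an illustration of why the two computation flows can look alike; the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic function with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """Tanh expressed through Sigmoid: scale the input by 2, then map
    the (0, 1) output affinely onto (-1, 1)."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

def max_identity_error(points):
    """Largest absolute gap between the identity and math.tanh."""
    return max(abs(tanh_via_sigmoid(x) - math.tanh(x)) for x in points)
```

The only operations added on top of Sigmoid are a multiplication by a public constant and a subtraction of a public constant, both of which are local operations on additive shares.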
Principal protocols
With the base protocols for the GRU defined, the principal secure inference protocols can be constructed on top of them. Protocols 5 to 8 specify secure inference within one unit of the GRU, and the GRU layer is composed of a sequence of such units.
Update gate and reset gate
The privacy preserving protocols of update gate and reset gate are demonstrated in Protocol 5 and Protocol 6 respectively. Both Protocol 5 and Protocol 6 are (
Current memory
Protocol 7 is a
Activation of current unit
Protocol 8 is a
Putting it all together
In a unit of GRU,
Security analysis
This paper uses the UC framework [6, 7] to prove the security of the privacy-preserving protocols, which should conform to the definitions given in Section 3.3. UC is the strictest simulation-based proof approach and allows any modular composition and concurrent execution. Because PrivGRU is a framework that defines protocols for atomic operations, a real-world implementation may compose the protocols in different ways. By the universal composition theorem (Definition 3), the privacy-preserving protocols can be arbitrarily composed and still remain UC-secure. This paper defines security through two requirements: correctness and privacy [24]. Correctness is proved by comparing the reconstructed protocol result with the output of the corresponding ideal functionality. Privacy is established by showing that the environment's view of the real-world execution is indistinguishable from its view of the ideal-world execution.
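The two requirements can be illustrated for the underlying sharing primitive alone with an exhaustive check over a tiny ring. This is a toy illustration and not a UC proof: it shows that reconstruction always returns the secret (correctness) and that the distribution of a single party's share is the same for every secret (privacy against one honest-but-curious party). The ring size of 8 and the function names are assumptions chosen to make enumeration feasible.

```python
from collections import Counter

MOD = 8   # tiny ring so that all secrets and all randomness can be enumerated

def share_with_randomness(x, r):
    """Deterministic sharing of x given the random choice r."""
    return r % MOD, (x - r) % MOD

def view_distribution(x, party):
    """Distribution of one party's share over all random choices."""
    return Counter(share_with_randomness(x, r)[party] for r in range(MOD))

def check_correctness():
    """Reconstruction equals the secret for every secret and every r."""
    return all(sum(share_with_randomness(x, r)) % MOD == x
               for x in range(MOD) for r in range(MOD))

def check_privacy():
    """Each party's view has the same distribution for every secret."""
    reference = [view_distribution(0, p) for p in (0, 1)]
    return all(view_distribution(x, p) == reference[p]
               for x in range(MOD) for p in (0, 1))
```

Because each party's view is identically distributed regardless of the secret, a simulator can reproduce that view by sampling a uniform ring element, which is the intuition behind the indistinguishability argument.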
Security model
In this work, honest-but-curious model (semi-honest adversary) is adopted to simulate the corruption in the UC framework [6, 18]. Especially,
Security of base protocols
Second, the privacy of
Security of principal protocols
For the proof of privacy, the subroutine calls are perfectly simulated using Definition 3.3, and no other messages are transmitted. If
Security of PrivGRU
The PrivGRU model is composed of base protocols and principal protocols. The architecture of PrivGRU need not be fixed. Because each individual protocol has been proved UC-secure, the network implementation can vary by arbitrarily combining these protocols. The security of PrivGRU can then be inferred using the universal composition theorem. This security guarantee increases the practical value of this work.
Conclusions and future works
This paper presents PrivGRU, a framework for a privacy-preserving GRU that can be deployed in the cloud. The security of each protocol has been proved in the UC framework; therefore, PrivGRU preserves the privacy of both model owners and prediction clients under arbitrary composition of the UC-secure protocols. This flexibility makes PrivGRU useful in many practical scenarios, as people increasingly value their privacy. In future work, we will extend PrivGRU with secure training protocols and conduct experiments as a proof of concept [28, 37].
Acknowledgment
This research was supported by the Ministry of Science and Technology, Taiwan (ROC), under Project Numbers MOST 108-2218-E-004-001-, and by Taiwan Information Security Center at National Sun Yat-sen University (TWISC@NSYSU).
