Abstract
Cloud computing has become ubiquitous, offers an economical solution for convenient on-demand access to computing resources, which enable the resource-constrained clients to execute extensive computation. However, outsourcing of data and computation to the cloud server is a great cause of concern, such as confidentiality of input/output and verifiability of the result. This paper addresses the problem of designing outsourcing algorithm for linear regression analysis (LR), which is an important data analysis technique and widely applied across multiple domains. The outsourcing framework illustrated by the following scenario: a client is having a large dataset and needs to perform regression analysis, but unable to process due to lack of computing resources. Therefore, the client outsources the computation to the cloud server. In the proposed LR outsourcing algorithm, the client outsources LR problem to the cloud server without revealing to them either the input dataset and the output. The algorithm is a non-interactive solution to the client, it sends only input and receives output along with the proof of verification from the cloud server. The client in the proposed algorithm able to verify the correctness of result with an optimal probability. The analytical analysis shows that the algorithm is successfully meeting the challenges of correctness, security, verifiability, and efficiency. The experimental evaluation validates the proposed algorithm. The result analysis shows that the algorithm is highly efficient and endorses the practical usability of the algorithm.
Introduction
The rise in the popularity of cloud computing and mobile devices such as smartphones, notebooks, and other handheld battery operated devices provide an ideal situation where a weak device outsources its computation to more powerful cloud servers. This ideal condition is a strong motivation for researchers to give outsourcing solution for more real world applications. These resource-constrained devices have strong desire to execute computationally intensive task; that makes the outsourcing of computation to the cloud server a promising solution. The outsourcing paradigm enables resource-constrained clients to execute large computation task by offloading their computation load to massive cloud servers. Due to easy availability of cloud servers, the clients are no longer restricted to their limited CPU, storage, and bandwidth; else they are leveraging the abundance of computing resources due to seamless access to the cloud servers [1]. Moreover, outsourcing provides significant economic benefit to the clients. It brings down the capital and operational cost [2, 3]. Suppose somebody wants to analyze seismic, astronomic, or financial data. These datasets are often gigantic and involve large computations. Maintaining data center for computation requires critical human resources and significant capital investment [4]. Therefore, owning a data center incurs significant capital and operation cost to the client [5]. Hence, outsourcing the computation to third party cloud server is a beneficial option for a client [6, 7]. Despite the tremendous advantage, this promising paradigm brings many security and privacy concerns such as confidentiality of data (input and output) and integrity of result which makes a client reluctant to outsourced its computation on the cloud server. The client’s data is confidential in nature. It may contain information of personal, medical, financial, trends of stock, scientific research records, etc. Therefore, the data needed to be encrypted to maintaining the confidentiality and integrity before being outsourced. One way to address this security concern is to apply an encryption scheme, but the tradition encryption scheme would not work out since performing meaningful computation on cipher is very difficult. The second concern is correctness of result because the operation performed inside the cloud is non-transparent. A client cannot trust the result computed by the cloud server. The cloud may return wrong result due to a flaw, bug in the logic or may intentionally deviate from the algorithm instructions (malicious cloud). Therefore, there is no guaranty on the integrity of the result. Therefore, an outsourcing algorithm must be proficient in providing privacy to confidential data (input and output) and verifying the correctness of the result. Other important challenges are correctness and efficiency of the outsourcing algorithm. The client system needs to perform some pre-processing such as the transformation operation for preserving the input and output privacy, verification to check its correctness and finally the retransformation to obtain the final result. Therefore, the computation performs on the client-side system (viz., transformation, verification, and retransformation) must be substantially lesser. Otherwise, the outsourcing is not beneficial [8–10]. Thus an outsourcing algorithm must satisfy the following four design goals: correctness, security, verifiability, and efficiency [11].
Linear regression analysis is an important data analysis technique, which has application in many domains such as financial modeling (for the prediction of stock price based on previous rates) [12, 13], weather prediction (prediction of weather to take correct decision of on the type of crops to maximize the yield) [14], machine learning (for training and prediction of dataset) [15], sensor node deployment (help to analyses the correct positions of sensor nodes) [16], traffic management [17], to just list a few. Thus, a variety of customers from many domains needs a solution for LR problem. Moreover, when the client is resource-constrained and LR deals with a large dataset, a useful option is to outsource the LR problem to the cloud server. Even if the dataset is on moderate scale, for the resource-constrained clients such as mobile phones, laptops, portable hand-held devices, performing such significant execution is a huge job. Therefore, outsourcing of LR is a convenientoption.
A typical regression model is shown as the linear function Y = Xβ, where (X) is the observed variable, (Y) is the dependent variable and (β) is the regression parameter. The value of the regression parameter (β) makes the linear function to best fit for the dataset D (X : Y). However, computing the value of (β) is a computation intensive problem since the procedure requires to perform matrix multiplication and inversion operation, that has O (n3) complexity and when LR is applied on a large data set memory is another bottleneck. The wide applicability in various domains of LR problem, its computation complexity and the inability of resource-constrained clients to execute such problem motivates us to design privacy-preserving and efficient outsourcing algorithm in a malicious cloud environment. The paper offers a solution for the linear regression problem via a completely new approach, the client outsources the complex computation of LR to the cloud server, while itself maintaining the privacy of input/output, result verifiability. The main idea of this algorithm is to apply an efficient linear transformation technique on the input dataset. The transformation must able to preserve the input/output privacy and also allows the cloud server to perform meaningful operation. In this way the cloud server is entirely unaware of “what exactly is computed” on its hardware. Further, the result verification step should not involve any complex operation. The proposed algorithm ensures the correctness of result. The verification phase is very efficient and requires only a matrix-vector multiplication. The verification phase enables the client to detect server misconduct with an optimal probability. Finally, the client performs retransformation to obtain the original solution ofLR problem.
The contribution of this paper summarize as follows: Privacy preserving transformation technique has been introduced, which provide security to the input/output dataset. It is not revealing any sensitive information to the cloud server. It allows the cloud to perform an operation (regression analysis) on the transformed dataset. It is also suitable for any permitted dimension of (X(m*n), Y(m)). The client in the proposed algorithm verifies the correctness of result with an optimal probability of 1. The client is saving expensive computation of result retransformation if the result obtained from cloud server fails the verification test. The client able to verify encrypted result. Further, the verification algorithm is efficient and not involves in any complex computation. The paper discusses all nuances and the methods in designing of the outsourcing algorithm. The analytical and experimental performance demonstrates the practical usability of the proposed LR outsourcing algorithm in a malicious cloud environment.
The rest of the paper is organized as follows: Section 2 presents the related work done in the area of secure computation. Section 3 provides a detail discussion of the regression analysis outsourcing problem, its mathematical elements, and system model to solve them. Section 4 carries out the discussion of proposed regression analysis outsourcing algorithm and all nuances, and the methods require in designing outsourcing algorithm. The analytical analysis of the proposed regression analysis outsourcing algorithm on correctness, security, verifiability and efficiency parameter presented in Section 5. Section 6 presents the experimental analysis of the proposed regression analysis outsourcing algorithm. Finally, concluding remark and future direction incorporated in Section 7.
Related work
In literature, there are various algorithms available for secure outsourcing of various core problems of linear algebra. These secure outsourcing algorithms are divided into two parts, which are semi-trusted computing model and untrusted computing model. A detail classification of secure outsourcing algorithms is presented Fig. 1. In the semi-trusted environment, the cloud server follows the algorithm instructions and produces the correct result, but the cloud server secretly records all information, which it has access and attempt to retrieve secret information. In the semi-trusted environment, the first one is audit based [18–20], the client or the trusted workers in audit-based approach recomputes some part of computation done by the untrusted workers. This approach is infeasible for a computationally weak client because if the client is capable of performing such computation, then there is no need to take help from the cloud server. This method also requires that some workers must be honest or at least non-colluding in nature. The second one is secure-co-processors [21–23], or Trusted Platform Modules (TPMs) method, which requires the deployment of trusted hardware on the server to provide an isolated execution environment. The last one is multiparty computation; the computation is being divided among two or more workers without allowing any participated worker to view another individual’s private data. The resultant of computation is the union of the output of allworkers.

Taxonomy of secure computation.
The server can deviate from the algorithm instructions and behave arbitrarily in untrusted model (malicious), the solution to this problem is to verify the correctness of result in addition to the secure computation. The verifiable outsourcing is of two types that are interactive proof and non-interactive proof. In interactive proof, a weak verifier actively challenges a server (prover). The prover replies a probabilistic proof to the client (verifier) to convince him (verifier) of the truth of the statement that he is unable to compute [24–26]. The last one is the non-interactive approach, where a weak client outsources the computation to a powerful server, the server returns the result along with the proof of verification [9, 27–37].
The proposed solution for LR applies non-interactive proof to provide verifiability of computation. Therefore, the attention is more on non-interactive verifiable algorithms. In the same direction, Gentry’s provides a novel Fully Homomorphic Encryption (FHE) scheme [38]. This scheme allows the server to perform arbitrary computation on encrypted input, but the solution suffers from complexity and inefficiency that makes it far from practical use and it does not guarantee that the server performs correct computation [10, 39]. Gennaro et al. [27] have proposed non-interactive proof based solutions for secure outsourcing of polynomial function. First, they model the polynomial function into Yao’s grabbled circuit [40, 41]. Then encrypt this circuit homomorphically [42] and send it to the cloud server for the execution of the polynomial function. The cloud server carries out the execution & returned a computationally sound non-interactive proof that can verify in O (m) time. Another elegant solution for secure outsourcing proposed by Chung et al. [43]. The basic idea of this algorithm is that the client need to carryout pre-processing and creates hundreds of problem instances mostly of same type. Then apply homomorphic encryption for privacy. The cloud server computes these functions without knowing the actual inputs. Finally, the client verifies the solution of same type of problem for ensuring their correctness. The main drawback of these two algorithm is that they incur huge computation load on both client and cloud server due to complex homomorphic encryption. These algorithms are also required to modeling the problem into the circuits, which needs to deal with a large number of parameters. However, the main advantage of these methods is, that they need constant time for verification. Unlike the generalized approach to solving the problem, there are many algorithms available which address specific problems. Lei et al., [33, 44] presents matrix multiplication and matrix inversion outsourcing algorithm. They applied monomial matrix based transformation method for preserving the privacy of input/output matrix in both the outsourcing algorithm. These algorithms efficiently outsource the matrix multiplication and matrix inversion to the cloud server, while maintaining the correctness, privacy of the input/output, and verifies the result efficiently. An algorithm address linear programming (LP) [45]. This method split the LP solver into two parts; those are the LP solver on cloud and the LP parameters on client-side. The client transforms the input data using a transformation technique and outsources to the LP solver on a cloud server. Then cloud server solves the linear programming problem with the help of LP solver and returns the result. The client verifies the result using the fundamental theorem of duality. Further, an algorithm is also presented for LP problem, which reduces the computation and communication time significantly [9]. In [46], Blanton et al. addressed the large scale computation of biometric computation. The implementation leverages individual structures of distance computation and random sampling. The result verification method can verify the result with modest overhead and high probability.There are many algorithms published recently for the “system of linear equation” [9, 47]. The algorithm proposed by [32] requires (n) round of communication between the client and cloud before the client get the convergent solution. The algorithm encrypts the input in O (n2) in setup phase and verify the output in O (n2). The (n) round of communication between the client and the cloud is the main drawback of this method, which increases the communication overhead and add security vulnerability. Next [9], an algorithm identified a new solution for linear equation which employs some special linear transformations using diagonal matrices. This algorithm requires one round of communication between the client and the cloud. Further, an algorithm [11] utilizes the sparse matrix to propose a new secure outsourcing algorithm of large-scale linear equations in the fully malicious model. The algorithm requires only O (1) round of communication between the client and cloud server. In [34] Chen et al., proposes an outsourcing algorithm for modular exponent in malicious cloud model. Further, they use this algorithm as a sub-routine for outsourcing Carmer-Shoup encryption and Schnorr signature. They further extend the algorithm for outsourcing of simultaneous modular exponent. Recently, in [35] Lei et al. address determinant computation using block matrix and monomial based transformation operation for a malicious cloud environment. The outsourcing algorithm is an efficient implementation of determinant computation which reduces the complexity of computing determinant from O (n2.373) to O (n2). Our work is an extension of these algorithm that address an important data analysis problem regression analysis via a completely new transformationoperation.
System model
The system model for secure computation outsourcing algorithm for regression analysis is shown in Fig. 2. The client wants to analyze large data set to make a regression model, but due to lack of computing resources, the client is unable to perform such analysis. Therefore, the client outsources the computation to the cloud servers. The client first performs a linear transformation on the input problem φ (X, Y) using a secret key (k), the operation transform the problem into φk (X′, Y′). Then, the transformed regression analysis problem outsourced to the cloud server. The cloud servers are expected to compute correct result for the outsourced regression problem (φ k ). However, for the assurance of correctness of the result, the client verifies the result received from the cloud server. The client first verifies the encrypted result received from the cloud server. If the client found that the result received from the cloud server is correct, then only the client performs a retransformation to obtain decrypted result.

System model for secure outsourcing of linear regression.
The security threat in secure outsourcing system model comes from the suspicious behavior of the cloud server. The previous work for the secure outsourcing computation defines three types of threat model that are “Trusted Model,” “Semi-Trusted Model”, and “Untrusted Model” [11, 44].
Outsourcing algorithm framework
A secure and verifiable outsourcing algorithm framework has five sub-steps in the following order: KeyGen, ProbTrans, Compute, Verify, and Retransform. Table 1 presents the notations and symbols used in the paper. KeyGen (1
λ
): The algorithm takes input security parameter λ and generates the key for transformation operation. The key generation step runs for every new problem submission. ProbTrans (φ, k): The client first encrypts the input problem φ (X, Y) with key (k) and generates the transformed problem φ
k
(X′, Y′). Computeφ
k
(X′, Y′): The client outsources the transformed problem φ
k
(X′, Y′)to the cloud server. Then the cloud server carries out the computation for the regression parameter (β′) on the transformed data set. Verify(β′, k): The client verifies the encrypted regression parameter (β′) obtained from the cloud server. Retransform(β′, k): The client retransform/decrypts regression parameter (β′) (if verification step is successfully passed), to obtain the regression parameter (β)
List of common notations and symbols
List of common notations and symbols
Regression analysis is an important data analysis technique. It has applications in modeling and analysis of numerical dataset across many domains [49]. A regression model estimates the values of dependent variable (Y) on the set of observed values (X). A linear relation exists between the observed variable and the estimated variable i.e.
The basic idea of regression analysis outsourcing algorithm is as follows: Let the resource-constrained client needs to perform regression analysis on large dataset D (X : Y), but due to lack of computing resources the client is unable to perform this computation. Therefore, the client outsources the regression analysis problem to the third party cloud server to perform the computation on its behalf, but before outsourcing the LR problem, the client first applies a transformation on the input dataset D (X : Y) to provide privacy. Then the client outsources the transformed dataset D (X′ : Y′) to the cloud server. Thus, for the cloud server, this transformed dataset is same as any other dataset. However, the cloud server could read this dataset but could not recover the original values. In this way, the privacy preserving transformation operation protect the input dataset from the cloud server.
Our goal is to find a new transformation technique for regression analysis outsourcing algorithm. Therefore, an investigation has been performed on matrix property to find a candidate solution. Finally, the research has reached on a conclusion that the inverse matrix may be a possible solution. However, matrix inversion is a complex operation, which makes the transformation process inefficient and hence the outsourcing algorithm. Therefore, the research further proceeds for an efficient solution; the finding is a generalized permutation matrix. A square matrix is generalized permutation matrix if it has only one non-zero entry in each row or column. This particular arrangement of the matrix makes the transformation operation very efficient. Further, the next section presents the discussion of the development of general permutation matrix and the transformationoperation.
This section starts with a discussion on the generation of permutation matrix. The permutation matrix is used as key for the privacy-preserving (transformation operation) method. The analysis of the least square method gives insight and motivation for the design of privacy preserving transformation method for linear regression analysis.
Creation of generalized permutation matrix
This discussion helps us to generate security key that will further use in the transformation operation. The privacy preserving transformation method for linear regression analysis uses generalized permutation matrix. This matrix has one element in each row & column. It takes the only linear time to the size of input for a generation. The development of permutation matrix is inspired by [50]. Cauchy’s permutation function [51] generates random permutations. The permuted values are used to shuffle the matrix values row-wise and column-wise. The Cauchy’s Permutation function can be written as
When Cauchy’s permutation function applies on the row of an identity matrix, it randomizes the indices positions of the rows. The identity matrix with permutated row further masked with positive random numbers γ → {γ1, γ2, … … … γ n } here 0 ∉ γ. In this way we find the final key matrix .
This key matrix has some unique qualities, it is square and non-singular matrix, thus invertible and a matrix multiplication with this matrix and a general matrix cost only O (n2), since the cost of addition get omitted, for simplicity we present the key matrix as Inv1, Inv2 and their inversions as in the paper.
Let, the key matrices are and . We can find their inversion in O (n) time. The inversion are and .
The permuted matrix when multiplied from left-side it performs row-wise permutation, and when it multiplied from right-side, it performs column-wise permutation. Further, each value of the transformed matrix is masked by a multiplicative factor of γ i /δ i .
The final transformed matrix of input X could be written as,
The privacy preserving transformation operation has three objectives. Firstly, it should able to provide privacy to the input and output. Secondly, it allows meaningful computation on the encrypted data. Third, the client in the algorithm could efficiently retransform the output β′ to β without any complexity.
The least square method for the calculation of regression parameter β = (X T X) -1X T Y provide us insight to transform (X) into (X′). When the dateset (X : Y) has transformed into and the . Putting the values of (X′, Y′) in Equation (2).
The transformed regression parameter
This method provides privacy to the input data D (X : Y). But there are too many extra terms in Equation (5), which make difficult to establish a meaningful relationship between (β′) and (β),without a relation between (β) and (β′), we could not perform retransformation to recover the original values (β). Thus, this transformation method incapable to meet the third objective.
Moreover, to reduce the additional term, it is needed to relook on the operation to find a solution to reduce the additional terms. When closely observed, it has concluded that if the transformation is applied in the following order, i.e.,
First, transform the (X) into then transform (X T ) into and finally (Y) into Y′ = Inv1Y, and put these values into the Equation (2), the resulting equation,
Since, the invertible matrix the value of regression parameter will reduce to
Further, the property of invertible matrix that it reverses the order of multiplication, i.e.,
Simplifying the terms, the Equation (8) will further reduce to
There is a visible relationship exists between (β′) & (β), which could efficiently perform retransformation to obtain the original solution of regression parameter (β).
Five sub-algorithms has been developed for the proposed LR outsourcing algorithm, that are KeyGen, ProbTrans, Compute, Verify, and Retransform as follows:
a.
b.
c.
d.
e.
Further, the flow of working on LR outsourcing algorithm presented in Fig. 3. The algorithm has following phases (KeyGen, ProbTrans, Compute, Verify, and Retransform).

Flowchart of LR outsourcing algorithm.
Correctness analysis
The proposed algorithm performs correctly, only if the client and the cloud server follows the algorithm instructions and computed the result correctly. Equation (9) provides the value of (β′). A relation can be easily establish between the regression parameter (β) and the transformed regression parameter (β′).
β′ = Inv2β, so, the value of regression parameter (β) could be computed using
Thus, the proposed outsourcing algorithm is correct, and retransformation using Equation (10) could be performed to obtain the regression parameter (β) from (β′).
The original input dataset D (X : Y) has been transformed to D (X′, Y′) with the help of ProbTrans algorithm. The proposed algorithm is able to provide privacy to the input/output pair only if the cloud server could not recover the original information.
Step 1: First, each entry in original matrix is randomly permuted row-wise & then column-wise using permutation function {π1, π2} whereπ
i
∈ {1, … … … , n} and i∈ { 1, 2 },
Step 2: Each entry in matrix (X, Y) is masked by multiplicative factor
In step 1, each entry is randomly permuted by two permutation functions {π1, π2}, so there are (m) ! * (n) ! cases for each entry in the matrix where each case occur with the probability of 1/(mn) !. The estimated time of brute-force attack on the key space to recover the original matrix (X) is (mn) !/2, which is a non-polynomial bound quantity in terms of (m, n).
In step 2, the matrix produced in step (1), X′ (i, j) is further multiplied by the multiplying factor γ→ { γ1, γ2, …… … γ m }, δ → {δ1, δ2, …… … δ n }, where {0∉ γ ∪ δ }. The approximate time for brute-force to recover the key space (γ & δ) is (|K γ | m * |K δ | n )/2. A large choice of (γ & δ) will sufficiently reduce the chance of guessing (γ & δ). Therefore, the total complexity of this method is |K γ |2m * |K δ | n * (m !) 2n ! .
Further, the proposed algorithm also protects the output result in the same way as input. The cloud server could not recover (β) from (β′). Moreover, the client generates a pair of new securities keys (Inv1 and Inv2) for every new problem submission. Therefore, the proposed encryption system is similar to the one-time-pad encryption system. Therefore, there is no chance of known-plain-text attack or chosen-plain-text attack. Thus, the proposed outsourcing algorithm can provide security to client information and not leaking any information to the cloud server.
In the malicious threat model, the cloud server may deviate from the actual algorithm instructions and return random arbitrary results. Thus, the proposed algorithm must be equipped with the rigorous result verification process, which verifies the correctness of the result. The client receives the regression parameter (β′) from the cloud server. The client already have the transformed matrices (X′, Y′). If the regression parameter (β′) is correctly computed by the cloud server then only the equation (Y′ = X′ * β′) holds true. If the equation (Y′ = X′ * β′) holds truethen,
Further, a mathematical discussion has been presented to prove that a false result never passes the verification test and the client always detect the server misconduct with an optimal probability of one.
The value of could be any real number, but if the server correctly computes the values of (β′), there should be only one value which satisfies Equation (16). So, has only one value, so the probability (Pr = 1/n) where so Pr ≅ 0. So, the server misconduct detected with the probability of (Pr = 1 -1/n) ≅ 1. Thus, the result verification step verifies the result with the probability of (1).
The outsourcing algorithm for regression analysis has two parts, the client-side computation, and the cloud-side computation. The client performs the following sub-operations: the KeyGen, ProbTrans, Verify, and Retransform. However, cloud server runs the complex computation of regression parameter on the transformed input Computeφ k (X′, Y′).
Theoretical performance analysis of regression analysis outsourcing algorithm
Theoretical performance analysis of regression analysis outsourcing algorithm
The experimental analysis of the proposed algorithm is based on the mathematical and theoretical analysis, which has been discussed in the previous sections. The regression analysis outsourcing algorithm is developed using Matlab language version 2014a. The client & the cloud both implemented on a system having same computing capacity. The CPU is Intel® Xenon® CPU E5-2620 v2 @ 2.10 GHz ∼ 8 GB RAM. The reason for implementing client and server on the same node is to show the outsourcing algorithm actual efficiency and performance gain. If there are differences between the computing performance of client and server, then the performance of algorithm will be case specific. However, in reality, the cloud server always has more computing resources than the client does.
To, measure the performance of the algorithm the experimental analysis uses three standard parameters the efficiency, performance gain of the client and the relative extra cost incur by the outsourcing paradigm.
The second parameter is the Performance Gain (PG) for the client “It represents the actual speed-up the client has gained from outsourcing the problem.”
REC: The relative extra cost (REC) is define as the amount of extra work done by the client and cloud server in outsourcing paradigm as compared to direct method. Ideally, REC should be near to zero that is there should be no extra burden incur by the outsourcing paradigm.
Further, the notations use to calculate the performance of the algorithm presented in Table 3.
Terms & Notations
This section presents the experiment analysis of the proposed linear regression analysis outsourcing algorithm. At the very first, the client generates random instances of input to perform experiments on LR algorithm. Then, the client initiates the LR outsourcing algorithm. In the first stage, the client generates a new secret key, after that the client transforms the input using ProbTrans (φ, k) algorithm. This operation transformed the input matrices to D (X′ : Y′).
Computation of problem transformation
Problem transformation cost dominates the client-side computation. This sub-algorithm requires to perform matrix multiplication, but this operation is efficient as compared to executing the regression analysis computation on client system (see details in section 4.2) since the asymptotic complexity of ProbTrans (φ, k) is only O (mn). Table 4, presents the time required to perform problem transformation for different size problem. The size of the problem varies from m * n = 1000 * 500 to 10000 * 4500, and the time required to carry out the transformation of largest size problem in our experiment is 2.672 seconds on the client’s system. This amount of time is much less than performing the LR analysis, the time for LR analysis of execution shown in Table 5.
Client side computation cost of proposed LR outsourcing algorithm
Client side computation cost of proposed LR outsourcing algorithm
Performance analysis of proposed LR outsourcing algorithm
The cloud server performs the computation of regression parameter on transformed input D (X′ : Y′). The cloud server executes the least square method on the transformed input in Equation (6). The cloud server needs to perform some matrix multiplication operations but the elegance (the time complexity of multiplication operation is in the order of O (mn)) of the proposed method makes the cloud-side complexity same as the direct method. The upper bound time complexity of (least square method) direct method for linear regression computation and in the proposed LR outsourcing algorithm is same. As shown in Fig. 4, most of the work is accomplish by the cloud server and very little work is perform by the client for LR outsourcing in order to perform the operations of KeyGen, ProbTrans, Verify, and Retransform.

Execution time comparison between client and cloud.
Once the cloud server finishes the execution of regression analysis problem, it returns the results to the client. The client first verifies the correctness of values of the result, as the procedure discussed in Section 4.3 (Algorithms 5, 6) & Section 5.3. For the correctness of result, the equation (Y′ = X′ * β′) must hold true. The computation cost of verification process is dominated by a matrix vector multiplication, which cost only O (mn + m). Finally, if the result passes the verification process, then only the outsourcing algorithm reaches to retransformation. For retransformation operation the client will carry out . This operation is very efficient, cost only O (n) (linear time) because there are only n elements in and β′ is a column vector of n row. This matrix-vector multiplication requires only the multiplication of n elements.
The proposed algorithm has executed multiple times for each problem instance to get stable system performance. The main performance analysis is presented in Table 5. It has shown in Fig. 5 as the dimension of the problem increases the performance gain also increases. The performance gain is in double-digit for larger LR problem and is able to attain more than 31.98 times client speed-up, which is a very motivating factor to use linear regression outsourcing algorithm in real-world scenarios.

Client performance gain.
It has shown in Table 5, that the efficiency parameter remains close to one, that means outsourcing paradigm add no extra burden on the cloud server for executing of an encrypted problem.
Finally, REC is monotonically decreasing as the size of problem increases that means the combined extra work done by the cloud and client reduces as the problem size increases as presented in Fig. 6. It is shown in Table 5 that the REC parameter stays 0.1624 for (1000*500) problem size whereas 0.0208 for (10000*5000) which 87.19% decrease in relative extra cost.

Relative extra cost.
Interestingly, the experimental performance depends on problem dimension and the underlying execution platform. If the cloud exploits other faster matrix operation algorithms, then client speedup will decrease to some extent. However, as long as the size of the input goes sufficiently large, the client get a performance gain due to the apparent computation gap of O (n) between the client-side computation and server-side computation.
This work presents a privacy preserving, verifiable and efficient outsourcing algorithm for linear regression analysis in a malicious cloud environment. It has shown in the analytical analysis that the proposed algorithm is meeting the design goal of input/output privacy, correctness, verifiability, and efficiency. The proposed LR algorithm requires a one-time amortizable setup phase (KeyGen + Transformation) with O (4mn + m) cost, then each following algorithm such as verification only cost O (mn + m) & Retransformation cost O (n). However, the time complexity for execution of regression analysis problem cost O (n2m + n3 + mn + n2). Therefore, the client has the computational saving of O (n) than the direct algorithm for LR analysis. A study has also conducted in the paper, which investigated the least square method and the algebraic property of the matrix-vector multiplication for the development of an efficient result verification algorithm which verify the correctness of result with an optimal probability. Further, it has shown in the experimental analysis that the proposed algorithm able to achieve an astonishing 31.98 times client speed in the experiment. The efficiency parameter remains close to one which means outsourcing paradigm add no extra burden on the cloud server for executing an encrypted problem. Finally, REC is monotonically decreasing as the size of problem increases that means the combined extra work done by the cloud and client reduces as the problem size increases. Through analytical and experimental performance demonstrated the practical usability of the proposed LR outsourcing algorithm in a malicious cloud environment. In future, it would be remarkable to find newer computationally expensive mathematical, scientific and engineering problems and then designing outsourcing algorithms to solve them.
