Abstract
Adding noise to user history data helps to protect user privacy in recommendation systems but affects the recommendation performance. To solve this problem, a matrix factorization tourism point of interest recommendation model based on interest offset and differential privacy is proposed in this paper. The recommendation performance of the model is improved by analyzing user interest preferences. Specifically, user interest offsets are extracted from user tags and user ratings under time-series factors to calculate user interest drift. Then, similar neighbors are found to train user feature preferences which are integrated into the matrix model in the form of regular terms. Meanwhile, based on the differential privacy theory, a privacy neighbor selection algorithm combining the K-Medoides clustering algorithm and index mechanism is designed to effectively protect the identity of neighbors and prevent KNN attacks. Besides, the Laplace mechanism is used to implement differential privacy protection for the model’s gradient descent process. Finally, the feasibility of the proposed recommendation model is verified through experiments, and the experimental results indicate that this model has advantages in recommendation accuracy and privacy protection.
Introduction
With the rise of social networks and travel web-sites, tourists are often bored with a large number of information searches and product choices. A tourism recommendation system can solve the problem of information overload effectively [1]. The recommendation system learns the interests and preferences of users through their historical browsing records. Then, it recommends items or item sets of potential interest to users and provides personalized services. However, the training of learning algorithms requires a large amount of data.
To provide users with more accurate services and make the recommendation services accepted by more users, the recommendation system needs to collect more user data, such as user browsing information, purchase information, rating information, and user attribute tags. However, these data include personal privacy that users do not want to disclose. Suppose a tourist is now traveling in Xi’an. First, the tourist booked a hotel around the spot through a group-buying website, the website platform gets tourist’s location and then recommends a restaurant to the user according to his consumption level and food preference, if the user chooses Sichuan cuisine dishes, the merchant and the platform can infer where the tourist comes from based on his consumption. If the data is leaked to a third party, it is straightforward to cause the privacy of the user’s preferences to be leaked. The defect of recommendation system without privacy protection can be used by attackers to accurately obtain users’ real privacy information, which brings hidden danger to users’ reputation and even life and property safety. It is contrary to the goal of developing recommendation system to bring users a better experience and is not conducive to the development of recommendation system. Therefore, it is essential for personalized travel recommendation algorithm research that user data remains secure and private while users enjoy personalized travel recommendation content.
At present, privacy-preserving methods for travel recommendation systems are mainly classified into three types: generalization, data perturbation, and encryption. The literatures [2, 3] use the k-anonymity method to hide user information with k similar users by generalizing the user’s information. However, k-anonymity does not strictly define the background knowledge held by the attacker. Thus, there is the problem of low security when facing new attacks. The literatures [4, 5] proposed data scrambling methods to protect user data by interfering with their historical data to a certain extent before sending them to the recommendation server. Although the data scrambling method is simple, it also suffers from the problem of insufficient protection capability. The literatures [6, 7] use homomorphic encryption methods to calculate the similarity of nearest neighbors in the collaborative filtering recommendation process. However, the homomorphic encryption algorithm also suffers from high computational complexity and low recommendation efficiency when applied in large-scale datasets. The above privacy-preserving methods are all single for user preferences privacy. The recommendation methods that consider user preferences privacy protection have involved little research. Our previous work proposed a distributed user privacy-preserving recommendation framework [8], but the approach did not consider that users’ preferences are shifted over time.
Based on user interest offset and differential privacy, a matrix factorization tourism point-of-interest recommendation model is proposed in this paper to ensure recommendation performance without leaking user privacy data. The model extracts user interest points from user tags and user ratings under time-series factors to calculate user interest offset. Meanwhile, a private neighbor selection method combining the K-Medoides clustering algorithm and the exponential mechanism is designed, which implements exponential mechanism protection for neighbor identities. Besides, the Laplace mechanism is exploited to gradient the model. The descent process and random noise together increase the safety of the recommended model. The contributions of this paper are summarized as follows:
An Interest Drift Matrix Factorization (ITMF) recommendation model is proposed. The user interest points are extracted from user tags and user ratings under time-series factors to calculate user interest drift. Also, the neighbors with similar interests are found to train the user feature matrix, which is incorporated into the matrix factorization model in the form of regular terms. Based on the differential privacy and interest shift, a matrix factorization recommendation model (Differential Privacy Interest Drift Matrix Factorization, DPITMF) is proposed. This model considers the identity of neighbors, and the recommendation process requires differential privacy protection. Meanwhile, a private neighbor selection method combining K-Medoides clustering algorithm and the exponential mechanism is designed to effectively resist KNN attacks, and the Laplace mechanism is exploited to protect the gradient descent process of the recommendation model. The privacy security of the proposed model is proven through theory, and the efficiency of the recommendation system is verified through experiments.
The rest of this paper is organized as follows. In Section 2, we introduce existing related work on travel recommendation and privacy protection method. In Section 3, we introduce background on Matrix factorization recommendation model and Differential privacy. Recommendation system model construction and optimization is detailed in Section 4. The experimental setup and results are presented in Section 5. Finally, the conclusions are presented in Section 6.
Related work
Travel recommendation
As an effective tool to deal with the problem of information overload, the personalized recommendation system has received extensive research and attention Kofler et al. [9] collected travel photos shared on the Flikr platform and recommend a photo of tourist attractions related to the destination set by the user. Moreno et al. [10] designed a personalized travel destination recommendation system, which used collaborative filtering technology to recommend similar attractions to users. Loh et al. [11] searched the tourism ontology knowledge base for destinations or tourist attractions with high similarity to the user’s preferences and recommended them to users. Levi et al. [12] used content-based recommendation technology to recommend potential hotels of interest to users and user satisfaction is obtained through questionnaire surveys. Historical travel data is important to the recommendation model, but the user¡¯s interest points will change over time. A recommendation model designed with excessive reliance on historical travel data without considering the time factor cannot meet the needs of personalized recommendation.
Privacy protection method
In practical applications, service providers make high-quality recommendations by mining user information, but the lack of privacy protection methods will cause user privacy leakage [13, 14] in the recommendation system. Making accurate predictions while protecting user privacy and security is a research hotspot. At present, domestic and foreign researchers have proposed a variety of solutions to the privacy protection problem in the recommendation system. The common privacy protection technologies include anonymization technology, encryption technology, and data disturbance technology. The anonymization technology protects user privacy by collecting user data and performing anonymization operations. Improper data anonymization will result in poor data practicability. The encryption technology is an encryption method based on the cryptographic algorithm. The method has very high security but suffers from a high cost of communication and calculation overhead. The data perturbation technology generally obfuscates the real data by add-ing random noise to the original distribution of the data. Based on the principle of data disturbance, many methods have realized privacy protection, where the random disturbance method and the differential privacy method are the most widely known. The latter is adopted by this paper for privacy protection. The concept of differential privacy was first proposed by Dwork et al. [15–17], and it is continuously improved. Under differential privacy protection, the attacker cannot infer real data even if he has the largest background knowledge. Differential privacy was first applied to recommendation systems by McSherry et al. [18]. The combination of the Laplace mechanism with collaborative filtering recommendation can effectively reduce the large recommendation performance loss caused by differential privacy. Zhu et al. [19] proposed to apply the differential privacy index mechanism to user-based collaborative filtering to strictly protect the identity of user neighbors. Berlioz et al. [20] proposed to apply differential privacy to the matrix factorization recommendation system to protect the entire process of matrix factorization. Xian et al. [21] proposed a collaborative filtering algorithm (DPSS++) based on differential privacy and SVD++. The three differential privacy mechanisms were designed in term of gradient perturbation, objective function perturbation, and output result perturbation. Based on this, the prediction accuracy is improved without leakage of user privacy. Yang et al. [22] proposed a framework that includes two perturbation methods to prevent the threat of inference attacks against users. All of the above privacy-preserving methods are only for user privacy. In contrast, the recommended methods for privacy preservation that take into account both user privacy and user preferences currently involve little research.
Preliminaries
Matrix factorization recommendation model
As for the collaborative filtering recommendation, the user-item data and the user’s item scoring data are stored in the user-item scoring matrix. It is assumed that m represents the number of users, and n represents the number of items; the user-item rating matrix R is a m * n matrix set that stores the ratings of all users for all items, and its mathematical expression is R = {r
ui
|1 ≤ u ≤ m, 1 ≤ i ≤ n}. The user project score matrix is defined as follows:
Because the dimension of the user-item rating matrix R is large and the matrix is extremely sparse, matrix factorization technology is introduced to improve the processing efficiency [23]. Specifically, matrix factorization uses items that have not been rated by users to make predictions or fill the original matrix to alleviate the impact of data sparseness on the recommendation accuracy [24]. As a result, matrix technology can be used in collaborative filtering recommendations, and matrix factorization algorithms have received extensive research and attention. Since matrix factorization reduces the matrix dimensionality and decomposes the original matrix into two submatrices, it is characterized by latent factors associated with users and items, which is also called the latent factor model (LFM). Subsequently, other improved recommendation models were proposed, such as Singular Value Decomposition (SVD) [25] and Nonnegative Matrix Factorization (NMF) [26]. These model-based collaborative filtering recommendation algorithms have been used in practical applications, and the recommendation accuracy is significantly better than that of the traditional user-item collaborative filtering. The matrix factorization process is shown in Fig. 1.

Schematic diagram of matrix factorization.
In the matrix factorization model, the original matrix R can be approximate by the inner product of the user eigenvector p
u
with a size of m * k and the item eigenvector q
i
with a size of n * k obtained by decomposition. Thus, the predicted value of user u for item i can be directly expressed as R, where k represents the dimensionality of the two feature vectors. To enhance the learning efficiency of the feature vectors (p
u
and q
i
) and improve the accuracy of the final prediction, the minimum regularization objective function L is used to reduce the error. Meanwhile, a penalty term is added to the target letter L to avoid over-fitting during training. The definition of the objective function L is as follows:
The theoretical basis of differential privacy
In the differential privacy theory, the sensitivity can measure the biggest change caused by the query function, and it can be exploited to control the amount of noise added. Too much noise affects data availability, while too little noise results in low privacy protection. Usually, global sensitivity and local sensitivity are defined.
A small global sensitivity of the query function, e.g., the sensitivity of the counting function is 1, can effectively ensure data security. However, if the query function has a large global sensitivity, it is necessary to add enough noise to the output function to ensure privacy and security, which results in poor data availability. As a result, the concept of local sensitivity was proposed.
Since the local sensitivity is determined by the query function f and the specific data in the data set, the local sensitivity is usually smaller than the global sensitivity. The equation describes the relationship between global and local sensitivities:
The principle of differential privacy combination consists of sequence combination principle and parallel combination principle.
Recommendation system model construction and optimization
The user’s interest bias is calculated from the user’s label data and the score under the time factor. Based on this, the personalized recommendation of tourist attractions based on the bias of user interest is realized. Fig. 2 illustrates a recommendation framework for tourism points of interest based on interest offset and differential privacy.

Tourist interest point recommendation framework based on interest shift and differential privacy. Firstly, the user’s original travel dataset is privacy preserved by adding noise, then the travel interest point recommendation model is constructed, and finally the recommendation results are fed back to the user.
The user’s tag data serves as a mark for the user’s behavior and habits. The case that a user has many tags or he is tagged with the same tag many times reflects the actual interest of the current user. Meanwhile, when a recommendation cold start occurs, the recommendation result can be given from the user’s interest, thus alleviating the data-sparse problem to a certain extent. As for the prediction of the user’s choice, the user will be more likely to be affected by his interest preferences. Knowing the user’s interest in each tag in advance is helpful for the recommendation system to make predictions from the user’s interest point of view. Therefore, the interest value of user i for label j (Ist
ij
) can be constructed as follows:
Finally, the score similarity under time drift Tsim(u, v) is combined with the similarity of user interest Lsim(u, v), and α is the balance factor. When there are few items rated by users, it helps alleviate the cold start problem by digging out the potential points of interest from the user tag information to assist in the prediction. The formula for calculating the deviation of user interest is as follows:
When a user makes a decision, the user is usually influenced by his friends around him. The friend who can influence user decisions is referred to as the user’s neighbor. The user’s neighbors usually have similar interests to the user’s interests. Therefore, the interest preference of the neighbors can be exploited by the matrix factorization model to train the potential feature matrix of the user and the item, thus improving the prediction accuracy. Specifically, k users with the highest interest offset are taken as the neighbors of the target user. They are integrated into the objective function in the form of user regular terms. In this case, the user’s feature matrix is always affected by the user’s neighbors, and the feature matrix is close to that of the neighbors based on the interest offset. Based on interest shift, the objective function of the matrix factorization model is defined as:
The personalized matrix factorization model is solved following the stochastic gradient descent method. The partial derivative of the objective function is calculated. Then, the latent factor vector is updated iteratively until the objective function converges to find the final parameter value. The process of finding partial derivatives is as follows:
The differential privacy protection is divided into two parts to construct a centralized differential privacy model. One uses an exponential mechanism to protect the identity of user neighbors; The other uses the Laplace mechanism to add noise to the gradient descent process of the matrix factorization model. The privacy budget of the two parts will be allocated at a ratio of 1:1 so that the whole privacy protection scheme meets differential privacy. The design of the DPITMF privacy protection model is illustrated in Fig. 3.

Tourist interest point recommendation framework based on interest shift and differential privacy.
In the ITMF model, the interest preference of the user’s neighbor is similar to that of the user. If the attacker pretends to be the target user’s neighbor, he can use his false interest to infer the real interest of the user. To protect the security of the user’s neighbor identity, a private neighbor selection method based on the K-Medoide clustering algorithm is designed. It is necessary to ensure that the user has high-quality neighbors, and these neighbor identities should be strictly protected by differential privacy. The detailed process of selecting a private neighbor is shown in Fig. 4.

The process of private neighbor selection. First, K-Medoide clustering is performed on the original tourism data to obtain the set of potential neighbors of the user, and the set of neighbors is randomly selected among the set of potential neighbors, and then the set of privacy neighbors is determined, which is the final set of neighbors.
Algorithm 1. K-Medoide clustering
Input: user project matrix R, number of clusters k
Output: user clustered collection V1, V2, …, V k
1: Randomly select one user as the first initial cluster center C1
2: for i ← 2 to k
3: for each u ∈ U
4: Calculate the distance S1, S2, …, Si-1 from the cluster center C1, C2, …, Ci-1 to the user u
5: Smin ← min(S1, S2, …, Si-1)
6: end for
7: Sampling by probability and use Roulette Selection to choose next center C i
8:end for /* Complete the calculation of the initial cluster centers (C1, C2, …, C k )*/
9: for each u ∈ U
10: Calculate the distance (S1, S2, …, S k ) from user u to k cluster centers
11: Smin ← min(S1, S2, …, Si-1)
12: divide(u,Smin) // Divide user u into k clusters according to the shortest distance
13: end for;
14:update(C1, C2, …, C k ,C1′, C2′, …, C k ′) /*
Accumulate the distances from each user to other users in the same cluster, and use the user with the smallest sum of distances as the new cluster center */
15:if (compare(C1, C2, …, C k ,C1′, C2′, …, C k ′)) /* Compare whether the cluster centers have changed */
16: end loop
17:else:
18: repeat 9∼14;
19: return V1, V2, …, V k ;
Algorithm 2. Privacy neighbor selection algorithm
Input: target user u, number of clusters k, privacy budget ∈
Output: Potential neighbor set N
1: Execute Algorithm 1, find out the cluster of target user u according to the clustering result C u
2: if length (C u ) > = 5N
3: find the 5N closest to the user u from the C u as the potential neighbors
4: else
5: add all to potential neighbors from the C u , the rest from the closest cluster
6: divide(5,pn1, pn2, . . . , pn5) /*Randomly divide the set of potential neighbors into 5 small sets */
7: for i 2 to 5
8: M← enumerate(N/5,pn i ) /* Enumerate all the possibilities of size N/5 from pn i and store them in M */
9: calculate Hsim(u, v) /* Calculate according to formula (13) */
10: for each N/5 to M /* Calculate according to formula (21)*/
11: calculate p(N, L)
12: end for
13: N← random_sample(N/5,p(N, L)) /*Random sampling generates the final neighbor set N*/
14: end for
15: return N
Step 1: K-Medoide clustering. The K-Medoide clustering algorithm is used for data preprocessing. The specific clustering steps are described in Algorithm 1, and the interest preference degree Hsim(u, v) proposed in Section 4.2 is used as the distance in the algorithm.
Step 2: Generation of a set of potential neighbors. In Step 1, the K-Medoide is combined the K-Means++ algorithm to cluster the user set. It is believed that 5 times the size of the neighbor set N is a reasonable size of the potential neighbor set. Therefore, if the number of users in the target cluster is greater than or equal to 5N, the closest 5N users are selected as potential neighbors; otherwise, all users in the target cluster are selected as the set of potential neighbors. The neighboring clusters are searched, and the set of potential neighbors with the difference is filled according to the distance.
Step 3: Random enumeration of neighbor selection. The neighbor enumeration mechanism is adopted to enumerate the probability set M of all neighbors. Then, the exponential mechanism is exploited to select the neighbor set N. To take into account both security and recommendation performance, the set of potential neighbors is randomly divided into five parts to enumerate all possible results of M in the set of size N.
Step 4: Private neighbor selection under the index mechanism. Since the enumeration is divided into five parts in step 3, the private neighbor selection is also performed five times under the exponential mechanism. Finally, the five subsets with the size of N/5 are merged to obtain a final neighbor set with the size of N. Meanwhile, the privacy budget is also divided into five parts, and the privacy budget selected each time is ∈/2/5. The utility function of the exponential mechanism is designed as follows: Assuming that the target user u is located in the cluster C
u
, and the neighbor set of the target user u N ⊆ L, the utility function is defined as follows:
According to the definition of the exponential mechanism, the probability of outputting object N as a neighbor should be proportional to
As for the privacy neighbor selection of the exponential mechanism, the probability distribution of all cases in L is calculated through Equation (21). Following the probability distribution, a set of neighbors are randomly sampled as the neighbor set N. The complete description of this process is shown in Algorithm 2.
Matrix factorization models usually use SGD for parameter learning and score prediction. However, the gradient descent process is not safe, because the attacker can infer the user feature matrix through the regression function. Therefore, the ITMF model proposed in this section will use the gradient perturbation method and add random noise based on the Laplace mechanism to the gradient descent to achieve differential privacy protection. Suppose that gradient descent requires k iterations, and k is a parameter preset by the algorithm. In this case, the privacy budget in each iteration is ɛ/2k. Since noise is added to each gradient descent iteration, a scoring error is set to limit the excessive influence of the noise. Meanwhile, the local sensitivity Δr is calculated through the difference between the maximum score and the minimum score.
Security analysis
In this section, it will be proved that the DPMFIT model proposed in this chapter satisfies the ɛ/2-differential privacy. The privacy budget in the differential privacy protection model is divided into two parts, i.e., the part using exponential mechanism and Laplace mechanism. It will be proved that each part meets the corresponding differential privacy protection requirement.
According to the differential privacy property (Property 1), if each neighbor selection satisfies the differential privacy, then the combined algorithm composed of private neighbor selection will still satisfy differential privacy. Therefore, according to the definition of exponential mechanism (Definition 5), the probability of arbitrarily outputting N in each private neighbor selection is as follows. In fact, N/5 is randomly output every time a private neighbor is selected. For a convenient expression, it is represented by N, and the privacy budget is the same, which is represented by ∈/2.
According to the differential privacy Laplace mechanism, this step always meets the differential privacy protection of ɛ/2k. Since the algorithm will converge after K iterations, it can be known from the sequence combination of differential privacy that the matrix factorization process meets the ɛ/2-differential privacy. Since Theorem 1 and Theorem 2 all satisfy ɛ/2-differential privacy, from the sequence combination of differential privacy, it can be known that the matrix factorization process satisfies ɛ-differential privacy.
Algorithm 3. gradient perturbation algorithm
Input: user project rating matrix R, number of iterations of SGD k, privacy budget ɛ
Output: latent feature matrix p u and q i
1: Randomly initialize user and item feature matrices p u and q i
2: for each iterations k
3: for each r ui ∈ R
4: calculate L min (R, p, q) /* Calculate the objective function according to formula (14)*/
5: Δr = rmax - rmin
6: Lmin(R, p, q) ′ = Lmin(R, p, q) + Lap(kΔr/2ɛ)
7: update(p u ) /* Update the user characteristic matrix according to formula (17) */
8: update(q i ) /* Update the project feature matrix according to formula (18)*/
9: end for
10: end for
11: return p u and q i
Experimental data set
The dataset used in this paper was captured from Google reviews which present comments on tourist attractions of 24 categories in Europe (Referred to as TravelRating). Google user ratings range from 1 to 5, and the average user rating for each category is calculated. This data set is widely used by recommendation system learning and research. The data set contains 5456 ratings, including 943 users’ ratings on 24 items. The score is an integer ranging from 1 to 5. The larger the value, the more the user likes it.
Experimental parameter settings
The experiment is conducted on a computer equipped with Intel Core2 Quad CPU (Q9500, 2.83 GHz) and 4 GB memory, and the computer runs Windows 7 operating system. All algorithms are programmed in C# language on the Visual Studio 2015 platform. The experiment uses a cross-validation method to divide the data set into 10 groups, and the ratio of the training set to the test set is 8:2. Each time a group of data is randomly selected, and no data is fixed for training or testing only. The experimental results in this paper are obtained under the optimal parameters of each algorithm. Specifically, the learning efficiency γ is set to 0.01; the regularization factor λ is set to 0.1; the number of iterations is set to 50. The experiment uses root mean square error (RMSE) to evaluate the recommendation accuracy and judges the quality of recommendation by calculating the deviation between the predicted score and the true score. The closer the value of RMSE is to 0, the higher the accuracy. The value of RMSE is calculated as follows:
When the predicted score is completely consistent with the true score, the mean absolute error (MAE) is equal to 0, indicating a perfect model; the greater the error, the greater the value. The value of MAE is calculated as follows:
Since two recommendation models, ITMF and DPITMF, are proposed in this paper, the experiment investigates the recommended parameter settings in the ITMF model and the privacy parameter settings in the DPITMF model. Then, the algorithm model is compared and analyzed.
ITMF model performance analysis
Sensitivity analysis is performed on the three key parameters in the ITMF model in advance, including the balance factor between the user interest value similarity and the user similarity under the time factor, the learning factor of the interest regular term in the model, and the number of user neighbors k. Since the values of the three parameters have an impact on the model optimization, the controlled variable method is exploited to fix the other two parameters while the effect of changing one parameter on the model is investigated. The basic matrix factorization algorithm (Basic MF) and the classic SVD++ model that incorporates implicit user feedback [25] are taken for comparison.
(1) The impact of the balance factor α on the recommendation accuracy of the ITMF model To represent the weight of user similarity under the time factor, the similarity ratio α is changed from 0 to 1. The results obtained on the TravelRating data set are shown in Fig. 5. It can be seen that the ITMF model achieves a more significant improvement of recommendation performance than the classic algorithm. Meanwhile, the values of RMSE and MAE fluctuate with the increase of the balance factor α. When the value of α is 0.6, both the values of RMSE and MAE are the lowest, indicating the best recommendation accuracy. At this time, the user similarity under the time factor accounts for 0.6, and the user interest accounts for 0.4.

The impact of the balance factor α on the performance of the ITMF model.
(2) The influence of the learning factor β of the interest regular term on the recommendation accuracy of the ITMF model
The impact of the learning factor β on the performance of the ITMF model is illustrated in Fig. 6. The experimental results show that when the learning factor β is 0.01, both the values of RMSE and MAE are the smallest, and the recommendation accuracy of the ITMF model is the best.

The impact of learning factors β on the performance of ITMF models.
(3) The influence of the number of neighbors k of the target user on the recommendation accuracy of the ITMF model
It can be seen from Fig. 7 that learning the interest and preference of the neighbors of the target user contributes to a more accurate prediction of the target user. When the number of neighbors increases from 1 to 3, both the values of RMSE and MAE decrease significantly, indicating an obvious improvement effect. Besides, the values of RMSE and MAE fluctuate when the number of neighbors changes from 4 to 10. However, the overall changing trend is not downward, indicating that the increase of the number of neighbors gradually stabilizes the improvement of prediction accuracy. Therefore, the number of neighbors with the lowest RMSE in the range of 4 to 10 is taken as the best number of neighbors to facilitate subsequent experiments.

The impact of the number of neighbors k on the performance of the ITMF model.
The DPITMF model has the same user interest offset part as the ITMF model. So, the optimal parameters of the ITMF model described in Section 5.3.1 can be directly used for privacy analysis of the DPITMF model, and the impact of privacy budget on the recommendation results is investigated. The DPSS++ algorithm[16] is taken for comparison, which is a differential privacy protection algorithm based on SVD++ and gradient perturbation.
(1) The impact of the privacy budget ɛ on the privacy protection of the DPITMF model
Figure 8 shows the comparison results on the TravelRating data set. For a low privacy budget, the recommendation performance of the DPITMF model is worse than that of the classic DPSS++ algorithm. The DPITMF model does not spend the entire privacy budget ɛ on gradient perturbation: half of the budget for implicit neighbor identity protection, and the other half for gradient perturbation. The DPITMF model uses only half of the privacy budget ɛ throughout the experiment. The smaller the privacy budget, the more random noise; the stronger the privacy effect, the less data availability. However, with more random noise, the DPITMF model can have the same privacy budget ɛ as the DPSS++ algorithm but achieve a greater improvement in recommendation accuracy. The DPITMF model incorporates user interest offset information to help the model express user characteristics more accurately. Preferences make recommendations more accurate. It can be seen from Fig. 8 that when the value of the privacy budget ɛ reaches 6, the recommendation performance begins to stabilize. Although a larger privacy budget ɛ contributes to a better recommendation effect, the noise introduced will be smaller and privacy protection will be less sufficient. Therefore, considering comprehensive recommendation performance and privacy protection effect, the privacy budget ɛ will be set to 6 in subsequent experiments.

Privacy budgets privacy protection impact on the DPITMF model.
To better demonstrate the recommendation performance of the recommendation model, the ITMF and DPITMF models are compared with the classic SVD++[25] and DPSS++[21] algorithms. The ITMF and DPITMF models use the optimal model parameters determined in sections 5.3.1 and 5.3.2, respectively. As shown in Fig. 9, the abscissa represents the dimension of the feature vector, and the impact of the dimension on the recommendation performance is observed by changing the dimension value.

Recommendation performance comparison of recommendation models.
It can be seen from Fig. 9 that the ITMF proposed in this paper has obvious advantages in recommendation accuracy, and both the values of RMSE and MAE are significantly reduced, indicating that the fusion of user interest offset helps improve the prediction accuracy of the recommendation model. The DPITMF model also performs better than the DPSS++ algorithm in terms of the values of RMSE and MAE. Although the addition of privacy protection has a certain impact on the recommendation performance, the integration of user interest offset alleviates the loss of recommendation accuracy to a certain extent, making it possible to ensure user privacy. There is a reasonable balance between privacy protection and recommendation quality.
Aiming at the user privacy protection requirements of the recommendation systems and the problem that privacy protection technology affects the recommendation performance, a matrix factorization recommendation algorithm based on user interest offset and differential privacy is proposed in this paper. The user interest preferences are extracted from user tags and user ratings under time-series factors, and similar neighbors are exploited to train the user’s feature preferences which are then integrated into the matrix model in the form of regular items. Meanwhile, based on the differential privacy theory, a private neighbor selection method combining K-Medoides clustering algorithm and the exponential mechanism is designed to improve the accuracy of recommendation and satisfy the privacy protection requirements. Besides, the Laplace mechanism is used to protect the gradient descent process of the model and ensure the safety of the recommended model. Finally, the feasibility of the proposed privacy protection scheme is verified by experiments. The future work will further mine user data in social networks, such as the social relationship between users, the influence of authoritative users on ordinary users, etc., to improve the performance of the recommendation system.
Acknowledgments
This research was supported by the Natural Science Foundation of China (No. 61972439), Natural Science Foundation of Anhui Province (No. 1808085MF172) and Key Program in the Youth Elite Support Plan in Universities of Anhui Province (gxyqZD2019010).
