Abstract
Nowadays, we are living in an information overload age. A tremendous amount of information has been produced on the Internet, how to find the interesting information is the main goal of recommendation system research. However, the most of current traditional recommendation algorithms (such as Collaborative filtering) are suffering from flowing difficulties: (i) The traditional recommendation system assume that users are independent and identically distributed; this assumption fails to consider the social relation and connection between users, which is not consistent with the social relations in our real world. (ii) Although there are some recommendation system research began to focus on the trust relationship between users, trust information is also very sparse. This leads to most of datasets only contains very little information about the user’s relationship. In this paper, we propose an innovative method that integrated users’ trust propagation and singular value decomposition into recommendation Algorithm to improve the quality of the recommendation effectively and efficiently. We performed our experiments on two real data sets respectively, the public domain Epinions.com and Filemtrust.com. The experimental results show that our method has a better outperform.
Introduction
Nowadays, with the rapid development of web 2.0, we have entered a data explosion age. The Internet is filled with the rapid growing amount of information, thus how to find the useful information from a mass of Internet information for users have become increasingly difficult problem in the Recommendation System research, and also become one of the hot spots. Recommendation System is based on information filtering technology that try to suggest some items (movies, music, news, webpage, friends, books, etc.) that the users are interesting [1–4, 23].
Recommendation Algorithm has been widely deployed in E-commerce industry such as Amazon and EBay [1, 5]. However, the most of recommendation technology are suffering from several weaknesses. The first well known challenge is that most of traditional recommendation algorithms assume that users are independent and identically distributed. This fail to consider the social relation and connection between users, which is not consistent with the social relations in our real world [2]. According to the Social Selection Theory and the Social Influence, people are easy to contact people who are similar to themselves. Millions of people influence each other implicitly or explicitly through online social network services and then become closer gradually. At the same time, with the popularity of social network (Facebook, Twitter, micro-blog, etc), the relation between users is become one of the most important impact to improve the quality of the Recommendation System [6–10]. Secondly, in order to consider the social information into Recommendation System more and more recommendation system begins to incorporating the social information into recommendation algorithm. However most of datasets only contains very little information about the user’s relationship, and this is the other one challenge of the Recommendation System.
In this paper, aiming at the above questions and challenges, we propose an innovative method that is integrated users’ trust propagation and singular value decomposition (in this paper we called SoTpSVD) into recommendation Algorithm to improve the quality of the recommendation system effectively and efficiently. The main contributions of this paper are summarized as follows: (1) we proposed trust propagation rule about trust propagation that using direct trust relationship of users on the social network to infer the indirect trust relationship between users; (2) we propose an innovative method that is integrate users’ trust propagation and singular value decomposition into recommendation Algorithm to improve the quality of the recommendation effectively and efficiently.
The rest of this paper is organized as follows: we introduced several approaches about social recommendation algorithm that associated with this article in Section 2. We introduced some related definition and trust propagation rules of the social networks in Section 3. In Section 4, we present our method to integrate users’ trust propagation and singular value decomposition into recommendation Algorithm. We also made a comparison between the method proposed in this paper and the existing methods through several sets of experiments. And then further analysis the influence of the experimental on the experimental results of SoTPSVD method in Section 5. In the last section of the paper we summarize our work, and introduce the next research direction.
Related work
In this section, we mainly review several approaches to recommendation system that associated with this article, including: (1) traditional recommendation technology which are mainly based on collaborative filtering techniques. (2) Social trust-based recommender systems [21, 22] which have drawn lots attention recently.
The traditional collaborative filtering [11, 12] is one of the relatively mature recommendation techniques in the traditional recommendation system [6, 17]. Collaborative filtering recommendation technology has been widely used, mainly due to its independence from the contents of the recommended object, relying only on the user’s ratings or preferences can be recommended for the user. At the same time, it can also help users find the potential interest. These advantages make it almost to all of the fields. Most of popular approaches to collaborate filtering are based on low-dimensional factor models. Aiming at many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings, Salskhutdinov et al. [13] present a Probabilistic Matrix Factorization(PMF) model. In the recommendation process, this method predicts the user’s rating on unknown item only by the existing information on the user-item rating matrix. Koren et al. [12] proposed a matrix factorization technique, that is, Singular Value Decomposition (SVD). SVD reduces the sparse problem of user-item matrix by using matrix decomposition technique. This method improves the recommendation quality and the system’s expansibility. Koren et al. [14] are also proposed SVD++. This algorithm is an improved algorithm based on the Singular Value Decomposition to introduce the implicit feedback, and using the user’s historical view of the data, the user history score data, the view history data of the movie, and the film’s historical score data as new the parameter. However, those recommendation systems assume that users are independent and identically distributed. This assume fail to consider the social relation and connection between users, which is not consistent with the social relations in our real world.
In recent years, social networks have been developed rapidly [18, 19], and the user is no longer the simple information receiver, but they can establish the relationship between the users actively. Baozhou Lu et al suggests the importance of trusted relational governance in governing online microsourcing transactions [24]. P. Victor et al believed trust and distrust are two increasingly important metrics in social networks, reflecting users’ attitudes and relationships towards each other [27]. Mehrdad Ashtiani et al. proposed a trust model based on hesitant fuzzy multi-criteria decision making [25]. C. Martinez-Cruz et al. provided a method to aggregate the trust information captured in the trust-ontology and to update the user profiles based on the feedback [26]. Thus, we can see all those above paper are all believed that trust information between users is an important factor not only on the field of Social network but on the other field. Most of the social recommendation algorithms are also currently focus on how to use the social relations among users to improve the accuracy of the recommendation algorithm [17–19]. Ma et al. [2] proposed a social recommendation method based on probabilistic matrix factorization. They connect the social network structure and the user-item rating matrix through the shared user latent feature space. The experimental result shows that their method performs better than the state-of-the-art approaches, especially in the circumstance that users have made few or no ratings. From the perspective of solving the sparse data, Ma et al. [3] proposed a probabilistic factor analysis framework which combines the users’ tastes and their trusted friends’ favors together by an ensemble parameter. The complexity analysis indicates that their approach can be applied to large dataset and their experimental results also indicates their method perform better than the state-of-the-art approaches. Yang et al. [8] proposed a hybrid method that combines both a truster model and a trustee model from the perspectives of trusters and trustees, that is, both the users who trust the active user and those who are trusted by the user will influence the user’s ratings on unknown items. All of these methods have shown that users’ social relations are helpful for improving the accuracy of the recommender system. However most of datasets only contains very little information about the user’s relationship [15], and we also noted that all of these above methods are ignore the trust propagation. In the real world, the trust relations between users can be transmitted. Thus, in this paper, we propose an innovative method that is integrated users’ trust propagation and singular value decomposition into recommendation Algorithm to improve the quality of the recommendation effectively and efficiently.
Trust propagation rule
Related definition
Figure 1 is a toy example of social network, where each node represents a user, and each side represents a social trust relationship.
In social networks, each user u has a group of direct neighbors N u , and the trust value in the trust matrix T = [tu,v] N×N. Which the each entry tu,v denotes the existence of a social relation from u to v, the value is 0 or 1. Zero means no trust and one means full trust. Note that, T is asymmetric in general [9]. Because if the user u trust the user v does not mean that the user v is also trust the user u.
According to the theory of Six Degrees of Separation and the Four degrees of separation theory, all of the node in a trust network need up to five steps to search for any other node. Therefore, the step of the trust propagation should be set as 2 ≤ l ≤ 6, which greater than the propagation path l will not add into the relationship of nodes.
Trust propagation rule
Main idea of algorithm: The transitivity of the trust relationship is an important feature of the trust network. In practical application, every user can not have direct trust relationship with all other users in the network. In order to obtain some information about the relationship between the unknown users, we use the transitive property of trust. That is based on the direct trust between users to figure out the indirect trust between users.
According to the analysis proposed above, we give a description of the trust propagation algorithm as follows:
1:
2:
3:
4:
5:
6: add t uv = 1 to T u
7: add v to U1
8: delete tr uv from TR
9:
10: = Scale(t uv ) where Scale(x)arctan x
11:
12:
13:
14:
15:
16:
17:
18: update T u
19: add w to Ui+1
20: delete tr vw from TR
21:
22:
23:
24:
25:
Complexity analysis: the main time cost of algorithm 1 is computing the trust degree between the user and other users in his community (community means within the scope of the trust propagation step l). This process is actually breadth first traversal the user trust relationship graph, the worst-case time complexity for O (n2), where n is the number of users. In this paper, we introduced the trust propagation step l, and according to the six degrees of separation theory all nodes in the trust network can find other nodes through five intermediate nodes, so the length of the trust propagation should be set to 2 ≤ L ≤ 6. Those more than the trust propagation step l node will not be trusted so will not be calculated. Thus, in the real world the worst-case time complexity is not existing,the complexity of the trust algorithm is far less than O (n2).
The SOTPSVD model
The anatomy of the recommendation model based SVD
In this paper, we believe that the main task of the recommender system is to predict the user’s score for an unknown item based on user-item rating matrix and user-user trust matrix. SVD++ algorithm[14] is an improved algorithm based on the SVD (Singular Value Decomposition) [5] to introduce the implicit feedback, and using the user’s historical view of the data, the user history score data, the view history data of the movie, and the film’s historical score data as new the parameter.
In this paper, suppose we have m users, n items. Item usually refer to retrieval object like music, book etc. User’s rating data of item is stored in rating matrix R = [ru,j] m×n, where the each entry ru,j indicates the rating given by user u on item j. ru,j is an integral figure from 1 to 5, which indicates user’s interest in merchandise items. The bigger the figure is, the more one user loves one item. We let I
u
means the rating set of user u, p
u
means the d dimensional latent feature vector of the user u, then P ∈ Pd×m denotes the user-feature matrix and q
j
means the d dimensional latent feature vector of the item j, then Q ∈ Rd×n denotes the item-feature matrix. Thus R ≈ P
T
Q where P
T
denotes the transpose of P. Hence, the rating given by user u on item j can be predicted by the inner product of the user-specific vector p
u
and the item-specific vector q
j
, that is . In this regard, the main task of the recommendation system is to predict the value of as close as possible to the actual score ru,j. Formally, we can learn the user-feature matrix and the item-feature matrix by minimizing the following loss function:
The rational behind SVD++ refers to a matrix factorization model which makes use of implicit feedback information. In general, implicit feedback can refer to any kinds of users’ history information that can help indicate users’ preference. The SVD++ model is formally described as following equation [14]:
Suppose we have a directed trust network graph G = (ν, ɛ), where the node set denotes all users of social networks, the edge set ɛ denotes the trust relationship between users. Let T = [tu,v] m×m denotes the m × m matrix of the G,also known as social network matrix. Where the each entry tu,v denotes the extent to which user u trusts user v. Let p
u
and w
v
denotes the d-dimension truster feature vector and d-dimension trustee feature vector respectively, thus, we let Pd×m and Wd×m denote the truster-feature matrix and trustee-feature matrix respectively. In order to make the user trust matrix and the user-item rating matrix have a good bridge, we make them use the same user feature vector. Let T ≈ P
T
W to recovering the social network matrix. In this regard, same user ratings, the main task of the recommendation system is to predict the value of as close as possible to the actual trust score tu,j. Formally, we can learn the truster-feature matrix and the trustee-feature matrix by minimizing the following loss function:
Where T u denotes the trust degree set of the user u.
In this paper, we emphasize the influence of trust relationship; hence we have joined the trust relationship on the basis of the SVD++ model in the same way. Formally, the new rating for user u on item j is predicted by:
To sum up, after we have joined the trust relationship we can learn the user-feature matrix and the item-feature matrix by minimizing the following loss function:
Where U j and U i are the set of users who rate item j and i, respectively; and is the set of users who trust v.
Aiming at the problems and challenges of the existing recommendation algorithms, we proposed a new recommendation method (we called SoTPSVD).This method integrated the user’s social network information based on the existing improved singular value decomposition (SVD++) mode in order to improve the efficiency of recommendation system. At the same time, this paper also puts forward a trust propagation algorithm (TPA) used to fill user trust matrix to solve social network matrix sparsity problem.
The basic framework of the SoTPSVD recommendation method is shown in Fig. 2, the main input parameters of SoTPSVD including the user trust relation matrix TR,user rating matrix R,trust propagation step l and the attenuation coefficient p. The main step of this method including calculation of trust degree, prediction rating, generation the recommended list.
1:
2:
3: calculate the trust degree of user u after trust propagation T u = TP (u, TR, l, p).
4: For each j of the user’s u not rating items
5: predict the rating
6:
Complexity analysis
The main time cost of the SoTPSVD is to calculate the object function L and the degradation process of various variables. The time complexity for computering the object function is O (d|R| + d|T|), where, d means the d dimension feature vector. The time complexity for calculating the gradient descent is O (|R| + |T|). In addition, the time complexity for computing trust propagation (TPA) has been given in the above chapters. Due to the sparsity of the trust matrix and the constraint of the trust propagation step l, the time complexity is far less than O (n2). Thus the total time complexity of the SoTPSVD recommendation method is linear with the number of the observable entities in the user-item matrix and the user trust matrix. Therefore, our recommendation method SoTPSVD is effective and feasible.
Experimental analyses
Description of the dataset
In this paper, our experiments use the dataset stem from the entire (trust.mindswap.org/FilmTrust/) website in June, 2011. FilmTrust is also a movie sharing and rating website [11]. The FilmTrust dataset contains 35497 rating data derived from 1508 user ratings for 2071 items. The rating data range from 0.5 to 4.0, and the interval was 0.5. In addition, FilmTrust also contains 1853 trust ratings that are issued by 1642 users. Where, the sparsity of the rating is:
Our experiments also use the dataset collected from Epinions website by Perl using a crawler [15]. Epinions is the consumer’s point of view and opinion website. In this site, the user can express their opinion about the item (such as, car, number, film, software, etc.) by assigning numerical ratings and writing text reviews and the rating range from 1 to 5. Users can also determine whether the views of other users are consistent with their views to determine whether the user is trustworthy (to the trust list) or untrustworthy (to the distrust list). The Epinions dataset contains 40163 users who rated a total of, 139,738 different items at least once, writing, 664,824 reviews and 487,181 issued trust statements. Where, the sparsity of the rating is:
In this paper, we have used two kinds of classical metrics to evaluate the accuracy of prediction, that is, the mean absolute error(MAE) and the root mean square error(RMSE). The definition of MAE is as follows:
The definition of RMSE is as follow:
In this paper, we proposed trust propagation algorithm, where the trust degree is a constant decay with the increase of the length of the path. Thus, We adopt a trust attenuation coefficient p to adjust the speed of attenuation speed of trust propagation. where, the value of trust attenuation coefficient 0 < p < 0.9. Through larger experiment, we estimated a better value of attenuation coefficient by setting different attenuation coefficient. We set the attenuation coefficient p is 0.1, 0.2, 0.3, ⋯, 0.9 respectively, and the data test is FilmTrust and Epinions, runing many times trust propagation algorithm, the MAE (mean absolute error) and the RMSE (root mean square error) of the nine times are compared. The results are shown in Figs. 3 and 4. From the result of the experiment, we found that when trust attenuation coefficient p = 0.9 the value of MAE and RMSE is least and the most suitable for our experiment. Thus, we set trust attenuation coefficient p = 0.9 in ourexperiment.
From the result of Figs. 3 and 4, we can observe that when p is 0.9 the value of MAE and RMSE is minimum. Thus, we set the attenuation coefficient p is 0.9 in the other experiment in our paper.
Experimental results and analysis
In this section, in order to evaluate the performance of the algorithm, we compare the proposed algorithm in this paper with the following algorithms: PMF: this method is a Probabilistic matrix factorization technology proposed by Salskhutdinov et al. in the literature [13]. In the recommendation process, this method predicts the user’s rating on unknown item only by the existing information on the user-item rating matrix. RSTE: from the perspective of solving the sparse data, Ma et al. [3] proposed a probabilistic factor analysis framework which fuses the users’ tastes and their trusted friends’ favors together by an ensemble parameter. SocialMF: Jamali and Ester in literature [17] build a new model (SocialMF) on the top of the SoRec by reformulating the contributions of trusted users to the formation of the active user’s user-specific vector rather than to the predictions of items. TrustMF: Bo Yang et al. in literature [8] build a new model (TrustMF) which integrating twoflod sparse information, the rating data given by users and the social trust network among the same users.
We use 10 folder cross validation to learning and testing. The 10 folder cross validation means divide each data into 10 subsets randomly and each subset was treated as test set in turns, then the rest set as traning set. The cross validation repeat ten times and each time select one of the subsets as test set. Finally, the average test performance as the final result. In other words, In the experiment we randomly select the 90% FilmTrust dataset or Epinions dataset as the training set. 90% means we randomly select 90% of the rating from FilmTrust or Epinions as the training data to predict the remaining 10% of ratings. The model of the recommendation algorithm is learning and adjusting the parameters on the training set. The experimental results using 5 and 10 dimensions to represent the latent feature are shown in Table 1. The optimal parameters of the experiment are determined by our experiments or suggested by previous works. The setting are: PMF: λ
u
= λ
i
= λ
b
= 0.01; RSTE: λ
u
= λ
i
= λ
b
= 0.001, λ
t
= 0.001, α = 0.4 for Epinions and α = 1.0 for FilmTrust; SocialMF: λ
u
= λ
i
= λ
b
= 0.001, λ
t
= 1; TrustMF: λ
u
= λ
i
= λ
b
= 0.001, λ
t
= 1; SoTPSVD: λ
u
= λ
i
= λ
b
= 0.01, λ
t
= 0.01 for FilmTrust; λ
u
= λ
i
= λ
b
= 0.003, λ
t
= 0.002.
We compare the performance of various recommendation models using Filmrust and Epinions dataset respectively. Tables 1 and 2 shows the results of normal view (where the normal view means all rating are used as test set). From the data of the Tables 1 and 2 we can observe that both MAE and RMSE of SocialMF, TrustMF, and RSTE are all lower than those of PMF. This shows that modeling trust relationship in the recommendation system is beneficial to improve the quality of recommendation. Our method consistently outperform than other approaches in all the setting of both databases (Where, *indicates the best performance among all other method). Overall, when we set dimensionality = 5, SoTPSVD improves the performance (MAE/RMSE) as high as 1.28%, 0.12%, 1.82%, 1.79%, in contrast to the best performance among all the other methods (where we marked *); when we set dimensionality = 10, SoTPSVD improves the performance (MAE/RMSE) as high as 1.76%, 1.98%, 3.28%, 1.60%, in contrast to the best performance among all the other methods (where we marked *). At the same time, our method is also outperform than the TrustMF, which similar to our method. This demonstrates that we proposed a novel recommendation algorithm incorporating elaborately processed trust relationship and Singular Value Decomposition is realistic and reasonable.
In order to validate the effectiveness of trust-based algorithms, Tables 3 and 4 evaluate MAE and RMSE of we proposed method (SoTPSVD) and those comparison method (ie. PMF, SocialMF and RSTE) in terms of cold users and unpopular items. Form the data of the Tables 3 and 4 we can observe that we proposed method is equally effective in terms of cold users and unpopular items. Under the same experiment setting, we proposed method SoTPSVD is lower than other recommendation method. Overall, in terms of cold users and unpopular items,when we set dimensionality = 5, SoTPSVD improves the performance (MAE/RMSE) as high as 0.877%, 1.153%, -2.04%, 1.79%, in contrast to the best performance among all the other methods (where we marked *); when we set dimensionality = 5, SoTPSVD improves the performance (MAE/RMSE) as high as -2.16%, 1.828%, -3.97%, 2.329%, in contrast to the best performance among all the other methods (where we marked *).
These experiments clearly demonstrate that in this Epinions dataset and FilmTrust dataset, using trust propagation information can overfitting problem, and that the predictive accuracy can be improved by incorporating the trust propagation information and singular value decomposition.
Conclusions and future work
With the rise of online social networks, the recommendation system has attracted more and more attention as an important branch of the research field of personalized service. In this paper, aiming at the weakness that traditional recommendation system fail to take into account the relationship between the users and the user’s trust propagation, we proposed a novel recommendation method that is incorporating elaborately processed trust relationship and Singular Value Decomposition. we proposed a rule about trust propagation that using direct trust relationship of users on the social network to infer the indirect trust relationship between users; And we also proposed a social recommendation method that integrated trust propagation and Singular Value Decomposition into recommendation system. The experimental results shows that the trust propagation in social network is also play a role int for improving the quality of the recommended.
In this paper, in order to highlight the importance of trust propagation, we only consider the trust information and fail to consider the distrust information in the trust network. However, the distrust information is also important information that help user express the degree of the trust in the trust network. In the further work, we will consider how to build a trust propagation model containing distrust information.
Footnotes
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61363045); The Key Project of Yunnan Nature Science Foundation (No. 2013FA130); Science and technology innovation talents fund projects of Ministry of Science and Technology (No. 2014HE001).
