Abstract
Recommendation systems provide reliable and relevant recommendations to users and help build users' trust in a website. They do so by drawing on opinions derived from the reviews, feedback and preferences that users provide when they purchase or view a product through social networks. Integrating social network interactions with recommendation systems makes it possible to capture the behavior of users and of their friends. The techniques used so far for recommendation systems are traditional ones based on collaborative filtering and content-based filtering. This paper presents a novel approach called User-Opinion-Rating (UOR) for building recommendation systems that takes user-generated opinions over social networks as a dimension. Two tripartite graphs, User-Item-Rating and User-Item-Opinion, are constructed from users' opinions on items along with their ratings. The proposed approach quantifies the opinions of users, and the results obtained indicate its feasibility.
Introduction
A recommendation system with many items produces a wide variety of personalized behaviors for users. This affects applications and users in a wide variety of domains such as movies, shopping, music, research articles and videos [1]. For instance, firms such as Amazon and Netflix provide recommendations to a user based on the user's activity on their networks, in order to promote users' affinity towards these networks and to increase sales profit. Further, they reduce the cost of searching for and selecting items among the large variety offered online. Most of these systems suffer from the cold-start problem and data sparsity [9]. In addition, a user's choice or decision is strongly affected by friends' opinions. Social Recommendation Systems (SRSs) model the social behavior of users and recommend products or items based on user behavior over social networks. Such systems act on the social relations among users over social networks and use them to support recommendations.
In the literature, social information or opinions generated by users have been captured for providing recommendations (e.g., [1, 5]). A limited number of methods use the ratings given by users to derive recommendations [6]. These ratings are normally visualized as a number of stars against the product. A rating gives a user a quick impression of a product, but it seldom guarantees the quality and usability of the product. Reviews or opinions are expressed in a much wider range of forms and can be considerably more informative. Analyzing a review usually takes more time, but reviews have the potential to capture accurate experiences and more social data [22]. A review can be combined with a rating, or can stand alone as a social network post, blog entry, report, or even a video [4].
In this paper, we present a new approach, User-Opinion-Rating (UOR), for SRS using collaborative filtering [10] and community detection techniques [23]. It includes user opinions along with user ratings to arrive at a better recommendation system. User opinions on products are analyzed using natural language processing techniques, and recommendations are predicted by combining ratings with user opinions. In the proposed approach, we construct two tripartite graphs, user-item-rating and user-item-opinion, and apply cosine similarity [5] to find the closeness of items. Prediction matrices, which help in determining the recommendations made to users, are constructed using the tripartite graphs. We also construct a User-User social graph and derive communities using the Louvain community detection algorithm [23]. The main contributions of this work are:
We propose an SRS that considers users' opinions on items, and construct two tripartite graphs: (a) user-item-rating and (b) user-item-opinion.
We derive two types of prediction matrices (based on ratings and on opinions) using the User-User social graph and the two tripartite graphs.
We perform experiments on the Movielens, Amazon and email-Eu-core datasets, and predict top-k recommendations for existing users and cold-start users.
The rest of the paper is organized as follows. Section 2 describes the background techniques associated with the approach and the related work. Section 3 presents the methodology for SRS. Section 4 presents the results and Section 5 concludes the paper.
The major challenges in traditional recommendation approaches are the following [5, 17, 20]:
Sparsity problem: This arises when most users do not rate an item, which results in sparse information about the item and degrades the quality of recommendations. We address the sparsity issue by dividing the whole data into communities and producing top-k recommendations from the community that the user belongs to. We model the data as User-Opinion-Rating tripartite graphs, from which the matrices are derived.
Sparsity can be reduced by using information about the community to which a user belongs: an unrated item can be rated approximately according to that community. This improves the quality of recommendations by reducing the sparsity in the data.
Cold-start problem: Recommending items to a new user is difficult because the user's behaviour is new and yet to be learned.
Scalability: The capacity to deal with large-scale data.
Independent and identically distributed: Traditional recommendation systems assume that all users are independent and identically distributed; this assumption ignores the relations among users and does not match how recommendations work in the real world.
A fruitful recommendation system may be developed by considering the following:
Individual users have their own preferences for different items such as movies, books, music, articles and food. Users may be guided mainly by the friends they trust. A user's final judgment is a balance between the user's own opinion and the reviews of trusted friends.
Recommendation systems are broadly classified into four groups [7, 12, 21, 22]: collaborative filtering, content-based, knowledge-based and hybrid recommendation systems:
Collaborative filtering (CF): Early recommendation systems were based on CF techniques [12, 18, 27], which rely on the ratings given by users to products. These systems are of two kinds: user-based CF and item-based CF. However, these systems suffer from the cold-start problem. Social recommendation systems
User-based collaborative filtering: These systems work on the ratings given to products by a user and by the user's neighbourhood. Items are recommended to a user based on the ratings given by other users within the community and on the user's own behaviour. Recommendations are generated for the active user once opinions of similar users have been published. This technique uses the User-Item matrix to predict the rating of the active user from the ratings of similar users for an item.
Item-based collaborative filtering: These systems work on the similarity of items. For example, if a user purchases a product X, future recommendations to that user are based on products similar to X. Item-based CF involves two steps:
Similarity computation: Computes the similarity among the items.
Prediction computation: Predicts items for a user.
Item Based CF overcomes the cold start problem of User Based CF.
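The two steps of item-based CF can be illustrated with a minimal Python sketch. It assumes a small dense rating matrix in which 0 marks an unrated item, and it uses cosine similarity followed by a similarity-weighted average for the prediction. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def item_similarity_matrix(ratings):
    """Step 1: similarity between every pair of item columns.

    ratings[u][i] is the rating of user u for item i (0 = unrated).
    """
    n_items = len(ratings[0])
    columns = [[row[i] for row in ratings] for i in range(n_items)]
    return [[cosine_similarity(columns[i], columns[j]) for j in range(n_items)]
            for i in range(n_items)]

def predict_rating(ratings, sim, user, item):
    """Step 2: similarity-weighted average of the user's other ratings."""
    num = den = 0.0
    for j, r in enumerate(ratings[user]):
        if j != item and r > 0:
            num += sim[item][j] * r
            den += abs(sim[item][j])
    return num / den if den else 0.0

# Illustrative data only: 3 users x 3 items.
ratings = [
    [5, 4, 0],
    [4, 5, 1],
    [1, 0, 5],
]
sim = item_similarity_matrix(ratings)
prediction = predict_rating(ratings, sim, user=0, item=2)
```

Here `prediction` estimates the missing rating of the first user for the third item from that user's ratings of the other two items.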
Content-based filtering: The main attribute of this method is the description of content. This content comprises the user and item profiles, the keywords used and the time attribute associated with the content.
Knowledge-based: This method uses knowledge of the user's needs and priorities to produce quality recommendations. Because it is static, it does not respond to short-term variations.
Hybrid recommender systems: These integrate several recommendation systems. They combine the advantages of the individual methods and address issues such as sparsity and loss of data. Because a wide variety of algorithms is used, the architecture of hybrid recommender systems is more complex.
Relationships such as user-user were not used in traditional recommendation systems, although they are very prominent in social networks. When used, these relationships produce quality recommendations, because a user usually shows affinity towards products similar to those liked or purchased by neighbours or friends. To predict the missing values, the relationships of users, their opinions and their ratings are studied. Models implemented in various SRSs are given in Table 1.
Zhou et al. [25] proposed an incremental Singular Value Decomposition (SVD) method, called Incremental ApproSVD, which combines the Approximating SVD (ApproSVD) algorithm with the Incremental SVD algorithm.
Park et al. [26] used a k-nearest neighbour (k-NN) graph to implement Reversed CF (RCF), a rapid CF algorithm. This methodology reverses the process of finding k neighbours, rather than discovering
To counter the cold-start user problem, Deepika et al. [5] proposed the Collaborative filtering and Community based Social Recommendation System (CCSRS). That approach used two prediction matrices derived from the User-User social graph and the User-Item bipartite graph. The present work extends CCSRS by considering users' opinions. To relate users and items of interest, communities are derived using the Louvain community detection algorithm [23] for SRS. The algorithm is built on modularity optimization and scales to large networks. It follows a greedy approach that seeks the best local improvement at each phase. The Louvain algorithm begins with every node in the network belonging to its own community, so that initially every community is a singleton consisting of a single node. The algorithm then uses a local moving heuristic to obtain an improved community structure: individual nodes are moved from one community to another until no further increase in modularity is observed. At this point, a reduced network is built in which every node corresponds to a community in the original network. Based on the resulting community structure, a second reduced network is constructed. The Louvain algorithm proceeds in this way until a network is obtained that cannot be reduced further.
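The local moving phase described above can be sketched in plain Python. This is a simplified illustration, not the implementation used in the paper: it covers only the first phase (the network reduction step is omitted), assumes an undirected, unweighted graph, and recomputes modularity from scratch for each candidate move instead of using the incremental gain formula. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict node -> community label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    degree_sum = defaultdict(int)
    for node, d in degree.items():
        degree_sum[community[node]] += d
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

def local_moving(edges):
    """Greedy local moving: start from singletons, move nodes while Q grows."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {n: n for n in nodes}   # every node is its own community
    improved = True
    while improved:
        improved = False
        for node in nodes:
            original = community[node]
            best_label, best_q = original, modularity(edges, community)
            # Try the community of each neighbour and keep the best gain.
            for label in sorted({community[nb] for nb in neighbours[node]}):
                community[node] = label
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_label, best_q = label, q
            community[node] = best_label
            if best_label != original:
                improved = True
    return community

# Illustrative graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = local_moving(edges)
q_final = modularity(edges, partition)
```

On this toy graph the sketch groups each triangle into its own community. In practice a library implementation would normally be preferred over this quadratic-time illustration.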
Motivation
Every real-world entity is associated with a great deal of information on the Internet. In practice, a web user looks into the ratings, opinions and reviews of a product before purchasing it. The available information may not reflect the quality of the product. Hence there is a need to provide recommendations that reflect the quality of the product or item, based not only on user ratings but also on opinions. A number of methods are available for building recommendation systems. To determine item ranking, recommendation systems based on CF techniques use the User-Item bipartite graph [5, 6, 14, 15]. The concept of bipartite graphs used for recommendation systems is explained in the next subsection.
Bipartite graph
A Graph
Bi-Partite graph between users and items.
In the traditional recommendation systems, the two sets correspond to Users and Items and the bipartite graph reveals the items in the Items set appraised by the users in the Users set [5]. Figure 1 shows a sample bipartite graph and its adjacency matrix; where,
In general, when a customer wants to purchase an item, he or she considers a number of parameters such as quality, durability and performance. Likewise, purchasers rely on item opinions and reviews posted on websites or on feedback from their friends. Therefore, the recommendation of an item does not depend exclusively on either opinion or rating.
The rating of a product is assigned by the user, by ranking the product on its quality or on its performance compared with other products of the same category. A review (or opinion) is a detailed description shared by the user of his/her experience with the product, and it serves as a recommendation for other users.
Ratings and reviews both offer knowledge, and they are different ways to identify which items, services or businesses shoppers wish to buy from. Reviews can be written in a variety of forms and can be considerably more informative. A review can be accompanied by a rating, or can stand alone as a social media post, blog entry, report, or even a video. A review takes more effort to examine, but reviews have the potential to provide genuine insights and more data. Star ratings do not give deep insight into item-related characteristics. A rating is a form of generalization that offers quick access to how a business or service will perform without investing an excessive amount of time. These generalizations have some limitations: they give the result but not the reasoning behind the rating. Reviews are significant because they reveal real-world insights. The main issue with reviews is that the information is a point of view and a sentiment. A combination of reviews and ratings is a good measure for recognizing the genuine value of an item. By giving shoppers the chance to rate and review items, customers get the essential information alongside the summarized rating. Hence, to generate recommendations in this work, we combine both reviews and ratings and construct the two tripartite graphs.
Figure 2 shows the high-level view of proposed approach. The proposed approach mainly consists of three steps:
Detect user communities from the User-User social graph.
Compute prediction matrices (see Section 3.3) using user ratings and user opinions.
Generate recommendations of items to users based on the prediction matrix values.
A graph is termed as tripartite graph if and only if a graph
Block diagram for proposed SRS approach.
In the proposed work, four sets, namely Users, Items, Ratings and Opinions, are taken into account to build the tripartite graphs.
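One possible way to hold the two tripartite graphs in memory is sketched below in Python. The representation is an assumption made for illustration: each graph is stored as a list of (user, item, value) triples plus a nested user-to-item mapping, and opinions are assumed to have already been reduced to a numeric polarity. The record values and the function name are hypothetical.

```python
# (user, item, rating, review polarity) -- illustrative values only.
records = [
    ("u1", "m1", 5, 0.8),
    ("u1", "m2", 3, -0.1),
    ("u2", "m1", 4, 0.6),
    ("u3", "m2", 2, -0.7),
]

def build_graph(records, value_index):
    """Return the triples and a user -> item -> value map for one graph.

    value_index selects the third node set: 2 for ratings, 3 for opinions.
    """
    triples = [(r[0], r[1], r[value_index]) for r in records]
    matrix = {}
    for user, item, value in triples:
        matrix.setdefault(user, {})[item] = value
    return triples, matrix

# User-Item-Rating and User-Item-Opinion graphs built from the same records.
rating_triples, rating_matrix = build_graph(records, 2)
opinion_triples, opinion_matrix = build_graph(records, 3)
```

Both graphs share the Users and Items sets and differ only in the third set, which is why a single helper can build either one.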
Consider a set of three users
Community discovery aims at partitioning the network into a set of communities of densely connected nodes, with nodes belonging to different communities being only sparsely connected. The Louvain community detection algorithm is used to find these communities.
Example: Figure 4 shows sample user-user social graph consisting of two communities
User-movie-review-rating data
User-movie-review polarity-rating data
Example tripartite graph and its representation.
Steps for computing recommendations for a new user
Steps for computing recommendations for an existing user
Consider first 3 users
Calculating prediction matrix values based on user ratings: Using the rating model, a User-Item-Rating tripartite graph is first constructed (Fig. 3) for each user community obtained from the community detection model.
Communities 
Table 6 shows the notations used in proposed approach. The description of User-Item-Rating tripartite graph model is given by the following equations:
Using user rating Prediction matrix
where
Using Eq. (3.3) Prediction matrix
Notations
Analogous to user ratings, a User-Item-Opinion tripartite graph can be obtained using the opinion model for each user community. The description of the User-Item-Opinion tripartite graph model is given by the following equations.
Similarity computation [5, 13] is performed as,
Prediction matrix
Second prediction matrix
For attaining optimal prediction values for recommendations, the overall
where
Two types of users are considered for item recommendations: (a) cold-start users and (b) existing users. A new user can be added at any time, which poses the challenge of finding appropriate recommendations for the newly joined user. To obtain recommendations for cold-start users, the top n items with the maximum average opinion-rating prediction values are considered.
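The cold-start selection can be illustrated with a short Python sketch. It rests on assumptions that are not confirmed by the text: the rating-based and opinion-based prediction matrices are averaged element-wise, each item is scored by its maximum averaged value over the users, and both matrices are on the same scale. The function name and the matrices are hypothetical.

```python
def cold_start_top_n(rating_pred, opinion_pred, n):
    """Pick the n items with the highest maximum averaged prediction.

    rating_pred and opinion_pred are users x items matrices (lists of rows).
    """
    n_items = len(rating_pred[0])
    scores = []
    for i in range(n_items):
        # Average the two predictions for item i, user by user.
        averaged = [(r_row[i] + o_row[i]) / 2
                    for r_row, o_row in zip(rating_pred, opinion_pred)]
        scores.append((max(averaged), i))
    # Highest score first; ties broken by the lower item index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:n]]

# Illustrative prediction matrices: 2 users x 3 items.
rating_pred = [[4.0, 2.0, 3.5], [3.0, 4.5, 1.0]]
opinion_pred = [[5.0, 1.0, 3.5], [3.0, 3.5, 2.0]]
top_items = cold_start_top_n(rating_pred, opinion_pred, 2)
```

Because the scores do not depend on the new user, the same top-n list can be served to every cold-start user until that user has produced ratings or opinions.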
Recommendation to cold start user
To attain at recommendations to cold start users top
where
Recommendation to existing user
For recommendations to existing user, we combine the values of average opinion-rating prediction values and opinion-rating collaborative filtering values and then top
where Here,
Experimental results show the evaluation of proposed UOR approach using real world datasets namely Movielens dataset (
Prediction coverage
Coverage is defined as the percentage of recommendations produced by the system. Let’s assume
Prediction coverage comparisons for CF, CCSRS and UOR.
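As a rough illustration of the coverage measure, the following Python sketch assumes that coverage is the percentage of requested user-item pairs for which the system is able to produce a prediction. This reading, the function name and the sample data are assumptions made for the example.

```python
def prediction_coverage(predictions):
    """Percentage of requested (user, item) pairs that received a prediction.

    predictions: dict mapping (user, item) -> predicted value, or None when
    the system could not produce a prediction for that pair.
    """
    if not predictions:
        return 0.0
    produced = sum(1 for value in predictions.values() if value is not None)
    return 100.0 * produced / len(predictions)

# Illustrative data: one of four pairs could not be predicted.
predictions = {
    ("u1", "m1"): 4.2,
    ("u1", "m2"): None,
    ("u2", "m1"): 3.1,
    ("u2", "m2"): 2.0,
}
coverage = prediction_coverage(predictions)
```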
Results for First-N items (
Mean absolute error (MAE) is the average of the absolute differences between the true and the predicted ratings [5]. The lower the MAE, the higher the prediction accuracy. MAE measures the average magnitude of the errors in a set of predictions [5, 11]. Eq. (11) describes the MAE as follows [19].
where
MAE comparisons for SVD, KNN and UOR.
Results show that KNN has a better MAE value than the SVD approach, whereas the proposed UOR approach gives a better value than KNN. This indicates that the proposed approach outperforms the existing methods.
RMSE computes the mean of the squared differences between the true and the predicted ratings and then takes the square root of the result [5, 8]. Consequently, large errors strongly affect the RMSE, making it a very useful metric for bringing out large errors. RMSE is
Results are compared with the Singular Value Decomposition (SVD) [16, 25] and K-Nearest Neighbor (KNN) [26] approaches. The data sets used in the existing methods do not contain review information. A new synthesized data set is therefore created by combining the Movielens and Amazon review data sets. The new data set consists of fields such as User ID, Movie Name, Movie Rating and Movie Review. Using sentiment analysis, the user reviews in the Amazon review dataset are converted into ratings on a scale of 5.0. Other fields such as user ID are transformed to match the format of the Movielens dataset. Figure 7 shows the comparison of RMSE values for the proposed approach with the existing methods SVD and KNN.
RMSE comparisons for SVD, KNN and UOR.
The figure shows that KNN has a better RMSE value than the SVD approach, whereas the proposed UOR approach shows better results than both the SVD and KNN approaches.
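The two error metrics discussed above follow their standard definitions and can be computed as in the Python sketch below. The sample ratings are made up for illustration and are not results from the paper.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Illustrative true and predicted ratings.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 4]
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

In this example the single error of 2 raises the RMSE above the MAE, which shows why RMSE is the more sensitive of the two to large errors.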
Precision is the fraction of the recommended items that are actually relevant to the users concerned [5]. Precision is computed as:
Recall represents the fraction of relevant items that are also part of the set of recommended items [5]. Recall is computed as:
Precision and recall for the existing and cold start user.
If an item
Figure 8 shows the values of precision and recall for the existing user and the cold-start user, respectively. In addition to precision and recall, the parameters in Table 7 give more insight into the performance of the proposed approach. Table 7 shows the top-10 recommendations for the cold-start user with user ID 944.
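Precision and recall for a single user can be computed from the set of recommended items and the set of relevant items, as in the following Python sketch. How relevance is decided (for example, by a rating threshold) is left open here. The function name and the item identifiers are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list.

    precision = |recommended and relevant| / |recommended|
    recall    = |recommended and relevant| / |relevant|
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Illustrative data: 2 of the 4 recommended items are relevant.
recommended = ["m1", "m2", "m3", "m4"]
relevant = ["m2", "m4", "m7"]
precision, recall = precision_recall(recommended, relevant)
```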
Top 10 recommendations for a new user with ID 944
In this paper, we addressed two main challenges. The first, highlighted by many recommendation systems, is the cold-start problem. The second is to improve the accuracy of the recommendations made by the system. This paper proposed a new approach (UOR) to handle these issues with existing recommendation systems. Three datasets are used for the experimental work: Movielens, Amazon and email-Eu-core. This work uses user ratings and user opinions to predict recommendations. Experiments on a synthesized dataset created from the Movielens and Amazon instant video review datasets demonstrate an improved movie recommendation system. The performance of the proposed approach is measured using MAE, RMSE, precision, recall and coverage. The outcomes of the experiments on the datasets indicate that User-Opinion-Rating (UOR) provides better performance with respect to coverage and accuracy than the existing approaches.
