Abstract
User-based collaborative filtering typically considers the set of users who rated a target item, computes the similarity between each of them and the target user to select the target user’s neighbors, and then extrapolates the target user’s rating from the neighbors’ ratings. This traditional approach uses only the neighbors’ ratings when computing recommendations. However, according to our study, the ratings of dissimilar users still significantly influence the prediction of the target user’s rating. In addition, a user often takes multiple criteria into consideration when choosing a video to watch. We analyzed how users choose a video: they often explore genres or tags and then read the abstract before choosing a video to watch. Their ratings and the information about a video are therefore strongly correlated. Based on this observation, we propose a new collaborative filtering method for video recommendation built on a fuzzy neural network. The fuzzy neural network is used to learn users’ ratings with respect to their behavior. The idea is to fit a neural network model whose input is users’ behavior and whose output is their ratings for each target video. Concretely, the behavior of a user (the user profile) is learned from the user’s ratings and the information about the corresponding videos. For each target video, the profiles of all users who rated it are collected. Each profile is then treated as an input of the fuzzy neural network, and the corresponding rating value is treated as its output. The rating of a user on the target video is predicted by the trained neural network. Experiments with the Netflix dataset show that the proposed method is effective.
Introduction
Nowadays, a growing number of users share their opinions about TV programs with others through online social networks. Accordingly, a variety of platforms have been developed for Smart TV [14, 19] and Social TV [3]. In [3], NoTube was proposed to reduce the gap among the social web, TV, and customers. With Social TV, users can be provided with TV programs and videos (through both content-based and collaborative filtering) in a personalized social context [31]. Collaborative Filtering (CF) builds a model to predict items that the user may be interested in [28]. The model takes into account the target user’s past behavior (i.e., purchased items or ratings given to those items) and similar decisions made by other users. Meanwhile, Content-based Filtering [13, 27] recommends items based on the items’ characteristics and user preferences. CF algorithms can be categorized into two groups: memory-based collaborative filtering (MeCF) and model-based collaborative filtering (MoCF). With MeCF methods, the items preferred by users who share similar preferences with the target user are recommended [7]. With these algorithms, all ratings, items, and users are stored in memory. With MoCF methods, the collection of ratings is first used to train models that identify patterns in the input data [11]. Then, a list of recommended items is generated based on the models. In other words, MeCF methods keep the training data in memory and delay generalization until a recommendation is requested, whereas MoCF methods generalize a model from the training data before any recommendation is made. In addition, MeCF methods are suitable for problems with fewer parameters to be tuned. However, they do not seem to handle data sparsity in a principled manner [5].
In Social TV, users can access appropriate TV programs through recommendation systems that take into account viewing history data and map social users’ preferences to TV program attributes [2, 6]. In [5], the authors proposed a hybrid approach combining content-based and collaborative filtering methods for recommending TV programs. The singular value decomposition technique [12], which reduces the dimension of the user-item representation, is applied to reduce the computational overhead of CF. This technique performs well in the TV domain when the low-rank representation is used to generate item-based predictions. In [18], the authors developed a framework for adaptive news recommendation in social media based on user comments. A topic profile (a weighted graph) is built from user comments. The standard TF×IDF model [4] and a variant of the PageRank algorithm [8] are applied to generate the weighted importance of topics. Then, relevant news is selected from a set of news articles in the database using the constructed topic profile. For this purpose, a retrieval module is constructed that combines the strengths of two state-of-the-art news retrievers, one based on a time factor [9] and one based on a language model [15].
Previous video recommendation methods often use a single rating as the input and extrapolate among similar users or items. Most recently, Lee et al. [17] proposed a graph-based recommendation system. In this approach, the user-item matrix is converted to a weighted adjacency matrix to create a graph of items with links representing positive relations. The items’ relevance to the target user is identified using the occurrence values of each node, and novel items are obtained using the entropy values of the nodes. The novel and relevant items are filtered before being recommended. Collaborative filtering is currently considered the most popular recommendation strategy; in it, similar users (the so-called neighbors) are usually identified by similar preferences. The most important reference in collaborative filtering is user ratings. However, the number of ratings already obtained is usually very small in comparison with the number of ratings to be predicted. This phenomenon is known as data sparsity: users select and share only a few of the many items in the system [1]. The authors of [30] propose a recommendation mechanism using association rules. Recently, a new strategy was proposed to improve neighborhood formation based on semantic reasoning in order to overcome the fake neighborhood problem [20]. This approach measures semantic similarities between different items instead of relying on common rating-based similarity. The most popular approach to the sparsity problem is the hybrid method [16], which integrates different strategies such as demographic-based, collaborative, and content-based recommendation. Conflicting recommendations among methods (versions) can be resolved by using a consensus method [26].
According to our observation, a user often considers multiple criteria when choosing a video to watch. In social media, a video is often associated with information including genres, an abstract, and tags. We also analyzed how users choose a video [25]: in addition to genres or tags, users read the abstract. From this fact, we can state that there exists a strong correlation between users’ ratings and videos’ information. Therefore, we propose a novel model-based collaborative filtering method that uses a fuzzy neural network to learn users’ ratings with respect to their behavior for video recommendation. The main idea of the proposed method is to fit a neural network model. In this model, the input is the behavior of users (learned from the users’ ratings and the information about the corresponding videos) and the output is the users’ ratings. The behavior of a user is called a user profile. For each target video, the profiles of all users who rated it are collected. Each user profile is then treated as an input of the neural network, and the corresponding user’s rating value is treated as its output. The rating of a user on the target video is predicted by the trained neural network. Experiments with the Netflix dataset show that the proposed method is effective.
User behaviors-based collaborative filtering using ANFIS
Approach overview
The ANFIS process is described in Fig. 1. A user’s behavior is learned from rating values and video information such as genres, tags, and the abstract. The feature collection (including keywords) is extracted from this video information. The TF/IDF model is applied to the feature collection to generate the user profile. Note that the TF of a term combines the term frequency with the rating value of the corresponding video whose information contains the term. The number of nodes in the hidden layer of the neuro-fuzzy network is determined by clustering the set of user profiles. Each user profile is transformed into a vector of distances between the profile and the clusters, and this vector is treated as an input of the neural network. Note that the number of components in this vector equals the number of clusters. The output is the corresponding rating value. The rating of a user on the target video is predicted by the trained ANFIS model.
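The transformation of a user profile into a distance vector can be sketched as follows. This is a minimal illustration that assumes each cluster is summarized by a single center point and that Euclidean distance is used; neither choice is stated in the text, and the function name `profile_to_cluster_distances` is hypothetical.

```python
import numpy as np

def profile_to_cluster_distances(profile, centers):
    """Map one user profile to its distance from each cluster center.

    Assumption: clusters are represented by center points and the
    distance is Euclidean. The result has one component per cluster.
    """
    profile = np.asarray(profile, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Broadcasting subtracts the profile from every center row.
    return np.linalg.norm(centers - profile, axis=1)

# Two illustrative cluster centers in a 2-dimensional profile space.
centers = np.array([[0.0, 0.0], [3.0, 4.0]])
vec = profile_to_cluster_distances([0.0, 0.0], centers)
```

The length of `vec` equals the number of clusters, which matches the note that the network input has one component per cluster.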
Profile modeling
The definition of a user profile, which is presented in the form of a weighted vector [10], is as follows:
The term frequency-inverse document frequency (tf-idf) weight is a statistical measure representing the importance of a term to a document in a corpus. It is often used in the fields of information retrieval and text mining. In this paper, the traditional vector space model (tf-idf) is used to define the features of the documents [10].
- |D|: the size of the corpus (i.e., the number of documents).
- |{d : t_i ∈ d}|: the number of documents that contain the term t_i. It is common to use 1 + |{d : t_i ∈ d}| instead, to avoid division by zero when a term does not appear in the corpus.
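For reference, the standard textbook form of the tf-idf weight that uses these quantities can be written as below. This is a sketch under the assumption that the usual normalized term frequency is meant; f_k denotes the raw frequency of term t_k in the document, and the exact variant used in this paper (in particular the rating-weighted term frequency) may differ.

```latex
\mathrm{tf}_i = \frac{f_i}{\sum_k f_k}, \qquad
\mathrm{idf}_i = \log\frac{|D|}{|\{d : t_i \in d\}|}, \qquad
w_i = \mathrm{tf}_i \cdot \mathrm{idf}_i
```

With the smoothed denominator mentioned above, the inverse document frequency becomes \log\bigl(|D| / (1 + |\{d : t_i \in d\}|)\bigr).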
In this paper, the information about a video is considered a document, and the information about all videos in the video database is considered the collection. Because each user may rate many videos, the information about all videos that a user rated is collected to create that user’s profile. The process of user profile creation is illustrated by the following example:
Note that the Netflix video database does not provide video information. Thus, linked open data 1 is used to obtain the videos’ features. For this purpose, a web crawler (see Figs. 3, 4) is used to retrieve video information such as genre, description, and tags.
Although the raw Netflix dataset contains more than 40 genre labels, there are 37 distinct genres once labels with the same meaning (such as “Sport” and “Sports”) are treated as the same genre. The genres are as follows: Action, Adult, Adventure, Animation, Anime, Biography, Children, Classics, Comedy, Crime, Documentary, Drama, Faith, Family, Fantasy, FilmNoir, Fitness, Foreign, Game Show, Gay, History, Horror, Independent, Lesbian, Music, Mystery, Romance, Sci-Fi, Short, Special Interest, Spirituality, Sport, Talk Show, Television, Thriller, War, Western. Thus, these 37 genres are used as the user-profile features for demonstration.
The genre frequency here is calculated by taking into account the rating of the corresponding video. It combines the raw genre frequency with the rating score (see Fig. 3).
- for the term “action”, |{d : t_i ∈ d}| = 372,014
-
- the frequency of the term “action” is f_i = 46
- we have .
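One way to combine genre frequency with rating scores is sketched below. The text does not specify how the two are combined, so this sketch assumes that each occurrence of a genre is weighted by the rating the user gave to the video; the function name `rating_weighted_genre_frequency` and the sample history are hypothetical.

```python
from collections import defaultdict

def rating_weighted_genre_frequency(rated_videos):
    """Accumulate a rating-weighted count per genre for one user.

    rated_videos: iterable of (genres, rating) pairs, one per rated video.
    Assumption: a genre's weight is the sum of the ratings of the
    videos that carry it (the paper's exact combination is not given).
    """
    freq = defaultdict(float)
    for genres, rating in rated_videos:
        for genre in genres:
            freq[genre] += rating
    return dict(freq)

# Illustrative viewing history: (genres of the video, rating given).
history = [(["Action", "Drama"], 5), (["Action"], 3), (["Comedy"], 1)]
profile = rating_weighted_genre_frequency(history)
```

Under this assumption, a genre that appears often in highly rated videos receives a larger weight than one that appears equally often in poorly rated videos.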
We assume that user u_i rated a set of videos M_i = {m_1, m_2, …, m_k}. The profile of user u_i is described by a feature vector C_i, each component of which corresponds to a term from the information about the videos in M_i and is generated using the tf-idf vector space model described above. The set of n users who rated video m_j (j = 1, …, k) is denoted U_j = {u_1, u_2, …, u_n}. Each user u_i ∈ U_j has a rating score on video m_j, and the set of rating scores of the users in U_j on video m_j is denoted Y_j. For each movie m_j, we seek a mathematical relationship between the input feature vectors of the users in U_j (denoted {C_1, C_2, …, C_n}) and the output rating scores in Y_j. For a given data set of users U_j, the relationship pairs C_i, the feature vector of the ith user in U_j, with the rating score of user u_i on movie m_j. This relationship is expressed by a black-box model. Intuitively, this process can be regarded as system identification. The model works as a mathematical function f that maps each feature vector to the corresponding rating score.
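Collecting the input-output pairs for one target video can be sketched as follows. The data layout (a dictionary keyed by user and video, and a dictionary of profile vectors) and the name `build_training_pairs` are assumptions made for illustration only.

```python
def build_training_pairs(ratings, profiles, video_id):
    """Collect (profile, rating) pairs for every user who rated video_id.

    ratings:  dict mapping (user_id, video_id) -> rating score
    profiles: dict mapping user_id -> feature vector (list of floats)
    Returns the list of input vectors and the list of target ratings.
    """
    inputs, outputs = [], []
    for (user, video), score in ratings.items():
        if video == video_id and user in profiles:
            inputs.append(profiles[user])
            outputs.append(score)
    return inputs, outputs

# Illustrative data: two users rated video 30, one of them also rated 2464.
ratings = {("u1", 30): 4, ("u2", 30): 2, ("u1", 2464): 5}
profiles = {"u1": [0.1, 0.9], "u2": [0.7, 0.3]}
X, y = build_training_pairs(ratings, profiles, 30)
```

Each row of `X` plays the role of a feature vector C_i and each entry of `y` the corresponding rating score, so a separate model is fitted per target video.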
The above mathematical model is a combination of a fuzzy inference system (FIS) and a neural network structure (NNS). It is expressed by a fuzzy-neuron structure (FNS).
Generally, the FIS is built using an algorithm for establishing an adaptive neuro-fuzzy system, ENFS [23]. Shared features or characteristics of the objects, described by hyperbox-type data clusters (obtained with a data-driven method), serve as a structure for establishing the fuzzy sets and membership functions that are then used to build the FIS. Based on clauses describing fuzzy relationships of MISO (multi-input, single-output) type, the fuzzy inference rules in the FIS are built as follows:
Assume that the pattern set T_t is covered by the t-th min-max hyperbox HB_t. HB_t is established by two vertices, the max vertex and the min vertex. If T_t contains only patterns with the same label m, then HB_t is considered a pure hyperbox. An HB can be regarded as a crisp frame to which different types of membership functions (MFs) can be adapted. Here, the original Simpson’s MF is adopted, in which the slope outside the HB is determined by the value of the fuzziness parameter γ.
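The behavior of such a membership function can be sketched as follows. The code uses the commonly cited form of Simpson’s fuzzy min-max membership function and assumes patterns normalized to [0, 1]; the paper’s own Equations (10) and (11) are not reproduced in the text, so this may differ in detail, and the name `simpson_membership` is hypothetical.

```python
import numpy as np

def simpson_membership(a, v, w, gamma=4.0):
    """Membership of pattern a in the hyperbox with min vertex v and max vertex w.

    Returns 1.0 inside the hyperbox; outside, the value decreases
    linearly with slope controlled by the fuzziness parameter gamma.
    """
    a, v, w = (np.asarray(x, dtype=float) for x in (a, v, w))
    # Penalty for exceeding the max vertex in each dimension.
    above = np.maximum(0.0, 1.0 - np.maximum(0.0, gamma * np.minimum(1.0, a - w)))
    # Penalty for falling below the min vertex in each dimension.
    below = np.maximum(0.0, 1.0 - np.maximum(0.0, gamma * np.minimum(1.0, v - a)))
    return float(np.sum(above + below) / (2 * a.size))

inside = simpson_membership([0.3, 0.3], v=[0.2, 0.2], w=[0.4, 0.4])
outside = simpson_membership([0.5, 0.3], v=[0.2, 0.2], w=[0.4, 0.4])
```

A larger γ makes the membership value fall off faster outside the hyperbox, so the fuzzy set approaches the crisp hyperbox as γ grows.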
The training algorithm is organized into the following four steps, as shown in Fig. 6:
Based on the Levenberg-Marquardt algorithm, the NN is trained using the NN-set.
- The values of the MFs are calculated based on Equations (10) and (11);
- The following equations are used for calculating the output of the neuro-fuzzy network:
Stopping condition: the error is calculated from the outputs of the NN-set and the corresponding outputs of the NN as follows:
- If E_N ≤ [E]: the FNS structure based on the NN is chosen;
- If E_N > [E]: set N = N + 1 and go back to Step 3.
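The stopping rule can be sketched as a simple loop. Here `train_and_score` is a hypothetical callable that trains a network of structure size N and returns its error E_N; the error formula itself is not given in the text, and the upper bound `n_max` is an added safeguard that is not part of the described procedure.

```python
def grow_until_accurate(train_and_score, tolerance, n_start=1, n_max=50):
    """Increase N until the error E_N is at most the tolerance [E].

    train_and_score: hypothetical callable, N -> error E_N of the
                     network trained with structure size N.
    n_max:           assumed safety cap so the loop always terminates.
    """
    n = n_start
    while n <= n_max:
        error = train_and_score(n)
        if error <= tolerance:
            return n, error  # accept the structure of size n
        n += 1  # otherwise enlarge the structure and retrain
    raise RuntimeError("no structure met the error tolerance")

# Toy stand-in for training: the error shrinks as 1/N.
n, err = grow_until_accurate(lambda size: 1.0 / size, tolerance=0.25)
```

With the toy error function, the loop stops at the first N whose error does not exceed the tolerance.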
Dataset
The proposed method is evaluated on the Netflix dataset [21, 22], which contains 14,707,483 ratings given by 460,936 anonymous Netflix customers to 17,770 videos from 1999-11-11 to 2005-12-31. In this dataset, the rating scale values are: 5 (excellent), 4 (good), 3 (good), 2 (fair), and 1 (poor).
As shown in Section 2.1, although the raw Netflix dataset contains more than 40 genre labels, there are 37 distinct genres once labels with the same meaning (such as “Sport” and “Sports”) are treated as the same genre.
The users’ rating scores are stored in a table named rating. It has 14.7 million records; each record represents a single rating of one video_id by one user_id, along with some additional information. Among the 14.7 million anonymous ratings, score 1 is the least frequent with 632 thousand ratings, followed by score 2 with 1.4 million ratings and score 5 with 3 million ratings; score 4 is the most frequent with 4.7 million ratings, followed by score 3 with 4 million ratings, as shown in Fig. 10.
In the experiments, a number of videos are first randomly selected (here, the 4 videos with ids 2464, 2578, 2848, and 30). Then, for each video, we collect 1100 users who rated the video. The users are then divided into 2 sets: a training set of 1000 users and a testing set containing the remaining 100 users.
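The split into training and testing users can be sketched as follows. The text does not say how the 1100 users are divided, so this sketch assumes a seeded random shuffle; the name `split_users` and the integer user ids are illustrative.

```python
import random

def split_users(user_ids, n_train=1000, seed=0):
    """Shuffle the users and split them into a training and a testing set.

    Assumption: the split is random; a fixed seed keeps it reproducible.
    """
    shuffled = list(user_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# 1100 illustrative user ids for one target video.
train_users, test_users = split_users(range(1100))
```

With 1100 users and `n_train=1000`, the testing set contains the 100 users whose predictions are reported per video.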
Evaluation methods
Accuracy metrics
In this paper, the quality of recommendation is measured using MAE (Mean Absolute Error) and RMSE (Root Mean Squared Error). Concretely, MAE and RMSE are defined as follows:
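The two metrics can be sketched in code using their standard definitions: MAE is the average of the absolute differences between predicted and real ratings, and RMSE is the square root of the average squared difference. The sample values below are illustrative and are not taken from the experiments.

```python
import math

def mae(predicted, actual):
    """Mean absolute error: average of |p - r| over all test samples."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error: square root of the average of (p - r)^2."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual))

# Illustrative ratings: one prediction is off by 3, one by 1, two are exact.
predicted = [4.0, 3.0, 5.0, 1.0]
actual = [4.0, 2.0, 5.0, 4.0]
mae_value = mae(predicted, actual)
rmse_value = rmse(predicted, actual)
```

In this example the single large error raises RMSE well above MAE, which illustrates why a large gap between the two metrics points to a few badly predicted samples.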
According to their definitions, the lower the values of MAE and RMSE, the higher the prediction accuracy. In addition, RMSE penalizes large errors heavily because the errors are squared. To evaluate the proposed method, we compare our approach with the matrix factorization approach [29]. In that approach, both users and items are mapped to a joint latent factor space of a given dimensionality, such that user-item interactions are modeled as inner products in that space.
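The prediction step of the matrix factorization baseline can be sketched as follows. This shows only the inner-product prediction under the assumption of plain latent factors without bias terms; how the factors are learned is omitted, and the factor values and the name `predict_rating` are illustrative.

```python
import numpy as np

def predict_rating(user_factors, item_factors):
    """Predicted rating as the inner product of user and item latent vectors."""
    return float(np.dot(user_factors, item_factors))

p_u = np.array([1.0, 0.5])  # illustrative latent factors of one user
q_i = np.array([2.0, 2.0])  # illustrative latent factors of one item
r_hat = predict_rating(p_u, q_i)
```

Both vectors must have the same length, which is the dimensionality of the joint latent factor space.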
The following figures (Figs. 11, 12, 13 and 14) show the results for 100 testing samples of movies 30, 2464, 2578, and 2848.
In the case of video 30, the number of samples with rating scores 1 and 2 is very small in comparison with the number of samples with rating scores 3, 4, and 5. Thus, the prediction accuracy for test samples with rating scores 1 and 2 is low, as shown in Fig. 11.
The predicted values for videos 2464 and 2848 are closer to the corresponding real values than those for video 30. This is because, in the training datasets of videos 2464 and 2848, the rating scores 1, 2, 3, 4, and 5 are significantly better balanced. Therefore, their test results are better (see Figs. 12, 13).
In the case of video 2578, the results are the best because its samples are more balanced (in terms of rating scores) than the samples of movies 30, 2464, and 2848 (see Fig. 14).
According to the comparison of MAE and RMSE among videos 30, 2464, 2578, and 2848 (see Fig. 15), the difference between MAE and RMSE is significant for video 30 because the difference between the predicted and real ratings of the testing samples with rating scores 1 and 2 is very large. From this fact, we can state that the more balanced the samples’ rating scores, the more accurate the prediction. Moreover, our approach (ANFIS) is more effective than the MF approach both when the samples’ rating scores are balanced and when they are not.
Conclusion
In this paper, a behavior-based video recommendation method has been proposed. The behavior of a user, which is represented by a feature vector, is learned from the user’s ratings and the information about the corresponding videos. For each target video, the profiles of all users who rated it are collected. Each profile is then treated as an input vector of the fuzzy neural network, while the output of the fuzzy neural network is the rating score of the corresponding user. The experiments with the Netflix dataset show that the proposed method is more effective than the well-known matrix factorization method. According to the results, we can say that the ratings of users who are not similar to the target user still significantly influence the prediction of the target user’s rating. In addition, the more balanced the training samples’ rating scores, the more accurate the prediction.
Footnotes
www.imdb.com
Acknowledgments
This work was supported by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2013.12.
