Abstract
Modelling users’ dynamic preference for personalized temporal recommendation has been a hot research topic. Traditional dynamic recommendation models divide a user’s interaction history into fixed-sized windows to learn the user’s evolving preference. This strategy however worsens the issue of data sparsity in that some sessions may have very little or even no interaction for preference inference. To alleviate the data sparsity issue and avoid errors due to data imputation that is commonly adopted by existing models, a novel session-based dynamic recommendation model that divides a user’s interaction history with dynamic window size is proposed. An empirical study on the users’ activity life cycles using real-world dataset is conducted to demonstrate the nonuniformness and aggregation of the users’ behavior patterns on the time dimension. Based on the study, a user’s interaction history is divided with dynamic temporal window size by using dynamic clustering. The user’s evolving profile is then constructed by modeling her preference in each session using Latent Dirichlet Allocation (LDA) and a time sensitive weighting scheme. Our dynamic model is designed for two major recommendation tasks: (1) top-N recommendation, which is provided by measuring the relevance of probabilistic topic distribution between the user’s profile in the next temporal domain and each candidate item, (2) rating prediction that is achieved by finding
Introduction
With the rapid growth of the Internet, huge amount of information is generated every moment on the Web. By retrieving useful information, personalized recommendation has become a popular method to handle information overload. Existing recommendation models can be classified into three categories: (1) Content-based filtering, which compares correlations between the content of items and a user’s preference and recommend items similar to the user’s consumption history [1]. (2) Collaborative filtering (CF), which is the most successful recommendation algorithm, aims at predicting a target user’s preference based on similar users’ or items’ feedback information [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. (3) Hybrid methods, which combine the content-based and collaborative filtering methods in different ways [12, 13, 14, 15, 16, 17, 18].
Traditional recommendation models assume static user preference patterns. However, in real-world recommender systems, users’ preference may drift over time. For instance, in movie recommendation scenario, a user who is fond of action movies may change his preference to romance movies due to various factors, e.g., having a new relationship. Time is thus an important factor in capturing users’ dynamic preference for personalized recommendation. Quite a few works [19, 20, 21] considering temporal dynamics have been proposed in recent years. Some methods [19, 20] introduced time as a universal dimension shared by all users. However this assumption may not always make sense because the time dimension is a local effect. Koren [21] modeled time factor for each user separately in a factorization mode, but the performance improvement was not significant. On the other hand, session-based CF [22] split a user’s interaction history into fixed-sized time windows (sessions) to capture the user’s behavior patterns. Particularly, some works [23, 24, 25, 26] combined Latent Dirichlet Allocation (LDA) with session-based CF, where the probabilistic topic distribution for each session was derived to model users’ dynamic preference for recommendation. However, dividing users’ behavior histories into fixed-sized windows further worsen the data sparsity issue, which refers to the corresponding user-item rating matrix is prevail with missing values because most users rate only a small number of items and a popular item is rated by a few users. That leads to some sessions may have very little or even no interaction information for model building. To handle this issue, cross-domain CF was applied to leverage knowledge across different domains. For instance, in [27], the authors proposed a rating-matrix generative model for effective cross-domain CF to fill the missing rating for both existing and new users. However, the shared knowledge across multiple related domains may be well hidden and not easily obtained. Moreover, filling missing interaction information inevitably introduces noisy data which may influence the accuracy of the model.
In order to alleviate the data sparsity and avoid errors due to data imputation, in this paper, by combining temporal contexts, LDA and dynamic clustering [28], we model session-based temporal dynamic recommendation where a user’s interaction history is segmented using dynamic temporal window size. We first provide an experimental study on users’ activity life cycles using real-world datasets and demonstrate that dynamic clustering is a promising approach to divide users’ interaction histories due to the nonuniformness and aggregation of the users’ behavior patterns on the time dimension. Then, we conduct dynamic clustering on each user’s consumption history based on the dynamic temporal window size, and determine the best number and partition of sessions for each user. We construct a user’s profile based on probabilistic topic distribution by using LDA for each session and a time sensitive weighting scheme. The topics are latent factors that represent a user’s preference on particular items. Specifically, we treat the user set as the corpus, and each user corresponds to a document; for each document, the interacted items of the target user are treated as words, and the corresponding ratings are used as the frequency.
Our model is implemented for two common collaborative filtering tasks: top-N recommendation, which measures the relevance of probabilistic topic distribution of a user in the next temporal domain and that of each candidate item, and rating prediction, which finds
The contributions of our work are summarized as follows:
We demonstrate the nonuniformness and aggregation of users’ behavior patterns on the time dimension by studying real datasets. The analysis shows that it is unreasonable to divide the entire interaction history with fixed-sized time windows which can worsen the data sparsity issue. We apply dynamic clustering to find the optimal session number to divide a user’s interaction history, and derive the user preference drift and trend patterns over time by using LDA in each session and the weighted time framework. This approach is able to alleviate data sparsity issue and avoid the limitation of data imputation methods that try to fill the missing interaction information from different domains. Comprehensive experiments conducted over real world datasets demonstrate that our model significantly outperforms the state-of-the-art dynamic recommendation models.
The remainder of this paper is organized as follows. In Section 2, we provide a brief overview of related works for recommender system in general and temporal dynamic recommendation models. Section 3 presents our novel algorithm for temporal domain division based on dynamic clustering with unfixed time window size. Section 4 elaborates the user profiling generation based on temporal domain division by using LDA. The methods for top-N recommendation and rating prediction are introduced in Section 5. In Section 6, we conduct real data based experiments to evaluate the performance of our approach, followed by conclusions and future works outline in Section 7.
In this section, we briefly review recommendation models, which can be categorized into three groups: content-based filtering, collaborative filtering and hybrid methods. We also discuss the techniques of recommendation considering the temporal dynamics and highlight the difference between our proposed method and other existing works.
Content-based recommendation
Content-based filtering [29] tried to recommend items similar to a user’s interacted items. The similarity was typically measured using content information such as texts. For instance, based on texts, items were represented using the vector space model (e.g. TF-IDF) [30, 31], or probabilistic topic distributions obtained by language models, e.g. probabilistic latent semantic indexing (PLSI) and LDA [23]. However, recommendation purely based on content-based filtering approaches suffer from many drawbacks such as strong dependence on the availability of content, constant user preference (over-specialization) and so on.
Collaborative filtering
Typical CF approaches collectively model past feedback information, and recommend new items to a user based on the feedback information of other similar users and/or items. Memory-based methods [2, 3] found
Hybrid filtering
Hybrid filtering [13, 14, 15, 16, 17, 18] explored different strategies to combine content-based filtering and collaborative filtering for performance improvement. In [14], a tag recommender system was proposed by fusing collaborative filtering and content-based recommendation technique. The former exploited the user’s and the community’s tagging behavior for producing recommendations, while the latter exploited heuristics to extract tags directly from the textual content of resources. In [16], the authors proposed a hybrid approach for movie recommendation. The recommendation is generated by triggering either a content-based prediction or a collaborative filtering prediction based on the monitoring of certain model parameters.
Recommendation with temporal context
Contextual information such as location, time, emotion, environmental conditions have been widely used to describe and characterize the items we like. Time is an important factor that influences recommendation accuracy. Most time-aware models add temporal aspects into the collaborative filtering model. For example, in [19], Sun et al. introduced time as a universal dimension shared by all users. In [20], the authors treated temporal domain as the third dimension to construct a user-item-time tensor, where tucker decomposition was applied to factorize the tensor. Khoshneshin and Street [34] dynamically assigned users/items to different clusters based on evolutionary co-clustering. Ding and Li [35] proposed a framework to compute the time different weights of diverse items. Koren [21] proposed to track the time changing behavior throughout the life span of the data and modeled time factor for each user separately via CF. Liang et al. modeled long-term and short-term preferences simultaneously by using session-based temporal graph [36]. Lathia et al. [37] investigated the temporal diversity of items to generate top-N recommendations.
In order to model user-interest distribution over items, one possible approach was to use probabilistic topic models, e.g. Latent factor model (LFM), PLSI and LDA [38]. Session-based CF [22] used session information to capture sequence and repetitiveness in the process. In [39], Zheleva et al. developed a session-based hierarchical graphical model using LDA. In [25], Li et al. selected news groups using long-term profile and recommended candidates based on the short-term user profile by comparing probabilistic topic distribution using LDA and increased the diversity of news list by using absorbing random walk model. In reference [40], Hong et al. formalized a user’s long-term preference using the item taxonomy, which identified the exact stage of user and provided an item list to the user through other similar user within the same stage by using clustering-based and graph-based recommendation method.
Most session-based methods divided a user’s interaction history into some sessions using fixed temporal domain size, which however suffered from serious data sparsity. To handle this issue, a line of research rely on data imputation techniques was proposed. In [27], Li et al. proposed a rating-matrix generative model (RMGM) for effective cross-domain collaborative filtering to fill the missing rating for both existing and new users. In [38], the authors modeled user-interest profile drift over time by using RMGM to fill missing scores over temporal domains and adopted the cross-domain CF framework to share the static group-level rating matrix across temporal domains. LDA based user-interest distribution over item groups was designed to drift slightly between successive temporal domains. These models were built based on the assumption that each user has multiple counterparts over temporal domains and successive counterparts were closely related. However, due to the influence of external factors, the user interest in the adjacent temporal domains may change dramatically. Moreover, the shared knowledge across multiple related domains may be well hidden and not available.
Different from existing approaches, in our work we split a user’s interaction history by using dynamic clustering, which flexibly creates sessions with variable time window lengths. Users’ dynamic preference in different sessions is represented and learned using LDA and a decreasing time framework.
Temporal domain division based on dynamic clustering
Analysis of users’ behavior patterns
We choose two datasets, i.e., the Movielens dataset and the Eachmovie dataset that are widely used in recommender systems community to analyze users’ behavior. The Movielens dataset records 14,111 users’ ratings to 9,180 movies from January 1st, 2006 to January 5th, 2009. Eachmovie dataset contains 36,658 users’ ratings to 1,613 movies from January 9th, 1995 to September 15th, 1997. Note that for both datasets, the users with less than 20 ratings are removed.
Users’ activity life cycle.
We analyze users’ daily activities, which are measured by the number of watched items in their life cycles. Based on statistical analysis of users’ behavior characteristics in the time dimension, users can be divided into three categories according to the length of active time of the user after we split the user’s interaction history into sessions with a temporal window: Short Period Active User (SPAU), Middle Period Active User (MPAU) and Long Period Active User (LPAU). The three kinds of users are shown in Fig. 1 using Movielens dataset. We observe that the active periods of most SPAU (line one) are very short and their interaction behavior is concentrated in 1 to 2 sessions while other sessions have little interactions. The active periods of MPAU (line two) are distributed into 3 to 6 sessions if we divide the user’s interaction history into 9 sessions with temporal size of 120 days. The difference of movie number for each temporal domain is bigger and each user’s life cycle is different from each other. Finally, the active periods of LPAU (line three) are distributed across the whole history (the number of sessions are greater than 7) which is longer and the number of watched movies in each session is relatively stable. In reality, the fraction of LPAU is very small and most users are SPAU and MPAU. Specifically, SPAU, MPAU and LPAU are 12,329, 1,525 and 257 in Movielens; 34,938, 1,407 and 313 in Eachmovie respectively. The above analysis shows that users’ interaction information is extremely sparse in time dimension because most users are active only in a few temporal domains. It is thus important to flexibly segment a user’s interaction history with variable time window size to better model the user’s dynamic preference. Of course, the category of a user belonged to will be varied with the change of user interaction behavior and regularly update the results of dynamic clustering offline. The classification of a user can be changed from SPAU to MPAU or even LPAU if the user’s interactive behavior gradually increase. On the contrary, it is possible that the category of a user will be changed from LPAU to MPAU or even SPAU if the user has no new interaction. In the next subsection, we elaborate our dynamic clustering based method.
We ignore the temporal domains without interaction information when we model a user’s preference. Specifically, in our problem, it is not trivial to derive the answer for how many sessions for each user are appropriate. Therefore, in our method, we employ dynamic clustering to find the best segmentation for each user, in which the optimal number of cluster is not simply predefined, but is determined by computing item density and dynamic merging items when the size of the smallest temporal domain is given. First, we set the smallest temporal domain with the size of
[h]
Temporal domain division based on dynamic clustering Input:the interaction history for all users
User profile generation based on temporal domain division
After segmenting a user’s interaction history, in this section, we introduce the user profile generation based on probabilistic topic distribution by using LDA and the time-weighting scheme. Our first goal is to segment a user’s interaction history into multiple time frames based on the results of dynamic clustering, and then build the user profile using a deceasing time weight framework. Such a user profile enable us to capture the user’s dynamic preference, which can help us to predict the probabilistic topic distribution in the next temporal domain for recommendation.
The probabilistic topic distribution for each user by using LDA
Latent Dirichlet Allocation (LDA) is one of the most popular probabilistic topic models, which has been widely used in different contexts to uncover the hidden structure in large text corpora. While in recommender system a user’s preference is influenced by some latent factors. In our proposed method, we employ LDA as the language model to detect the probabilistic topic distribution for each user which represents the latent factors influencing the user’s preference. The general framework has a natural interpretation when dealing with users’ preference data: the set of users define the corpus, each user is considered as a document, the items purchased are considered as words, the ratings are considered as the appeared frequency and finally the topics correspond to the latent factors that influence users’ behavior with respect to items, i.e., users’ preference.
Formally, we define a set of users as
where
Here,
Since the latest stage represents a user’s current preference, the probabilistic topic distribution of this stage should carry more importance than the older ones to recommend new items for user. We thus define a monotonic decreasing function
where
Once we obtain the user profile, we can easily derive a user’s recent preference, and then make recommendation. In this section, we implement our dynamic recommendation model for two major collaborative filtering tasks, i.e., top-N recommendation and rating prediction.
Top-N recommendation
Based on the topic distribution in each stage, we can predict the probabilistic topic distribution of
By considering the probabilistic topic distribution
The item probabilistic distribution on topic
For a given user, our method automatically calculates the cosine similarity between the probabilistic topic distribution
We can obtain the probabilistic topic distribution of all users in the next session and then calculate the cosine similarity between all users. So we can easily find the maximum correlation similar user set for a given user. We can then predict user
The process of temporal recommendation based on dynamic clustering consists of three steps. The first step is to find the optimal cluster number and interaction history segmentation for each user by using dynamic clustering. Then, we obtain the probabilistic distribution by using LDA for each user in each stage. At last, we get the items recommendation results. Supposed that there are
Experimental evaluation
In this section, we conduct comprehensive experimental evaluation to show the efficacy and effectiveness of our proposed method. In Section 6.1, we introduce the experimental settings including the datasets used in the experiments, the existing approaches that are compared with our model and the metrics for performance comparison. In Section 6.2, we present the experimental results including key parameters finding, model validation and comparison studies.
Experimental settings
Datasets
We use the MovieLens dataset and Eachmovie dataset to evaluate all recommendation models. For both datasets, we select users with more than 20 ratings and movies with more than 5 ratings for experiment, That is the usual way in recommendation. The MovieLens dataset consists of 2,254,501 ratings provided by 14,111 users on 9,180 movies, whose ratings are from 0 to 5 with the interval 0.5. The Eachmovie dataset records 2,580,245 ratings from 36,658 users to 1,613 movies, whose ratings are from 0 to 1 with the interval 0.2. Each entry contains a user and movie ID, the rating and a timestamp indicating when the movie was watched. The biggest problem of them is that it is unreasonable to rate many items by some users in a very short period, which does not affect the subsequent recommendation process. Therefore, all ratings were simply standardized into 1 to 10 in the data preprocessing stage to the consistency of results.
Comparison
We evaluate the proposed method by comparing with other state-of-the-art temporal recommendation algorithms:
TItemKNN [35]: A time-weighted item-based collaborative filtering method that reduces the influence of old data when predicting users’ future behavior. TUserKNN: A time-aware user-based collaborative filtering that reduces the influence of nearest neighbours who rated the target item in the early time when predicting users’ future behavior. TimeSvd++ [21]: A matrix factorization model that incorporates temporal dynamics, where user bias, item bias and user latent factors are modeled as functions of time. LDA-based with the equal temporal window size (E-LDA) [23, 25]: This approach divides users’ interaction history with fixed sized temporal domains and obtains the probabilistic topic distribution for a user by using LDA after filling sparse data with matrix generation algorithm.
Now, a recommendation performance can be evaluated from many perspectives, such as accuracy, diversity, novelty and coverage. Accuracy is the basic criteria for evaluating recommender systems, which measures the degree of users like to recommended items. It cab be classified into prediction accuracy, prediction ratings correlation, classification accuracy and ranking accuracy. Based on the recommendation process of our method, we use two widely used metrics, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to measure the performance of rating prediction. Top-N recommendation is measured using precision (denoted by
Selection of the size of the smallest temporal domain
Accuracy and computational complexity are the two criteria when we decided the temporal domain size
Impact of temporal domain size.
We observe that the MAE of Movielens is basically decreasing with the increase of the temporal domain size. For Eachmovie, the MAE also decreases gradually but when the temporal window size is roughly 140, the MAE starts increasing.
Impact of the temporal domain size in the maximum session number.
The maximum session number that is determined according to
Figure 4 illustrates the number of users whose interaction histories are divided into the same number of sessions after dynamic clustering with different
Number of users with the same number of sessions.
Impact of the topic number.
Since the topic is latent variable in LDA, the optimal number of topics
Model validation: Top-N recommendation
To validate the performance of our model in terms of top-N recommendation, we choose the recall and precision as our criteria and recommend items (top@10, top@20, top@30) to a set of randomly selected users (100 users) by using the top-N recommendation model presented in Section 5.1. We plot the averaged precision and recall over 5 times running in Fig. 6. We can see that both precision and recall are improved with the increase of the recommended size. Moreover, the performance distributions are more compact, demonstrating the effectiveness and stability of our method.
Precision-recall plot for different datasets.
In this experiment, we choose MAE and RMSE as criteria to measure the accuracy of rating prediction of the method presented in Section 5.2. In the neighborhood-based algorithms, the number of similar neighbors
Performance with different nearest neighbor size.
Performance comparison with different 
Comparison of top-N recommendation results
Rating prediction performance comparison.
To evaluate the effectiveness of our method (called D-LDA in the experiments), we compare our method with several representative baselines (see Section 6.1.2).
We first compare our D-LDA with another session-based method E-LDA. Remind that
The results demonstrate that D-LDA obviously outperforms E-LDA in the same degree of data sparsity, which proves that our approach D-LDA can better handle data sparsity issue.
Table 1 summarizes the performance of all methods when different datasets are used. For each approach, we compute precision (P), recall (R) and F-score (F) to measure the performance of top-
We observe that LDA-based methods outperform KNN-based methods in terms of precision, recall and F-score. TItemKNN and TUserKNN perform similarly. It is evident that users’ profile can be better captured by using LDA. This is because LDA represents or characterizes users’ interest using latent factors which deeply reflect and model interaction relationships between users and items, but KNN-based methods only rely on rating information, which is less informative. The performance of D-LDA is superior to that of E-LDA, which is straightforward since we use the dynamic window size, which alleviates the data sparsity issue and avoids the error caused by data imputation. TimeSvd++ generally outperforms KNN-based methods because similar to LDA-based methods, TimeSvd++ exploits the ’latent structure’ in users’ feedback information, which better capture users’ preference. Although TimeSvd++ is consistently outperformed by our D-LDA, it is similar to E-LDA (with slight improvement), because in both models, users’ interaction histories is divided into sessions with fixed temporal window size.
We also show MAE and RMSE for rating prediction in Fig. 9. The results have similar trends with precision, recall and F-score. We observe that MAE and RMSE of E-LDA, D-LDA and TimeSvd++ are smaller than that of TItemKNN and TUserKNN. Our approach D-LDA further outperforms E-LDA and TimeSvd++. Although the results on different datasets slightly vary, the general tendency is consistent.
Conclusions
In this paper, we first provide an analysis of users’ behavior patterns in real-world movie recommendation systems to demonstrate that traditional session-based recommendation models that rely on fixed-sized time windows suffer from serious data sparsity. To handle this issue, we propose a novel method that automatically divides a user’s interaction history into sessions with variable sizes by using dynamic clustering. LDA is then used to model the user’s preference in different sessions to capture his dynamic preference. The proposed method is instantiated for two major collaborative filtering tasks, i.e., top-N recommendation and rating prediction. The experimental results on two real datasets show that our method can not only achieve better recommendation performance, but also tackle the sparsity problem.
As for future work, we intend to build the temporal association graph on users and items according to the relevance of their probabilistic topic distributions, and employ personalized random walk to adaptively select new items. In this way, our model can not only capture users’ preference drifting over time, but also increase the novelty of recommendation.
Footnotes
Acknowledgments
This work was supported by the National Natural Science Foundation of China (U1533104; 61301245), Tianjin Natural Science Foundation (No. 14JCZDJC32500), Hebei Province Natural Science Foundation (No. E2016202341), Research Project of Science and Technology for Hebei Province Higher Education Institutions (No. BJ2014013) and the Fundamental Research Funds for the Central Universities (ZXH2012P009).
