User and item profile expansion for dealing with cold start problem

Abstract

Recommender Systems (RSs) are expected to suggest suitable goods to consumers. Cold start is the most important challenge for RSs. Recent hybrid RSs combine Content-based Filtering (ConF) and Collaborative Filtering (ColF). We introduce an ontological hybrid RS in which the ontology is employed in the ConF part, while the ontology structure is improved by the ColF part. In this paper, a new hybrid approach is proposed based on the combination of demographic similarity and cosine similarity between users in order to solve the cold start problem of the new user type. In addition, a new approach is proposed based on the combination of ontological similarity and cosine similarity between items in order to solve the cold start problem of the new item type. The main idea of the proposed method is to expand user and item profiles using different strategies in order to build higher-performing profiles for users and items. The proposed method has been evaluated on a real dataset, and the experiments indicate that it performs better than state-of-the-art RS methods, especially in the cold start case.

Keywords

Recommender system; hybrid recommender system; ontology; profile expansion

1 Introduction

Recommender Systems (RSs), which predict user ratings for a set of items (books, movies, news, songs, etc.) [1, 2], are a subset of Information Filtering Systems (IFS). RSs help users find their favorite items among thousands of existing items. In fact, the reason behind the success of recommender systems on commercial websites is that they personalize recommendations to users [3]. Collaborative Filtering (ColF) is one of the most commonly used methods for implementing an RS [4]. The rating values assigned to the different items by a user are called the Items’ Rating Vector (IRV). The IRV is a vector whose size equals the number of items. Any item that is not rated by the user has a Null value in its corresponding element of the IRV. The similar users (those users whose IRV is similar to the IRV of the target user) are known as the target user’s neighbors. The target user is the user for whom the system is to produce recommendations. Therefore, a ColF RS offers the target user a list of the top N items among the items rated by the neighbors. ColF RSs are divided into memory-based and model-based approaches [5]. Memory-based ColF RSs directly use the user-item matrix to calculate predictions and generate recommendations for the target user [6]. These methods use statistical techniques such as Pearson’s similarity [7] and cosine similarity [8] to calculate the similarity between users and to find the target user’s neighbor set [9]. In model-based ColF, the strategy is to use machine learning techniques, such as clustering [10], Bayesian classifiers [11] and genetic algorithms [12], to create a model of the training data in the user-item matrix. In the test phase, this trained model can then be used to predict user ratings.

One of the major problems from which RSs suffer is data sparsity: because the data in the system are sparse, RSs are unable to identify popular items reliably and accurately [13]. This problem emerges especially when there are a great number of items and users in the system and the ratio of items rated by users is low (in other words, the user-item matrix is sparse). In such cases, finding meaningful nearest neighbors of a user becomes impossible. Another challenging problem for RSs is the possible presence of users with abnormal interests. Such users can mislead an RS into recommending unsuitable items to other users.

The user-item matrix is a matrix of size n × m, where n indicates the number of users and m indicates the number of items. The (i, j)th entry of this matrix is the rating that the ith user assigns to the jth item. The number of items rated by a user is typically less than 1 percent of all items in a system. Consequently, the user-item matrix is a sparse matrix in which less than 1 percent of the entries are filled. This problem can be solved to some extent in hybrid filtering RSs. Gone et al. propose a model in which the null entries in the user-item matrix are filled, as a preprocessing step, using a multi-layer perceptron neural network. They use an item-wise similarity method [14]. Another solution to the sparsity problem is dimension reduction [15]. For example, an SVD (Singular Value Decomposition) along with a PSO (Particle Swarm Optimization) has been applied for this purpose. It is worth mentioning that SVD has a high overhead. Therefore, in spite of its effective results, SVD alone cannot be used in online RSs; PSO is used there to speed up the SVD run time.

Another challenging problem from which RSs suffer is scalability. When there are several hundred thousand users, user-based ColF RSs are efficient and scalable. In real systems, however, there are millions of users, so these user-based ColF RSs are no longer scalable or useful. Item-based ColF RSs can be considered as an alternative.

One of the most important problems in RSs is Cold Start (CS) [16]. The CS problem occurs due to the low number of user-rated items, i.e. the sparsity of the users’ IRVs. The CS problem is divided into two categories: New User (NU) and New Item (NI) [17]. Our main focus in this article is on the CS problem of the NU type. The NU problem occurs when a new user has recently joined the system and has not rated any items so far, or when a user has been present in the system for some time but has been less active and has consequently rated only a few items.

So far, a lot of work has been done to solve the CS problem in RSs. The solutions to the CS problem are generally divided into two groups. The first group deals with it by improving traditional RSs so as to make them ready to handle the CS problem [18]. The second group deals with it by using hybrid RSs [19]. New users are those who have rated only a few items or no items at all; they therefore have an almost empty (or very small) profile. As a result, RSs are not able to recognize their preferences and cannot offer them appropriate items of interest. Hence, most researchers have used hybrid RSs to solve the NU problem [20]. Hybrid RSs are usually a combination of ColF with additional data sources.

A great deal of research has been done on the use of additional data sources, such as the demographic information of users [21], to solve the CS problem. As an example, it has been proposed to implement collaborative tagging in a collaborative filtering RS to learn the users’ interests and classify items based on those interests [22]. Another technique commonly used as a solution to the CS problem in RSs is clustering [41–46]. Clustering has been successfully used in many applications such as bioinformatics [47, 48], healthcare systems [49], optimization [50, 51], and domain adaptation [52, 53]. It is possible to cluster items, users, or both simultaneously. These techniques can speed up the execution of RSs.

A method for solving the CS problem that consists of three phases has been investigated [23]. In the first phase, the new user is placed in a specific group using classification algorithms. In the second phase, the similarity between the new user and the other users in the target group is calculated based on the demographic data of the users. In the third phase, the rating is predicted for the new user using the different groups created for users. A hybrid method for solving the CS problem has been presented that uses user-based content and social information [24]. The main idea of this method is to build content information based on the profile of the keywords associated with different items and to use this information to generate recommendations for new users. Different approaches to user profiling based on demographic information have been investigated by Al-Shamri [25]. These approaches cover various combinations of the types of features used, the ways the features are represented, and the user profiling mechanisms. Another method based on demographic information has been presented by Safoury and Salah [26] to solve the CS problem for new users. In this method, instead of using the ratings given to various items by cold users, their demographic information is used to generate recommendations. To this end, a framework has been proposed to evaluate demographic characteristics such as age, gender and occupation. Another demographic, user-driven RS also provides a solution to the CS problem [27]. In this approach, users are categorized based on their demographic data, and recommendations are then produced based on their demographic categories. Therefore, using the demographic information of users, the CS problem is largely solved.

In contrast to the above-mentioned methods, other methods do not use additional data sources to solve the CS problem of the NU type. They only try to alleviate the CS problem using the current state of the user rating profile. One framework tries to alleviate the CS problem based on both a local similarity criterion and a global similarity criterion between users [28]. In addition, various sparsity measures, such as the Overall Sparsity Measure, User Specific Sparsity Measure, User-Item Specific Sparsity Measure, and Unified Measure of Sparsity, have been used to address the data sparsity problem and the CS issue. In this method, in order to predict the rating of a specific item for a given user, the ratings generated based on the local and global similarities of users are combined, taking into account the uniform distribution of the user rating matrix. Ahn [29] has focused on the limitations of traditional similarity criteria such as the cosine and Pearson similarities, and has provided a new heuristic similarity criterion called Proximity–Impact–Popularity (PIP) to deal with the CS problem. This similarity criterion includes three factors: (I) the proximity factor (indicating the distance between two ratings), (II) the impact factor (indicating how strongly the users like or dislike the given item) and (III) the popularity factor (indicating the distance between the average rating of the two users for the given item and the average rating of all users for the given item).

In order to cope with the problem of choosing a weak neighbor set for a new user in a ColF RS, Formoso et al. propose to expand the profiles of new users based on the Item-Global, Item-Local, and User-Local techniques [30]. Inspired by [35], we consider the ontology as an additional guideline for expanding profiles, but we expand the profiles of both users and items. The Item-Global technique tries to find a collection of the items most similar to those already in the user profile and adds them to the user profile. The Item-Local technique consists of two steps. In the first step, the system recommends some items according to the user’s current profile, and the items with the highest scores are then added to the user profile. In the second step, the system generates item recommendations for the new user. The User-Local technique expands the profile based on the current neighbor set of the target user. In this technique, a subset of the items rated by the current neighbor set of the target user is selected based on strategies such as the Local Most-Rated Strategy (LMRS) and the Local User-Local Clustering Strategy (LULCS). It is worth mentioning that LULCS was proposed in [31]. Because there is no RS that incorporates solutions to both types of the CS problem, and no RS that uses both additional information and a new appropriate similarity measure to handle the CS problem, we propose a method that addresses these challenges. Therefore, in this paper, a new hybrid method based on the profile expansion technique is presented to solve the CS problem of both the NU and NI types. In the proposed method, in contrast to the previous methods [25], which use only the matrix of ratings given to the items by users, the demographic information of users is also used. In fact, in the proposed method, a combination of the cosine similarity based on the rating matrix and the demographic similarity based on the demographic information of users is used as the final similarity in the expansion of user profiles.

Algorithm 1. The general framework of the proposed method

Input:

Let U_i . DI denote the demographic information of the ith user.

Let R_ij denote the rate value the ith user gives to the jth movie. Let R_i: denote the rate values the ith user gives to the different movies; its transpose (i.e. $R_{i :}^{T}$ ) can be seen as the rate vector of the ith user. Let R_:j denote the rate values the different users give to the jth movie.

Let I_i . L denote the language of the ith movie.

Let I_i . D denote the director of the ith movie.

Let I_i . W denote the writer of the ith movie.

Let I_i . R denote the average rate the ith item has gotten.

Let I_i . Rt denote the runtime of the ith item.

Let I_i . Rd denote the release date of the ith item.

Let I_i . C denote the country of the ith item.

Let I_i . N denote the number of rates the ith item has gotten.

Let I_i . P denote the producer of the ith item.

Let I_i . G_j be an asymmetric Boolean variable indicating whether the ith item has the jth genre or not.

Let I_i . A_k denote whether the kth famous actor is in the actors’ list of the ith movie or not. A famous actor is one who features in at least five movies. I_i . A_k is an asymmetric Boolean variable.

Let l be the expansion size of the user profile or item profile.

Output:

R matrix

Body:

Z = Q = 25

N_LARS = N_GMRS = l

${US}_{ij}^{D} = Sim (U_{i} . DI, U_{j} . DI)$ where ${US}_{ij}^{D}$ is the users’ similarity matrix defined based on users’ demographic information.

${US}_{ij}^{R} = \frac{\sum_{k \in A_{ij}} R_{ik} R_{jk}}{\sqrt{\sum_{k \in A_{ij}} R_{ik}^{2} \sum_{k \in A_{ij}} R_{jk}^{2}}}$ where ${US}_{ij}^{R}$ is the users’ similarity matrix defined based on users’ ratings and A_ij ={ k|R_jk ≠ NaN ∧ R_ik ≠ NaN }

${IS}_{ij}^{R} = \frac{\sum_{k \in B_{ij}} R_{ki} R_{kj}}{\sqrt{\sum_{k \in B_{ij}} R_{ki}^{2} \sum_{k \in B_{ij}} R_{kj}^{2}}}$ where ${IS}_{ij}^{R}$ is the items similarity matrix defined based on users’ ratings and B_ij ={ k|R_ki ≠ NaN ∧ R_kj ≠ NaN }.

Let ${IS}_{ij}^{O}$ be the items’ similarity matrix defined based on items’ ontology information (features defined in lines 3-13 of input section).

Define ${US}_{ij} = θ_{u} (| A_{ij} |) \times {US}_{ij}^{D} + (1 - θ_{u} (| A_{ij} |)) \times {US}_{ij}^{R}$ .

Define ${IS}_{ij} = θ_{i} (| B_{ij} |) \times {IS}_{ij}^{O} + (1 - θ_{i} (| B_{ij} |)) \times {IS}_{ij}^{R}$ as the final items’ similarity matrix.

Use LARS or GMRS or LARS+GMRS to determine $R_{\ddot{i} \ddot{j}}$ and then expand user profile of the target user if the target user is NU

Use ILMRS to determine $R_{\ddot{i} \ddot{j}}$ and then expand the target item profile if the target item is a NI

If $\ddot{j}$ th item is a cold item or NI

Use ILMRS to determine $R_{\ddot{i} \ddot{j}}$ on the expanded profiles of items

Elseif $\ddot{i}$ th user is a cold user or NU

Estimate $R_{\ddot{i} \ddot{j}}$ using Equation 6 on the expanded profiles of users either by GMRS or LARS or LARS+GMRS

Else

Estimate $R_{\ddot{i} \ddot{j}}$ using ILMRS on the non-expanded profiles of items and with GMRS or LARS or LARS+GMRS on the non-expanded profiles of users and then report average of the values obtained by two methods
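The three-way branch that closes Algorithm 1 can be sketched as follows. This is only an illustrative outline: `estimate_rating`, `user_based` and `item_based` are hypothetical names, and the two callables stand in for the user-side predictor (GMRS, LARS or LARS+GMRS) and the item-side predictor (ILMRS), each taking a flag that says whether expanded profiles should be used.

```python
def estimate_rating(user_is_cold, item_is_cold, user_based, item_based):
    """Dispatch between item-side and user-side predictors as in Algorithm 1.

    user_based(expanded) and item_based(expanded) are hypothetical callables
    returning a rating estimate; `expanded` selects expanded profiles.
    """
    if item_is_cold:
        # Cold or new item: item-side prediction on expanded item profiles.
        return item_based(True)
    if user_is_cold:
        # Cold or new user: user-side prediction on expanded user profiles.
        return user_based(True)
    # Neither is cold: average both predictors on non-expanded profiles.
    return (item_based(False) + user_based(False)) / 2.0


# Toy predictors that return fixed values, for illustration only.
item_side = lambda expanded: 4.0 if expanded else 3.0
user_side = lambda expanded: 2.0 if expanded else 1.0
print(estimate_rating(False, False, user_side, item_side))
```

Note that, following the order of the branches in the listing, the item-side branch takes precedence when both the user and the item are cold.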

2 Related works

According to the literature, in collaborative filtering an RS scores the items based on the ratings of other users and then recommends the items with the highest scores to the current user. Therefore, when the system starts and there are no item ratings yet, the RS faces the challenge of predicting high-quality scores for items. This problem is called cold start. More generally, when a new item is introduced to the system, the RS faces item cold start and that item is an NI; when a new user is introduced to the system, the RS faces user cold start and that user is an NU [32]. NUs are generally those who have rated fewer than 5 items [33]. To address the CS problem, researchers use two approaches: (a) using supplementary information about users or items to assist the RS, and (b) using more appropriate similarity measures so as to benefit maximally from the prior information in the system. Shaw et al. propose to use association rules as a source of information to expand the profile of a user in order to handle the cold start problem of the NU type [34]. Liu et al. propose a model in which the underlying user behavioral models can be obtained through the invisible interests of the users. They propose an RS that learns to use this information, i.e. the information about the invisible interests of the users [35]. Considering all related works, no study uses both user profile expansion and item profile expansion. This may be due to the need for auxiliary information about both users and items. This paper addresses this issue and consequently handles the cold start problem of both the NU and NI types.

3 Proposed RS

In this section, a new method is proposed to solve the CS problem in RSs. In the proposed system, the recommendation process consists of two phases. In the first phase, the target user profile and the target item profile are expanded. For this purpose, the combination of cosine similarity and demographic similarity is used as the final combined similarity for selecting the nearest neighbors of the target user in order to expand its profile. Therefore, in the proposed method, in addition to the ratings given to various items by users, their demographic information is also used as additional information to solve the CS problem. After the user profiles and the item profiles are expanded, in the second phase the target items are rated and ranked for the target user based on its expanded profile if the target user is an NU; otherwise, if the target item is an NI, the target item is rated based on the expanded profile of the item. In the following subsections, each phase is described in detail. Figure 1 shows the scheme of the proposed method, and Algorithm 1 shows its pseudocode.

Fig. 1

The scheme of the proposed method.

3.1 Local averaged rating strategy

At this point, a hybrid approach is developed based on the cosine and demographic similarities between users in order to expand user profiles. The information in the user rating matrix is used to calculate the cosine similarity. The demographic information of users (including age, gender, and occupation) is used to calculate the demographic similarity among users. For each evaluated pair of users, all items rated by both of them are selected, and the cosine similarity criterion is then used to calculate the similarity between those users, as presented in Equation 1.

${US}_{ij}^{R} = \frac{\sum_{k \in A_{ij}} R_{ik} R_{jk}}{\sqrt{\sum_{k \in A_{ij}} R_{ik}^{2}} \sqrt{\sum_{k \in A_{ij}} R_{jk}^{2}}}$ (1)

In Equation 1, ${US}_{ij}^{R}$ is the cosine similarity between the ith and jth users. The minimum and maximum values of ${US}_{ij}^{R}$ are 0 and 1 respectively. R_ij denotes the rate value the ith user gives to the jth movie. A_ij is the set of indices of those items which are rated by both the ith and jth users and is defined as A_ij ={ k|R_jk ≠ NaN ∧ R_ik ≠ NaN }, where NaN denotes an unknown rating.
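As an illustration of Equation 1, the following minimal sketch computes the cosine similarity restricted to the co-rated set A_ij. The function name `cosine_on_corated`, the use of `None` for an unknown rating, and the choice to return `None` when A_ij is empty are assumptions made for this example only.

```python
import math


def cosine_on_corated(r_i, r_j):
    """Cosine similarity over items rated by both users (None = unrated)."""
    # A_ij: indices of the items rated by both users.
    common = [k for k, (a, b) in enumerate(zip(r_i, r_j))
              if a is not None and b is not None]
    if not common:
        # Assumption: the similarity is treated as undefined without co-rated items.
        return None
    dot = sum(r_i[k] * r_j[k] for k in common)
    norm_i = math.sqrt(sum(r_i[k] ** 2 for k in common))
    norm_j = math.sqrt(sum(r_j[k] ** 2 for k in common))
    # Ratings are assumed positive (e.g. a 1-5 scale), so the norms are non-zero.
    return dot / (norm_i * norm_j)


u1 = [5, 3, None, 1]
u2 = [4, None, 2, 1]
print(cosine_on_corated(u1, u2))  # only items 0 and 3 are co-rated
```

Because only co-rated items enter both the numerator and the denominator, two users with a single shared item always obtain a similarity of 1, which is one reason the rating-based similarity is combined with a second signal.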

The similarity of users is also calculated based on demographic data (i.e. age, gender, and occupation). Demographic similarities are calculated between two users according to Equation 2. In this regard, if the value of a particular nominal attribute is the same for two users, the value of the similarity of this particular attribute is equal to one for both users; otherwise, it is equal to zero for them. For example, if both users have the same gender, the value is one, otherwise it is zero.

${US}_{ij}^{D} = Sim (U_{i} . DI, U_{j} . DI) = \frac{\sum_{k = 1}^{d} w_{k} \times π (U_{i} . f_{k}, U_{j} . f_{k})}{\sum_{k = 1}^{d} π (U_{i} . f_{k}, U_{j} . f_{k})}$ (2)

In Equation 2, ${US}_{ij}^{D}$ is the demographic similarity between the ith and jth users. The minimum and maximum values of ${US}_{ij}^{D}$ are 0 and 1 respectively. U_i . f_k is the kth demographic feature of the ith user. The total number of demographic features is denoted by d. w_k stands for the weight of the kth demographic feature in calculating the demographic similarity. Also, π (A, B) is a function measuring the similarity between A and B and is defined in Equation 3.

$π (U_{i} . f_{k}, U_{j} . f_{k}) = \begin{cases} 0 & (f_{k} \text{ is nominal}) \land (U_{i} . f_{k} \neq U_{j} . f_{k}) \\ 1 & (f_{k} \text{ is nominal}) \land (U_{i} . f_{k} = U_{j} . f_{k}) \\ \frac{U_{i} . f_{k} - U_{j} . f_{k}}{\max_{i_{1}} U_{i_{1}} . f_{k} - \min_{i_{1}} U_{i_{1}} . f_{k}} & f_{k} \text{ is numeric} \end{cases}$ (3)
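A weighted per-feature demographic similarity in the spirit of Equations 2 and 3 can be sketched as below. Two points are assumptions of this sketch rather than the printed formulas: the numeric case is taken as one minus the normalized absolute difference, so that it behaves as a similarity in [0, 1], and the weighted sum is normalized by the sum of the weights. The names `pi_user`, `demographic_similarity`, `schema` and the sample users are hypothetical.

```python
def pi_user(a, b, kind, value_range=None):
    """Per-feature similarity for a 'nominal' or 'numeric' feature."""
    if kind == "nominal":
        return 1.0 if a == b else 0.0
    # ASSUMPTION: 1 - |a - b| / (max - min), not the raw normalized difference.
    return 1.0 - abs(a - b) / value_range


def demographic_similarity(u, v, schema, weights):
    """Weighted average of per-feature similarities.

    schema is a list of (feature_name, kind, value_range) tuples.
    ASSUMPTION: the result is normalized by the sum of the weights.
    """
    total = sum(w * pi_user(u[name], v[name], kind, rng)
                for (name, kind, rng), w in zip(schema, weights))
    return total / sum(weights)


schema = [("age", "numeric", 50.0),
          ("gender", "nominal", None),
          ("occupation", "nominal", None)]
weights = [1.0, 1.0, 2.0]
alice = {"age": 25, "gender": "F", "occupation": "engineer"}
bob = {"age": 35, "gender": "M", "occupation": "engineer"}
print(demographic_similarity(alice, bob, schema, weights))
```

With these toy values the age term contributes 0.8, the gender term 0 and the occupation term 1 with weight 2, so the weighted average is 2.8 / 4.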

After calculating cosine and demographic similarities among users, Equation 4 is used to calculate the final similarity between users.

${US}_{ij} = θ_{u} (| A_{ij} |) \times {US}_{ij}^{D} + (1 - θ_{u} (| A_{ij} |)) \times {US}_{ij}^{R}$ (4)

where θ_u (|A_ij|) is a parameter determining how much the final similarity between the users depends on each of the cosine and demographic similarities. The value θ_u (|A_ij|) = 1 indicates a complete dependence of the final similarity on the demographic similarity of the users, and the value θ_u (|A_ij|) = 0 indicates a complete dependence of the final similarity on the cosine similarity of the users. From this relationship, it can be concluded that even if two users do not have any co-rated items, their similarity can still be calculated on the basis of the demographic similarity criterion. We define θ_u (n) based on Equation 5.

$θ_{u} (n) = \begin{cases} \min (\frac{n}{n + 10}, 0.2) & n < 4 \\ \min (\frac{n}{n + 100} + 0.2, 1.0) & n ⩾ 4 \end{cases}$ (5)
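To make the behavior of the weighting function concrete, the sketch below implements θ_u (n) exactly as printed in Equation 5 and uses it to blend the two similarities as in Equation 4. The function names and the sample similarity values are illustrative assumptions.

```python
def theta_u(n):
    """Weight of the demographic similarity for n co-rated items (Equation 5)."""
    if n < 4:
        return min(n / (n + 10), 0.2)
    return min(n / (n + 100) + 0.2, 1.0)


def combine_user_similarity(sim_demo, sim_rating, n_corated):
    """Blend demographic and rating-based similarities (Equation 4)."""
    t = theta_u(n_corated)
    return t * sim_demo + (1.0 - t) * sim_rating


for n in (0, 1, 3, 4, 20, 1000):
    print(n, theta_u(n))

print(combine_user_similarity(0.7, 0.9, 20))
```

Tabulating a few values of n in this way is a quick check of how the weight moves between the two similarity sources as the number of co-rated items grows.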

After calculating the final similarity between users, Z users who have the most similarity to the target user are selected as the set of closest target user neighbors. The predicted rate value which the ith user (the target user) gives to the jth item is estimated according to Equation 6.

${\hat{R}}_{ij} = \frac{\sum_{k \in {NN}_{i} (Z)} {US}_{ik} \times R_{kj}}{\sum_{k \in {NN}_{i} (Z)} {US}_{ik}}$ (6)

${\hat{R}}_{ij}$ is the predicted rate value which the ith user (the target user) gives to the jth item, and NN_i (Z) indicates the indices of the Z nearest neighbors of the target user, i.e. the ith user. Then, a certain number of items highly rated by the users closest to the target user (i.e. its neighbors) are added to the profile based on Equation 6. We name this the Local Averaged Rating Strategy (LARS). In this strategy, N_LARS (= Z) different items are selected to expand the target user profile. For this purpose, the items that have the highest predicted ratings according to the Z neighbors of the target user are selected: the existing items are sorted in descending order of the predicted rating they received, and the first N_LARS items of the list are selected to expand the target user profile.
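The prediction of Equation 6 and the LARS selection step can be sketched as follows. This is a minimal illustration under stated assumptions: ratings are stored as nested dictionaries, neighbors who have not rated an item are skipped in both sums (Equation 6 does not say how unknown ratings are handled), and ties are broken by identifier. All function and variable names are hypothetical.

```python
def top_neighbors(sims, z):
    """Ids of the z users most similar to the target user."""
    return sorted(sims, key=lambda k: (-sims[k], k))[:z]


def predict_rating(sims, ratings, item, z):
    """Similarity-weighted average of the neighbors' ratings (Equation 6)."""
    # ASSUMPTION: neighbors without a rating for the item are skipped.
    rated = [k for k in top_neighbors(sims, z) if item in ratings[k]]
    denom = sum(sims[k] for k in rated)
    if denom == 0:
        return None
    return sum(sims[k] * ratings[k][item] for k in rated) / denom


def lars_expand(sims, ratings, known_items, z, n_lars):
    """Pick the n_lars unseen items with the highest predicted rating."""
    known = set(known_items)
    candidates = {it for k in top_neighbors(sims, z) for it in ratings[k]} - known
    preds = {it: predict_rating(sims, ratings, it, z) for it in candidates}
    preds = {it: p for it, p in preds.items() if p is not None}
    ranked = sorted(preds, key=lambda it: (-preds[it], it))
    return [(it, preds[it]) for it in ranked[:n_lars]]


sims = {"a": 0.9, "b": 0.6, "c": 0.1}
ratings = {"a": {"m1": 5, "m2": 3}, "b": {"m1": 4, "m3": 2}, "c": {"m2": 1}}
expanded = lars_expand(sims, ratings, known_items=[], z=2, n_lars=2)
print(expanded)
```

In the toy data, users a and b are the two nearest neighbors, so item m1 receives the prediction (0.9 × 5 + 0.6 × 4) / 1.5 and heads the expansion list.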

3.2 Global most rated strategy

Global Most Rated Strategy (GMRS) is a strategy where N_GMRS different items are selected to expand the target user profile [25]. For this purpose, the items that have the highest number of ratings by all the Z nearest neighbors of the target user are selected. Therefore, in this strategy, the existing items are sorted in descending order based on the average number of ratings they received by the Z nearest neighbors of the target user, and the N_GMRS items from the beginning of the list are selected to expand the target user profile.
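The GMRS selection described above can be sketched as a simple counting step. The data layout (a dictionary of per-user rating dictionaries), the alphabetical tie-breaking rule, and the name `gmrs_expand` are assumptions of this example; the source does not specify how ties are resolved.

```python
from collections import Counter


def gmrs_expand(neighbor_ids, ratings, known_items, n_gmrs):
    """Select the n_gmrs unseen items rated by the most neighbors."""
    known = set(known_items)
    counts = Counter(item
                     for k in neighbor_ids
                     for item in ratings[k]
                     if item not in known)
    # Sort by descending number of ratings, then by item id for determinism.
    ranked = sorted(counts, key=lambda it: (-counts[it], it))
    return ranked[:n_gmrs]


ratings = {"a": {"m1": 5, "m2": 3},
           "b": {"m1": 4, "m3": 2},
           "c": {"m1": 2, "m2": 1}}
print(gmrs_expand(["a", "b", "c"], ratings, known_items=[], n_gmrs=2))
```

Unlike the rating-based selection, this strategy ignores the rating values and uses only how many neighbors rated each item.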

3.3 Item-wise local most rated strategy

In this strategy, the similarities between the target item and the other items in the system are calculated. Then, some of the most similar items are selected for predicting the rating of the target item. A hybrid approach is developed based on the cosine and ontological similarities between items. The information in the user rating matrix is used to calculate the cosine similarity between items. Also, the auxiliary (ontological) information of items is employed to calculate a new (ontological) similarity among items. In this method, for each evaluated pair of items, all users who rated both of them are selected, and the cosine similarity criterion is then used to calculate the similarity between those items, as presented in Equation 7.

${IS}_{ij}^{R} = \frac{\sum_{k \in B_{ij}} R_{ki} R_{kj}}{\sqrt{\sum_{k \in B_{ij}} R_{ki}^{2} \sum_{k \in B_{ij}} R_{kj}^{2}}}$ (7)

In Equation 7, ${IS}_{ij}^{R}$ is the cosine similarity between the ith and jth items. The minimum and maximum values of ${IS}_{ij}^{R}$ are 0 and 1 respectively. B_ij is the set of indices of those users who give a rating to both the ith and jth items and is defined as B_ij ={ k|R_ki ≠ NaN ∧ R_kj ≠ NaN }, where NaN denotes an unknown rating.

The similarity of items is also calculated based on auxiliary data. Let I_i . L denote the language of the ith movie. Let I_i . D denote the director of the ith movie. Let I_i . W denote the writer of the ith movie. Let I_i . R denote the average rate the ith item has gotten. Let I_i . Rt denote the runtime of the ith item. Let I_i . Rd denote the release date of the ith item. Let I_i . C denote the country of the ith item. Let I_i . N denote the number of rates the ith item has gotten. Let I_i . P denote the producer of the ith item. Let I_i . G_j be an asymmetric Boolean variable indicating whether the ith item has the jth genre or not. Let I_i . A_k denote whether the kth famous actor is in the actors’ list of the ith movie or not. A famous actor is one who features in at least five movies. I_i . A_k is an asymmetric Boolean variable. Let ${IS}_{ij}^{O}$ be the items’ similarity matrix defined based on the items’ auxiliary (ontology) information (the features defined in this paragraph). Ontological similarities are calculated between all pairs of items according to Equation 8.

${IS}_{ij}^{O} = Sim (I_{i} . O, I_{j} . O) = \frac{\sum_{k = 1}^{o} ω_{k} \times π (I_{i} . f_{k}, I_{j} . f_{k})}{\sum_{k = 1}^{o} π (I_{i} . f_{k}, I_{j} . f_{k})}$ (8)

In Equation 8, ${IS}_{ij}^{O}$ is the new (ontological) similarity between the ith and jth items. The minimum and maximum values of ${IS}_{ij}^{O}$ are 0 and 1 respectively. The total number of auxiliary features is denoted by o. ω_k stands for the weight of the kth auxiliary feature in calculating the new (ontological) similarity. Also, π (A, B) is a function measuring the similarity between A and B; it is defined analogously to Equation 3 and is given in Equation 9.

$π (I_{i} . f_{k}, I_{j} . f_{k}) = \begin{cases} 1 & (f_{k} \text{ is asymmetric Boolean}) \land (I_{i} . f_{k} = I_{j} . f_{k} = 1) \\ 0 & f_{k} \text{ is asymmetric Boolean (otherwise)} \\ 0 & (f_{k} \text{ is nominal}) \land (I_{i} . f_{k} \neq I_{j} . f_{k}) \\ 1 & (f_{k} \text{ is nominal}) \land (I_{i} . f_{k} = I_{j} . f_{k}) \\ \frac{I_{i} . f_{k} - I_{j} . f_{k}}{\max_{i_{1}} I_{i_{1}} . f_{k} - \min_{i_{1}} I_{i_{1}} . f_{k}} & f_{k} \text{ is numeric} \end{cases}$ (9)
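The case analysis of Equation 9 can be illustrated with a small per-feature function. The asymmetric Boolean and nominal cases follow the equation directly. For the numeric case, this sketch assumes one minus the normalized absolute difference so that the value behaves as a similarity in [0, 1], which differs from the raw normalized difference printed in the equation. The name `pi_item` and the string tags for the feature kinds are hypothetical.

```python
def pi_item(a, b, kind, value_range=None):
    """Per-feature item similarity for one auxiliary (ontology) feature."""
    if kind == "asym_bool":
        # Only a shared presence (1, 1) counts as a match; (0, 0) does not.
        return 1.0 if (a == 1 and b == 1) else 0.0
    if kind == "nominal":
        return 1.0 if a == b else 0.0
    if kind == "numeric":
        # ASSUMPTION: 1 - |a - b| / (max - min) instead of the raw difference.
        return 1.0 - abs(a - b) / value_range
    raise ValueError("unknown feature kind: " + kind)


print(pi_item(1, 1, "asym_bool"))   # both movies have the genre
print(pi_item(0, 0, "asym_bool"))   # shared absence is not a match
print(pi_item("en", "en", "nominal"))
print(pi_item(90, 120, "numeric", value_range=120))
```

Treating genre and actor indicators as asymmetric means that two movies are not considered similar merely because they both lack a genre or an actor.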

After calculating cosine and new (ontological) similarities among items, Equation 10 is used to calculate the final similarity between items.

${IS}_{ij} = θ_{i} (| B_{ij} |) \times {IS}_{ij}^{O} + (1 - θ_{i} (| B_{ij} |)) \times {IS}_{ij}^{R}$ (10)

where θ_i (|B_ij|), like θ_u (|A_ij|), is a parameter determining how much the final similarity between the items depends on each of the cosine and new (ontological) similarities. θ_i (|B_ij|) is defined based on Equation 11.

$θ_{i} (n) = min (\frac{n}{n + 1000}, 0.9)$ (11)
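A short sketch of Equations 10 and 11 is given below: θ_i (n) is implemented as printed and then used to blend the ontological and rating-based item similarities. The function names and the sample values are illustrative assumptions.

```python
def theta_i(n):
    """Weight of the ontological similarity for n co-rating users (Equation 11)."""
    return min(n / (n + 1000), 0.9)


def combine_item_similarity(sim_onto, sim_rating, n_corating_users):
    """Blend ontological and rating-based item similarities (Equation 10)."""
    t = theta_i(n_corating_users)
    return t * sim_onto + (1.0 - t) * sim_rating


for n in (0, 10, 1000, 9000, 100000):
    print(n, theta_i(n))

print(combine_item_similarity(0.8, 0.4, 1000))
```

The cap of 0.9 means that the rating-based similarity always keeps at least a 10 percent share in the final item similarity, however many users co-rated the two items.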

After calculating the final similarity between items, the Q items that are most similar to the target item are selected as the set of nearest neighbors of the target item and are used for the rating prediction. The predicted rate value which the ith user gives to the jth item (the target item) is estimated according to Equation 12.

${\hat{R}}_{ij} = \frac{\sum_{k \in {NN}_{u} (Q)} {IS}_{jk} \times R_{ik}}{\sum_{k \in {NN}_{u} (Q)} {IS}_{jk}}$ (12)

${\hat{R}}_{ij}$ is the predicted rate value which the ith user gives to the jth item (the target item), and NN_u (Q) indicates the set of indices of the Q nearest neighbors of the target item. We name this strategy the Item-Wise Local Most Rated Strategy (ILMRS). In this strategy, N_ILMRS (= Q) different items are selected for the rating prediction. For this purpose, any item that is among the N_ILMRS items most similar to the target item according to IS_ij is added to the item profile.
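The item-side prediction of Equation 12 can be sketched in the same style as the user-side one. The sketch assumes that neighbor items the user has not rated are skipped in both sums, since the equation does not state how unknown ratings are treated, and it returns `None` when no neighbor item has been rated. The names `predict_item_based`, `item_sims` and `user_ratings` are hypothetical.

```python
def predict_item_based(item_sims, user_ratings, q):
    """Similarity-weighted average over the q items nearest to the target item.

    item_sims:    {other_item_id: final similarity IS_jk to the target item}
    user_ratings: {item_id: rating given by the user}
    """
    neighbors = sorted(item_sims, key=lambda k: (-item_sims[k], k))[:q]
    # ASSUMPTION: neighbor items the user has not rated are skipped.
    rated = [k for k in neighbors if k in user_ratings]
    denom = sum(item_sims[k] for k in rated)
    if denom == 0:
        return None
    return sum(item_sims[k] * user_ratings[k] for k in rated) / denom


item_sims = {"m1": 0.8, "m2": 0.5, "m3": 0.2}
user_ratings = {"m1": 4, "m3": 5}
print(predict_item_based(item_sims, user_ratings, q=2))
print(predict_item_based(item_sims, user_ratings, q=3))
```

With q = 2 only m1 contributes, while with q = 3 the prediction becomes (0.8 × 4 + 0.2 × 5) / 1.0, which shows how the neighborhood size changes the estimate.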

4 Experimental study

4.1 Benchmark datasets

In order to evaluate the proposed method, the MovieLens 1M dataset, which can be downloaded from the GroupLens website, has been used. The collection consists of 1,000,000 ratings given by 6,040 users to 3,950 movies. The ratings are on a 5-point numerical scale: a rating of 1 indicates very low interest, 2 indicates low interest, 3 indicates average interest, 4 indicates high interest, and 5 indicates very high interest of the user. In this dataset, each user has rated at least 20 items. A tenth of the movies, i.e. 395 movies, are used as cold items. A cold item is one which has at most 20 ratings. We randomly keep 0 to 20 real ratings for each of these cold items in the dataset. We name this the ICS (Item Cold Start) dataset.

A tenth of the users, i.e. 604 users, are used as cold users. We name this the UCS (User Cold Start) dataset. In the dataset used to evaluate the proposed method, each user has rated at least 20 items. Therefore, in order to evaluate the proposed method in cold start conditions (i.e. on the UCS dataset), a number of items are randomly selected from the items rated by each cold user to form that user’s initial profile. The number of selected items is less than the total number of items rated by the user. For each user, the number of initial items in the profile takes different values up to 10.
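One way to simulate such a cold user is sketched below: a fixed number of the user's ratings is kept as the visible profile and the rest is set aside. Treating the set-aside ratings as that user's test ratings is an assumption of this sketch, as are the function name `make_cold_profile`, the fixed seed and the toy data.

```python
import random


def make_cold_profile(user_ratings, n_initial, rng):
    """Split a user's ratings into a small visible profile and a held-out part."""
    items = sorted(user_ratings)  # sorted for reproducibility with a seeded rng
    kept = set(rng.sample(items, n_initial))
    visible = {it: r for it, r in user_ratings.items() if it in kept}
    # ASSUMPTION: the remaining ratings are held out for evaluation.
    held_out = {it: r for it, r in user_ratings.items() if it not in kept}
    return visible, held_out


rng = random.Random(0)
full_profile = {"m%d" % i: (i % 5) + 1 for i in range(20)}
visible, held_out = make_cold_profile(full_profile, 5, rng)
print(len(visible), len(held_out))
```

Repeating the split with different seeds corresponds to producing several independent random versions of the cold start data.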

A tenth of the ratings are also used as a general test set in a different testbed. We name this the TN (Traditional Non-cold start) dataset. Note that ICS (or UCS or TN) is randomly produced 30 times; consequently, we conduct 30 independent experiments on ICS (respectively on UCS or TN) for each method, and the result averaged over these 30 independent experiments is taken as the performance of the method.

4.2 Evaluation criteria

To evaluate the proposed method, the Mean Absolute Error (MAE) criterion is calculated using Equation 13.

$MAE = \frac{\sum_{i} \sum_{j} | R_{ij} - {\hat{R}}_{ij} |}{TestSize}$ (13)

In Equation 13, R_ij is the actual rating of the ith user to the jth item, ${\hat{R}}_{ij}$ is the predicted rating of the ith user to the jth item, and TestSize is the total number of predicted ratings. Also, the Rating Correction (RC) criterion is calculated using Equation 14.

$RC = \frac{\sum_{i} \sum_{j} δ (R_{ij} = round ({\hat{R}}_{ij}))}{TestSize}$ (14)

where δ (A) is one if A is true; otherwise it is zero.
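Both criteria can be computed with a few lines of code, as in the sketch below. The rounding rule is an assumption: Equation 14 only says "round", and this sketch rounds halves upward, whereas Python's built-in `round` rounds halves to the nearest even integer. The function names and the sample ratings are illustrative.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with .5 rounded upward (assumed rule)."""
    return int(math.floor(x + 0.5))


def mae(actual, predicted):
    """Mean Absolute Error over paired actual and predicted ratings (Equation 13)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rating_correction(actual, predicted):
    """Share of predictions whose rounded value equals the actual rating (Equation 14)."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == round_half_up(p))
    return hits / len(actual)


actual = [5, 3, 4, 1]
predicted = [4.6, 3.4, 2.5, 1.2]
print(mae(actual, predicted))
print(rating_correction(actual, predicted))
```

In this toy example the third prediction rounds to 3 while the actual rating is 4, so three of the four predictions count as correct even though every prediction contributes to the MAE.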

A significance test [36] is a statistical method to validate that the difference between the performances of two or more competing methods is statistically valid at a (1-p)-level of confidence, i.e. that it is due to chance only with probability p. The statistical significance test can be applied to different evaluation criteria. The term “at a p-level of confidence” means “with the probability of p”.
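The text does not state which significance test is applied. Purely as an illustration of the idea, the sketch below uses a two-sided exact sign test on paired per-run scores, which is one of the simple tests discussed in [36]. The choice of test, the function name `sign_test_p_value` and the MAE values are all assumptions made for this example.

```python
from math import comb

def sign_test_p_value(scores_a, scores_b):
    """Two-sided exact sign test on paired scores (lower score = better).

    Ties are dropped. Under the null hypothesis, each method wins a run
    with probability 1/2, so the number of wins is Binomial(n, 1/2).
    """
    wins_a = sum(1 for a, b in zip(scores_a, scores_b) if a < b)
    wins_b = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical MAE values of two methods over 10 independent runs.
mae_method_a = [0.71, 0.70, 0.72, 0.69, 0.70, 0.71, 0.73, 0.70, 0.69, 0.74]
mae_method_b = [0.75, 0.74, 0.75, 0.73, 0.74, 0.76, 0.75, 0.73, 0.72, 0.73]

p = sign_test_p_value(mae_method_a, mae_method_b)
# Method A wins 9 of the 10 runs, so p = 22/1024 (about 0.021); the
# difference is significant at the 0.05 level under this test.
```

With 30 runs per dataset, as in the experimental setting, the same function would be applied to the 30 paired scores of two methods.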

4.3 Baselines and experimental setting

To evaluate the proposed method, we compare it with basic methods such as LMRS, LULCS, GMRS and LMRS+ (LMRS using demographic information), as well as with classical user-based collaborative filtering (i.e. no profile expansion, abbreviated as NoPE). We also use Non-Normalized ConF RS [37], Singular Value Decomposition based RS (denoted by SVD) [38], Popularity based RS (denoted by Pop) [39], and Ontology-based Top-N RS using Matrix Factorization (denoted by OTopN) [40] as state-of-the-art baselines. All these methods use the default parameters given in their papers. The tests are performed separately on the three datasets: ICS, UCS and TN.

In addition, the weights of the demographic attributes (age, sex, and occupation) are tuned on a validation set. The weights of the ontological information of movies are also tuned on a validation set. The Z and Q parameters are set to 25, as this is experimentally the best choice [36].

4.4 Experimental results

The methods are compared according to two criteria: (a) MAE and (b) RC. The proposed method is named after the strategies it uses to expand the profile; here it is denoted ILMRS & LAMR+GMAR. Figure 2 shows the MAE of the different methods on the three benchmarks for different profile expansion sizes, and Fig. 3 shows the corresponding RC values. As the results in Fig. 2 show, the proposed method has the best MAE in almost all cases. Likewise, Fig. 3 shows that it has the best performance in almost all cases in terms of accuracy. Overall, Figs. 2 and 3 show that the proposed method has the best performance in terms of both RC and MAE on all benchmarks, and it is most dominant on the ICS benchmark. As shown in Fig. 2, the best MAE for the different RS methods is obtained at l = 10. The best RC for the different RS methods is obtained at l = 50, while the best RC for the proposed RS method is obtained for l ≥ 10. Therefore, according to the results in Figs. 2 and 3, the best profile expansion size is 10, and l = 10 is used from here on.

Fig. 2

The performance comparison of different RSs in terms of MAE for different profile expansion sizes on a) (top-left) TN dataset, b) (top-right) ICS dataset and c) (bottom) UCS dataset.

Fig. 3

The performance comparison of different RSs in terms of RC for different profile expansion sizes on a) (top-left) TN dataset, b) (top-right) ICS dataset and c) (bottom) UCS dataset.

Figures 4 and 5 show, respectively, the MAE and RC values of the different RS methods for a profile expansion size of 10. The statistical significance test on the results of Figure 5 yields a p-value of 0.0375. Figure 6 depicts the accuracy, measured as recall, of the Top-N recommendations produced by the different RS methods on the ICS benchmark.

Fig. 4

The performance comparison of different RSs in terms of MAE for profile expansion size of 10 on a) (top-left) TN dataset, b) (top-right) ICS dataset and c) (bottom) UCS dataset.

Fig. 5

The performance comparison of different RSs in terms of RC for profile expansion size of 10 on a) (top-left) TN dataset, b) (top-right) ICS dataset and c) (bottom) UCS dataset.

Fig. 6

The performance comparison of different RSs in terms of recall of Top-N recommendations for profile expansion size of 10 on ICS dataset.

5 Conclusions and future works

In this paper, a new hybrid approach is proposed based on the combination of demographic similarity and cosine similarity between users in order to solve the cold start problem of the new user type. A second approach is proposed based on the combination of ontological similarity and cosine similarity between items in order to solve the cold start problem of the new item type. The main idea of the proposed method is to expand user profiles based on different strategies to build higher-performing profiles for users. The experimental results show that the proposed method outperforms the other methods. One suggestion for future work is to use additional information about the content of items and users. Such information can increase the efficiency of recommender systems, especially in the case of the cold start.

So far, we have discussed several solutions to the cold start problem in recommender systems. Using an item ontology, a user ontology, and semantic similarity improved by WordNet are possible directions for future research.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

References

1. Yera and Martinez, Fuzzy tools in recommender systems: A survey, International Journal of Computational Intelligence Systems 10 (2017), 776–803.

2. Ekstrand M.D. and Konstan J.A., Recommender Systems Notation: Proposed Common Notation for Teaching and Research, arXiv preprint arXiv:1902.01348, 2019.

3. Doja, Recommender System for Personalized Adaptive E-learning Platforms to Enhance Learning Capabilities of Learners Based on their Learning Style and Knowledge Level, in International Conference on Sustainable Computing in Science, Technology & Management (SUSCOM-2019) Amity University Rajasthan, Jaipur, India, 2019, pp. 1397–1402.

4. Hameed M.A., Al Jadaan and Ramachandram, Collaborative filtering based recommendation system: A survey, International Journal on Computer Science and Engineering 4 (2012), 859.

5. M.-P.T., Nguyen D.V. and Nguyen, Model-based Approach for Collaborative Filtering, in 6th International Conference on Information Technology for Education (IT@EDU2010), Ho Chi Minh city, Vietnam, 2010, pp. 217–228.

6. Yang, Xu, Wang, Han and Yu, Improving existing collaborative filtering recommendations via serendipity-based algorithm, IEEE Transactions on Multimedia 20 (2017), 1888–1900.

7. Raghuwanshi S.K. and Pateriya, Collaborative Filtering Techniques in Recommendation Systems, in Data, Engineering and Applications, ed: Springer, 2019, pp. 11–21.

8. Duong T.N., Than V.D., Tran T.H., Dang Q.H., Nguyen D.M. and Pham H.M., An Effective Similarity Measure for Neighborhood-based Collaborative Filtering, in 2018 5th NAFOSTED Conference on Information and Computer Science (NICS), 2018, pp. 250–254.

9. Feng, Fengs, Zhang and Peng, An improved collaborative filtering method based on similarity, PloS One 13 (2018), e0204003.

10. Thakkar, Varma, Ukani, Mankad and Tanwar, Combining User-Based and Item-Based Collaborative Filtering Using Machine Learning, in Information and Communication Technology for Intelligent Systems, ed: Springer, 2019, pp. 173–180.

11. Yang, Fu, Lin, Peng and Tang, Collaborative Filtering Recommendation Algorithm Based on AdaBoost-Naïve Bayesian Algorithm, in International Conference on Human Centered Computing, 2018, pp. 380–392.

12. Neysiani B.S., Soltani, Mofidi and Nadimi-Shahraki M.H., Improve Performance of Association Rule-Based Collaborative Filtering Recommendation Systems using Genetic Algorithm, International Journal of Information Technology and Computer Science 11 (2019), 48–55.

13. Borràs, Moreno and Valls, Intelligent tourism recommender systems: A survey, Expert Systems with Applications 41 (2014), 7370–7389.

14. Gong and Ye, An item based collaborative filtering using bp neural networks prediction, in 2009 International Conference on Industrial and Information Systems, 2009, pp. 146–148.

15. Abdelwahab, Sekiya, Matsuba, Horiuchi, Kuroiwa and Nishida, An efficient collaborative filtering algorithm using SVD-free Latent Semantic Indexing and particle swarm optimization, in 2009 International Conference on Natural Language Processing and Knowledge Engineering, 2009, pp. 1–4.

16. Viktoratos, Tsadiras and Bassiliades, Combining community-based knowledge with association rule mining to alleviate the cold start problem in context-aware recommender systems, Expert Systems with Applications 101 (2018), 78–90.

17. …, Zhao, Liu, Huang, Mei and Chen, Learning from history and present: Next-item recommendation via discriminatively exploiting user behaviors, in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 1734–1743.

18. Logesh, Subramaniyaswamy, Malathi, Sivaramakrishnan and Vijayakumar, Enhancing recommendation stability of collaborative filtering recommender system through bio-inspired clustering ensemble method, Neural Computing and Applications, pp. 1–24, 2019.

19. Qian, Zhang, Ma, Yu and Peng, EARS: Emotion-aware recommender system based on hybrid information fusion, Information Fusion 46 (2019), 141–146.

20. Mohammadpour, Bidgoli A.M., Enayatifar and Javadi H.H.S., Efficient clustering in collaborative filtering recommender system: Hybrid method based on genetic algorithm and gravitational emulation local search algorithm, Genomics 111, 2019.

21. Valdiviezo-Díaz and Bobadilla, A Hybrid Approach of Recommendation via Extended Matrix Based on Collaborative Filtering with Demographics Information, in International Conference on Technology Trends, 2018, pp. 384–398.

22. Batet, Moreno, Sánchez, Isern and Valls, Turist@: Agent-based personalised recommendation of tourist activities, Expert Systems with Applications 39 (2012), 7319–7329.

23. Kotkov, Konstan J.A., Zhao and Veijalainen, Investigating serendipity in recommender systems based on real user feedback, in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018, pp. 1341–1350.

24. Eirinaki, Gao, Varlamis and Tserpes, Recommender systems for large-scale social networks: A review of challenges and solutions, Future Generation Computer Systems 78 (2018), 413–418.

25. Al-Shamri M.Y.H., User profiling approaches for demographic recommender systems, Knowledge-Based Systems 100 (2016), 175–187.

26. Safoury and Salah, Exploiting user demographic attributes for solving cold-start problem in recommender system, Lecture Notes on Software Engineering 1 (2013), 303–307.

27. Khan M.M., Ibrahim, Younas, Ghani and Jeong S.R., Facebook interactions utilization for addressing recommender systems cold start problem across system domain, Journal of Internet Technology 19 (2018), 861–870.

28. Dixit V.S. and Jain, Recommendations with Sparsity Based Weighted Context Framework, in International Conference on Computational Science and Its Applications, 2018, pp. 289–305.

29. Ahn H.J., A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem, Information Sciences 178 (2008), 37–51.

30. Formoso, Fernández, Cacheda and Carneiro, Using profile expansion techniques to alleviate the new user problem, Information Processing & Management 49 (2013), 659–672.

31. Attar and Fraenkel A.S., Local feedback in full-text retrieval systems, Journal of the ACM (JACM) 24 (1977), 397–417.

32. Acilar A.M. and Arslan, A collaborative filtering method based on artificial immune network, Expert Systems with Applications 36 (2009), 8324–8332.

33. Guo, Improving the performance of recommender systems by alleviating the data sparsity and cold start problems, in Twenty-Third International Joint Conference on Artificial Intelligence, 2013, pp. 3217–3218.

34. Shaw, Xu and Geva, Using association rules to solve the cold-start problem in recommender systems, in Pacific-Asia conference on knowledge discovery and data mining, 2010, pp. 340–347.

35. Liu, Chen, Xiong, Ding C.H. and Chen, Enhancing collaborative filtering by user interest expansion via personalized ranking, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 42 (2011), 218–233.

36. Demšar, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research 7 (2006), 1–30.

37. Karypis, Evaluation of item-based top-n recommendation algorithms, in Proceedings of the tenth international conference on Information and knowledge management, 2001, pp. 247–254.

38. Cremonesi, Koren and Turrin, Performance of recommender algorithms on top-n recommendation tasks, in Proceedings of the fourth ACM conference on Recommender systems, 2010, pp. 39–46.

39. Bambini, Cremonesi and Turrin, A recommender system for an IPTV service provider: A real large-scale production environment, in Recommender systems handbook, ed: Springer, 2011, pp. 299–331.

40. Cui, Zhu and Yao, Ontology-based Top-N Recommendations on new items with matrix factorization, Journal of Software 9 (2014), 2026–2032.

41. Mojarad, Nejatian, Parvin and Mohammadpoor, A fuzzy clustering ensemble based on cluster clustering and iterative Fusion of base clusters, Applied Intelligence 49(7) (2019), 2567–2581.

42. Mojarad, Parvin, Nejatian and Rezaie, Consensus function based on clusters clustering and iterative fusion of base clusters, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 27(1) (2019), 97–120.

43. Abbasi, Nejatian, Parvin, Rezaie and Bagherifard, Clustering ensemble selection considering quality and diversity, Artificial Intelligence Review 52 (2019), 1311–1340.

44. Nazari, Dehghan, Nejatian, Rezaie and Parvin, A comprehensive study of clustering ensemble weighting based on cluster quality and diversity, Pattern Analysis and Applications 22 (2019), 133–145.

45. Bagherinia, Minaei-Bidgoli, Hossinzadeh and Parvin, Elite fuzzy clustering ensemble based on clustering diversity and quality measures, Applied Intelligence 49 (2019), 1724–1747.

46. Rashidi, Nejatian, Parvin and Rezaie, Diversity based cluster weighting in cluster ensemble: An information theory approach, Artificial Intelligence Review 52 (2019), 1341–1368.

47. Jenghara M.M., Ebrahimpour-Komleh and Parvin, Dynamic protein-protein interaction networks construction using firefly algorithm, Pattern Analysis and Applications 21(4) (2018), 1067–1081.

48. Hosseinpoor M.J., Parvin, Nejatian and Rezaie, Gene regulatory elements extraction in breast cancer by Hi-C data using a meta-heuristic method, Russian Journal of Genetics 55(9) (2019), 1152–1164.

49. Nejatian, Parvin and Faraji, Using sub-sampling and ensemble clustering techniques to improve performance of imbalanced classification, Neurocomputing 276, 55–66.

50. Parvin, Nejatian and Mohamadpour, Explicit memory based ABC with a clustering strategy for updating and retrieval of memory in dynamic environments, Applied Intelligence 48(11) (2018), 4317–4337.

51. Mao and Hou, Object-based forest gaps classification using airborne LiDAR data, Journal of Forestry Research 30(2) (2019), 617–627.

52. Pirbonyeh, Rezaie, Parvin, Nejatian and Mehrabi, A linear unsupervised transfer learning by preservation of cluster-and-neighborhood data organization, Pattern Analysis and Applications 22(3) (2019), 1149–1160.

53. Nejatian, Rezaie, Parvin, Pirbonyeh and Bagherifard, Yusof SKS, An innovative linear unsupervised space adjustment by keeping low-level spatial data structure, Knowledge and Information Systems 59(2) (2019), 437–464.