Matrix factorization completed multicontext data for tensor-enhanced recommendation

Abstract

Tensors have been explored to share latent user-item relations and have been shown to be effective for recommendation. However, tensors suffer from sparsity and cold-start problems in real recommendation scenarios; researchers and engineers therefore usually use matrix factorization to address these issues and improve the performance of recommender systems. In this paper, we propose a matrix factorization completed multicontext data for tensor-enhanced recommendation algorithm, which combines matrix factorization with a multicontext data method. To take advantage of existing user-item data, we add the contexts of time and trust and enrich the interaction data via matrix factorization. In addition, our approach is a high-dimensional tensor framework that further mines the latent relations from the user-item-trust-time tensor to improve recommendation performance. Through extensive experiments on real-world datasets, we demonstrate the superiority of our approach in predicting user preferences. The method is also shown to maintain satisfactory performance even when user-item interactions are sparse.

Keywords

Recommendation system, tensor factorization, similarity, user-project context interaction

1 Introduction

With the explosive growth of data in today’s network environment, users faced with a large amount of information find it difficult to obtain the information that is useful to them, which reduces the efficiency of information use. One of the most effective ways to increase the use rate of information is a recommendation system [1]. A recommendation system can provide personalized recommendations based on a user’s preferences and the available information so that the user can obtain the content he or she likes. Many recommendation algorithms have been proposed for different scenarios, such as scenarios with time information and tag information [2, 3]. Among the existing recommendation algorithms, collaborative filtering (CF) is one of the most popular and widely used [4]. The collaborative filtering method utilizes the implicit or explicit feedback of users to make recommendations without domain information. Currently, there are user-based and item-based collaborative filtering algorithms [5]. User-based collaborative filtering focuses more on socialization and is used more in news recommendations. Item-based collaborative filtering focuses more on users’ historical behavior, and its recommendations are more personalized. However, the CF algorithm adapts poorly to new samples (weak generalization ability). To solve this problem, matrix factorization was proposed. Matrix factorization introduces the concept of latent (hidden) vectors, which strengthens the model’s ability to deal with sparse matrices [41, 42]. Matrix factorization usually associates information in two dimensions. In practical applications, however, user preferences are affected by many factors, such as time and social information [5–7]. In designing a recommendation algorithm, we therefore consider adding such influencing factors to improve the prediction accuracy. To do so, we expand the matrix to a high-dimensional space and call this expanded high-dimensional matrix a tensor. A tensor can incorporate more information to mine the relationships for recommendation.

Higher-dimensional tensors mean that more effective information can be used. Adding valid information related to the user-item relationship can improve prediction accuracy [6–8]. In our model, we consider the influences of time [10] and user trust relationship information [32] on the recommendation system. Time information reflects that users’ behavioral preferences are dynamic: they like different items at different points in time. At the same time, people with similar preferences usually make similar choices in daily life. On this basis, we quantify the similarity of their preferences, which we call the trust relationship. The higher the trust between two users, the more likely they are to like the same item. The addition of trust information can alleviate the “cold-start problem”, which arises because little data is available for new users and the model cannot make personalized recommendations for them. Therefore, more information needs to be added for new users to alleviate this problem. In recent years, many models have integrated time information on the basis of CF [10–12]. However, these models only make recommendations at the observed discrete time points, and they have difficulty addressing the cold-start problem. Rafailidis and Daras proposed a tensor factorization method based on label clustering [13], which clusters label information to alleviate the sparsity of data. Ziwei Zhu et al. proposed a fairness-aware tensor recommendation framework designed to maintain quality while dramatically improving fairness [14], which is suitable for user friend location activity; however, this model does not incorporate time information and has certain limitations. Our model is a four-dimensional tensor that combines trust information with user, item, and time information. It mines the relationships between users and items under trust relationships in different time periods. Under normal circumstances, user interactions with items are sparse, especially in a high-dimensional tensor space with other information added. To address data sparsity, we use traditional matrix decomposition to preprocess the tensor model, which reduces the sparsity of the data to some extent. For tensor decomposition, we adopt the block term decomposition (BTD) method proposed by Lieven De Lathauwer [15], which Kun Tang et al. applied to traffic prediction [16] with good results. The main contributions of our model are as follows:

Before decomposition and prediction, we preprocess the tensor in two steps, tensor slicing and matrix decomposition, to reduce the sparseness of the data.

Using a four-dimensional tensor that adds user trust and time, the traditional user-item matrix is extended to a higher dimension, which can combine more information to mine more potential relationships and improve the prediction accuracy.

The block term decomposition method combines the traditional CANDECOMP/PARAFAC (CP) and Tucker decompositions to decompose and predict the tensor model [43]. This method not only overcomes the nonuniqueness of the Tucker decomposition but also carries more information than the rank-1 terms of the CP decomposition.

The remainder of this article is organized as follows. Section 2 introduces the application of tensors in related research. Section 3 introduces the related definitions and the algorithm of this paper. Section 4 presents experiments that compare our method with traditional algorithms and analyzes the algorithms’ performance based on the experimental results. Section 5 provides the summary and outlook.

2 Related studies

At present, many recommendation algorithms based on tensors have been proposed. Symeonidis et al. proposed a tensor recommendation algorithm with tag context factors [19], constructed a three-dimensional user-item-tag tensor, and used sparse higher-order singular value decomposition (HOSVD) [46] to decompose the tensor into factor matrices and a core tensor. HOSVD is a higher-order generalization of the matrix singular value decomposition (SVD) [18]. This method is suitable for the decomposition of dense tensors but not sparse tensors, and HOSVD may not find the best low-rank approximation of a tensor; therefore, its prospects for practical application are limited. Hidasi and Tikk [20] applied a tensor factorization method based on alternating least squares (ALS) to implement context-aware recommendations [21, 22]. However, this method converges poorly on sparse data and does not scale to large data sets. Stathis Maroulis et al. proposed a personalized recommendation model that adds points of interest (POIs) to improve the user experience through contextual POI awareness [38]. Hao Wen et al. [39] added time weights to their model, which can reflect the changes in user interests over time. Symeonidis et al. [24] also proposed a geographic recommendation system based on a friend contact algorithm and higher-order singular value decomposition, which is suitable for user friend location activities. However, this method still cannot incorporate time information. To apply spatiotemporal information to point-of-interest recommendations, Ying et al. [25] used context based on user preferences to perform tensor factorization. They also proposed a POI user preference inference model based on weighted hyperlink-induced topic search (HITS). Kun Tang et al. formed a three-dimensional driver-road segment-time tensor and built a model with features such as the geographic space [16] to recommend personalized travel plans to users. Pei Ma et al. used the similarity between users to construct a three-dimensional tensor model for recommendation [32] but did not consider the time factor, even though user preferences may change across time periods.

To solve the above problems, we add time information, which can reflect the changes in users’ preferences in different time periods. The decomposition method is partly based on the BTD algorithm. Unlike HOSVD, which estimates the unknown entries only from its own nonzero entries and ignores other information, BTD approximates the tensor by a sum of low multilinear rank terms [15, 17]. Compared with the above methods, the accuracy of the experimental results is improved. On this basis, to address data sparsity, matrix decomposition is used to pre-fill the unknown positions of the tensor, which reduces the sparsity of the tensor to a certain extent [30].

3 Tensor model

3.1. Related overview

Tensor:

A tensor can be regarded as an extension of a low-dimensional array. It has three major advantages when processing data, namely, dimension reduction, missing data filling, and implicit relationship mining. A first-order tensor is called a vector, a second-order tensor is called a matrix, and arrays of order three and above are generally referred to as tensors. Each value in a three-dimensional tensor is indexed by three coordinates. Our model combines the time, trust, user, and item dimensions to form a four-dimensional tensor.

Tensor slicing:

The tensor slicing operation extracts a matrix from a tensor. If two dimensions are retained and the indices of all other dimensions are fixed, the result is a matrix, which is called a tensor slice. For example, consider a user-project-time third-order tensor $D \in ℝ^{I \times J \times K}$, where I, J, and K are the sizes of the three dimensions. Fixing the time index k yields the slice $T_{k} \in ℝ^{I \times J}$, and the number of such slices equals the size K of the time dimension.
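As a minimal sketch of the slicing operation described above, the following NumPy snippet extracts the time slices of a small third-order tensor. The tensor sizes and values are hypothetical toy data chosen only for illustration; NumPy is an assumption and is not named in this paper.

```python
import numpy as np

# Hypothetical toy tensor: 2 users x 3 items x 4 time steps.
I, J, K = 2, 3, 4
D = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Fix the time index k and keep the user and item modes:
# each slice T_k is an I x J matrix, and there are K of them.
time_slices = [D[:, :, k] for k in range(K)]

# Stacking the slices back along the last axis recovers D.
D_rebuilt = np.stack(time_slices, axis=2)
```

Each element of `time_slices` is a user-item matrix for one time step, which is the form in which a matrix method can later be applied slice by slice.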

User trust:

Friends or relatives around a user may affect the user’s choices. For example, if a user wants to buy a piece of clothing, he usually consults his friends who have similar clothing styles. There is a certain degree of similarity between different users.

Matrix decomposition:

The product A * B is used to approximate matrix M. The values of A * B can then be used to estimate the unknown entries of M, and matrices A and B can be regarded as a decomposition of M [31]: M ≈ A * B.

3.1 Description of related operations:

Definition 1. Given two matrices $A \in ℝ^{I \times J}$ and $B \in ℝ^{K \times L}$, their Kronecker product $A \otimes B \in ℝ^{IK \times JL}$ is defined as: $A \otimes B = \begin{pmatrix} a_{11} B & a_{12} B & \dots & a_{1J} B \\ a_{21} B & a_{22} B & \dots & a_{2J} B \\ \vdots & \vdots & \ddots & \vdots \\ a_{I1} B & a_{I2} B & \dots & a_{IJ} B \end{pmatrix}$

Definition 2. The Khatri-Rao product of two matrices $A \in ℝ^{I \times J}$ and $B \in ℝ^{K \times J}$, which must have the same number of columns J, is written $A ⊙ B \in ℝ^{IK \times J}$. It is defined column by column as: $A ⊙ B = \begin{pmatrix} a_{1} \otimes b_{1} & a_{2} \otimes b_{2} & \dots & a_{J} \otimes b_{J} \end{pmatrix}$, where $a_{j}$ and $b_{j}$ denote the jth columns of A and B.
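Definitions 1 and 2 can be illustrated with a short NumPy sketch. `np.kron` is the library's Kronecker product; the helper `khatri_rao` is a hypothetical name written here for illustration, and the matrices are toy values.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x J) and B (K x J)."""
    if A.shape[1] != B.shape[1]:
        raise ValueError("A and B must have the same number of columns")
    # Column j of the result is kron(A[:, j], B[:, j]).
    cols = [np.kron(A[:, j], B[:, j]) for j in range(A.shape[1])]
    return np.stack(cols, axis=1)

A = np.array([[1.0, 2.0], [3.0, 4.0]])                # 2 x 2
B = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])    # 3 x 2

kron_AB = np.kron(A, B)     # Kronecker product, shape (2*3, 2*2)
kr_AB = khatri_rao(A, B)    # Khatri-Rao product, shape (2*3, 2)
```

The Kronecker product places a scaled copy of B in every block, whereas the Khatri-Rao product keeps only the matching column pairs, so its column count stays at J.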

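Definition 3. The n-mode product of a tensor with a matrix is represented by the symbol $\times_{n}$ and multiplies the tensor by the matrix along its nth mode. For example, for $Y \in ℝ^{I \times J \times K}$ and $U \in ℝ^{L \times I}$, the product along the first mode is: $T = Y \times_{1} U \Leftrightarrow T_{ljk} = \sum_{i = 1}^{I} Y_{ijk} U_{li}$. In the decomposition formulas of this paper, the subscript of $\times$ names the factor matrix whose mode is multiplied, e.g., $\times_{A} A_{r}$.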
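The mode product in Definition 3 can be sketched in NumPy as follows, under the standard convention that the matrix has shape (L, size of the chosen mode). The function name `mode_product` and the toy data are assumptions made for illustration.

```python
import numpy as np

def mode_product(Y, U, mode):
    """Multiply tensor Y by matrix U along axis `mode`.

    U has shape (L, Y.shape[mode]); the result has size L on that axis.
    """
    # Contract U's second axis with Y's `mode` axis. tensordot puts the
    # new axis first, so move it back to position `mode`.
    out = np.tensordot(U, Y, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

Y = np.arange(24, dtype=float).reshape(2, 3, 4)       # I=2, J=3, K=4
U = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # L=3, I=2

T = mode_product(Y, U, mode=0)   # shape (3, 3, 4)
```

With this U, the first two slices of `T` along the first axis reproduce those of `Y`, and the third is their sum, which matches the elementwise formula $T_{ljk} = \sum_{i} Y_{ijk} U_{li}$.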

Definition 4. The outer product of two tensors is expressed as A ∘ B, where $A \in ℝ^{I_{1} \times I_{2} \times \dots \times I_{P}}$ and $B \in ℝ^{J_{1} \times J_{2} \times \dots \times J_{Q}}$. The outer product is defined as: $(A \circ B)_{i_{1} i_{2} \dots i_{P} j_{1} j_{2} \dots j_{Q}} = a_{i_{1} i_{2} \dots i_{P}} b_{j_{1} j_{2} \dots j_{Q}}$

The model framework is shown in Fig. 1. The entire model is roughly divided into three steps. Step 1: Extract the trust value between users. Step 2: Combine the extracted user, project, time, trust and other information into a four-dimensional tensor model D. Because of the sparsity of the high-dimensional tensor space, we slice the tensor model and then decompose and fill each slice. Step 3: Decompose the processed tensor model and mine the implicit relationships in the information for prediction.

Fig. 1

Framework model.

3.2 Methodology

Tensor model:

We collect various user information from the data set and extract trust and time information to construct a fourth-order tensor $D \in ℝ^{I \times J \times T \times K}$. Here, I is the user dimension, J is the corresponding project (item) dimension, T is the time dimension of the user’s rating of the project, and K is the dimension of similarity (trust) between users. $D_{i, j, t, k} = α$ represents the trust-based score of user i for project j at time t. Because high-order tensors are difficult to process directly, we decompose them into low-order factors for processing; using the connection between matrices and tensors makes tensors relatively easy to handle. In this article, we use the block term decomposition (BTD) algorithm to decompose the tensor [15].

Contextual information:

User behavior is usually influenced by friends or other people. For people with the same likes or interests, when one user likes a new item, the other user is likely to like this item as well. For example, user a likes items N and M, and another user b likes item N. Because users a and b both like item N, we can infer that user b may like item M. Because they have common preferences, there is a certain similarity between user a and user b, and the more items they like in common, the higher the degree of similarity between them. Trust information can be used to implement personalized recommendations: we find another user with similar interests and recommend to the target user an item that the similar user likes but the target user has not yet interacted with. The similarity between users is calculated after different users rate the same items. For example, take user U and user V, where N(u) represents the positive feedback information of user U and N(v) represents the positive feedback information of user V. They have given feedback on some items, and their similarity is calculated from the feedback of the two users on the same items. The formula is as follows: $w_{uv} = \frac{\sum_{i \in N (u) \cap N (v)} \frac{1}{\log (1 + | N (i) |)}}{\sqrt{| N (u) | | N (v) |}}$

Equation (1) is the similarity penalty applied to popular items in the common interest list of user U and user V. Matrix Y is a scoring matrix based on the similarity of items. Different items i and j have certain similarities, which can be calculated by Equation (2). Finally, the information is integrated into the tensor model for prediction. The time information can reflect the changes in user interests: users like different items at different points in time. For example, users want to buy down jackets or cotton-padded clothes in the winter and T-shirts in the summer; the books that researchers read also change with the depth of their research, starting with introductory books and then transitioning to professional books. Therefore, time information is also one of the more important influencing factors in personalized recommendation systems.

$\frac{1}{\log (1 + | N (i) |)}$ (1)

$w_{ij} = \frac{| N (i) \cap N (j) |}{\sqrt{| N (i) | | N (j) |}}$ (2)
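The user similarity $w_{uv}$ with the popularity penalty of Equation (1) can be sketched in plain Python. This is an illustrative sketch, not the implementation used in the experiments: the function name, the dictionary-of-sets input format, and the example users are assumptions, and N(i) is taken to be the set of users who gave positive feedback on item i.

```python
import math

def user_similarity(liked):
    """liked: dict mapping user -> set of items with positive feedback.

    Returns {(u, v): w_uv} for u != v. Items liked by many users are
    down-weighted by 1 / log(1 + |N(i)|).
    """
    # Invert the mapping to obtain N(i), the users who liked item i.
    item_users = {}
    for user, items in liked.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)

    sim = {}
    for u in liked:
        for v in liked:
            if u == v or not liked[u] or not liked[v]:
                continue
            common = liked[u] & liked[v]
            num = sum(1.0 / math.log(1 + len(item_users[i])) for i in common)
            sim[(u, v)] = num / math.sqrt(len(liked[u]) * len(liked[v]))
    return sim

# Hypothetical feedback: a and b share item N; c shares nothing with a.
liked = {"a": {"N", "M"}, "b": {"N"}, "c": {"P"}}
w = user_similarity(liked)
```

In this toy example the pair (a, b) receives a positive similarity from the shared item N, while (a, c) receives zero because the two users have no item in common.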

Data preprocessing

Tensor preprocessing: Data preprocessing is performed in two steps: tensor slicing and matrix decomposition. The processing steps are shown in Fig. 2. A is the user-item matrix slice cut out of the model. We decompose the matrix sliced from the model while taking into account the actual meaning of the matrix: its values cannot be negative. Let R be an m × n rating matrix, where r_ij is the rating of user i on item j. R can be decomposed as R = UV^T, where U is the m × k user matrix, V is the n × k item matrix, and k is the dimension of the latent (hidden) factors shared by users and items. Like tensor factorization, matrix factorization is an approximation, R ≈ UV^T. We seek the smallest error and construct the objective function $\min J = \frac{1}{2} ∥ R - UV^{T} ∥^{2}$. Normally, matrix R is sparse, and we need to fill it using the known entries. The position of an entry in the matrix is denoted (i, j), and the estimated score can be written as $r_{ij}^{'} = \sum_{q = 1}^{k} u_{iq} v_{jq}, \; i = 1, \dots, m; \; j = 1, \dots, n$.

Fig. 2

Framework model.

For the optimization of the objective function, this paper adopts the classic stochastic gradient descent method. This method uses only a small number of samples for each update, so each update is cheap and training is fast. The objective over the set S of observed entries and the update formulas are as follows: $\begin{matrix} \min J = \frac{1}{2} \sum_{(i, j) \in S} (r_{ij} - \sum_{q = 1}^{k} u_{iq} v_{jq})^{2} \\ u_{iq} \Leftarrow u_{iq} + α \sum_{j : (i, j) \in S} (r_{ij} - \sum_{q = 1}^{k} u_{iq} v_{jq}) v_{jq} \\ v_{jq} \Leftarrow v_{jq} + α \sum_{i : (i, j) \in S} (r_{ij} - \sum_{q = 1}^{k} u_{iq} v_{jq}) u_{iq} \end{matrix}$

where α > 0 is the step size of the stochastic gradient descent, and j : (i, j) ∈ S and i : (i, j) ∈ S index the observed (nonzero) elements in the corresponding row and column. After the updates converge, we fill the missing entries of the matrix with the estimated scores to obtain a new matrix.
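The filling step can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the exact procedure of this paper: it updates the factors one observed rating at a time (per-sample stochastic gradient descent) instead of summing over a row or column, it does not enforce nonnegativity, and the function names, hyperparameters, and toy ratings are hypothetical.

```python
import random

def predict(U, V, i, j):
    """Estimated score r'_ij = sum_q u_iq * v_jq."""
    return sum(u * v for u, v in zip(U[i], V[j]))

def sq_error(ratings, U, V):
    """Objective J = 1/2 * sum of squared errors over observed entries."""
    return 0.5 * sum((r - predict(U, V, i, j)) ** 2
                     for (i, j), r in ratings.items())

def factorize(ratings, m, n, k=2, alpha=0.01, epochs=2000, seed=0):
    """ratings: dict {(i, j): r_ij} of observed entries of an m x n matrix."""
    rng = random.Random(seed)
    U = [[rng.random() for _ in range(k)] for _ in range(m)]
    V = [[rng.random() for _ in range(k)] for _ in range(n)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (i, j), r in observed:
            err = r - predict(U, V, i, j)
            for q in range(k):
                u_iq, v_jq = U[i][q], V[j][q]
                # Gradient step on both factors using the old values.
                U[i][q] += alpha * err * v_jq
                V[j][q] += alpha * err * u_iq
    return U, V

# Hypothetical 3 x 3 slice with 6 observed ratings; other cells are missing.
R = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (1, 2): 1, (2, 1): 2, (2, 2): 5}
U, V = factorize(R, m=3, n=3)

# Keep observed ratings and fill the missing cells with predictions.
filled = [[R.get((i, j), predict(U, V, i, j)) for j in range(3)]
          for i in range(3)]
```

Only the missing cells are replaced by predictions; observed ratings are kept unchanged, so the filled slice is denser than the original without altering known data.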

Core initialization: To perform context-aware tensor decomposition, we need to initialize a core tensor and four factor matrices. The purpose is to generate an initial tensor for the optimization in the subsequent algorithm. Because the model is nonconvex, the final solution depends on the initialization to some extent. Normally, random initialization is used, with random values drawn from the interval [0, 1]. Random initialization does not give good control over the accuracy of the final result. Other initialization methods exist [33] and have obtained good results on hyperspectral data. In this paper, considering the performance of the model and to avoid introducing more external factors, random initialization is used in the experiments.

Block term decomposition (BTD):

There are many ways to decompose a tensor model, and different methods have different advantages. Before introducing the block term decomposition algorithm, we first introduce the classic Tucker/higher-order singular value decomposition (HOSVD) and CP decomposition.

Definition 5. The CP decomposition expresses a tensor as a sum of rank-1 tensors. For a third-order tensor $χ \in ℝ^{I \times J \times K}$, the decomposition has the form $χ = \sum_{r = 1}^{R} ω_{r} (a_{r} \circ b_{r} \circ c_{r})$

The CP decomposition of the third-order tensor is shown in Fig. 3(a). The rank of tensor χ is the smallest number of rank-1 tensors in the linear combination. The factor matrices are $A = [a_{1}, \dots, a_{R}] \in ℝ^{I \times R}$, $B = [b_{1}, \dots, b_{R}] \in ℝ^{J \times R}$, and $C = [c_{1}, \dots, c_{R}] \in ℝ^{K \times R}$.

Fig. 3

Tucker/CP decomposition visualization diagram.
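To make Definition 5 concrete, the following NumPy sketch rebuilds a third-order tensor from CP factors in two equivalent ways: with a single `einsum` call and term by term with explicit outer products. The sizes, random factors, and weights are hypothetical, and the function name `cp_reconstruct` is an assumption.

```python
import numpy as np

def cp_reconstruct(weights, A, B, C):
    """Sum of R weighted rank-1 terms: sum_r w_r * (a_r o b_r o c_r)."""
    # einsum forms the outer product of the r-th columns and sums over r.
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 2, 2
A = rng.random((I, R))
B = rng.random((J, R))
C = rng.random((K, R))
w = np.array([1.0, 0.5])

X = cp_reconstruct(w, A, B, C)   # shape (4, 3, 2)

# The same tensor built term by term from explicit outer products.
X_terms = sum(
    w[r] * np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(R)
)
```

Each term contributes only one column from each factor matrix, which is why a single rank-1 term can carry limited information.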

Definition 6. The Tucker decomposition of a third-order tensor $χ \in ℝ^{I \times J \times K}$ is $χ = S \times_{1} A \times_{2} B \times_{3} C$, where S is the core tensor and A, B, and C are the factor matrices.

A schematic diagram of the Tucker decomposition is shown in Fig. 3(b).

Similar to traditional matrix factorization, the core tensor of tensor factorization represents the implicit connection between the dimensions. We cannot explain what this implicit relationship represents, but we can use these relationships to make effective predictions. The CP decomposition method requires prior knowledge of the rank of the original tensor. To date, there is no direct algorithm to determine the rank of a tensor; in practice, determining it is an NP-hard problem. Generally, for simplicity, the value of max(I, J, K) is used as the rank of the tensor in the CP decomposition. The Tucker decomposition, in turn, is not unique.

Our experiment uses the block term decomposition (BTD) algorithm because it unifies the traditional Tucker/higher-order singular value decomposition (HOSVD) and CP decomposition. The results of the traditional Tucker decomposition are not unique, while the CP decomposition decomposes a tensor into a sum of rank-1 terms, and the features or information that a single rank-1 term can carry are very limited. BTD unifies the two decompositions to make the decomposition unique. BTD decomposes a tensor into a sum of low multilinear rank terms, each of which may have a different multilinear rank. In addition, each term can carry more information or features, as shown in Fig. 4.

Fig. 4

Visualization of the decomposition of a tensor into the rank sum.

For a tensor $D \in ℝ^{I \times J \times K}$, the decomposition of D into a sum of rank-(L, M, N) terms is: $D = \sum_{r = 1}^{R} S_{r} \times_{A} A_{r} \times_{B} B_{r} \times_{C} C_{r},$

where $S_{r} \in ℝ^{L \times M \times N}$ is of full rank (L, M, N), $A_{r} \in ℝ^{I \times L}$, $B_{r} \in ℝ^{J \times M}$, and $C_{r} \in ℝ^{K \times N}$.
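The structure of this decomposition can be sketched in NumPy: each block term is a small Tucker model, and the tensor is the sum of R such terms. The sketch only reconstructs a tensor from given factors and does not fit them; the function names, sizes, and random factors are hypothetical.

```python
import numpy as np

def tucker_term(S, A, B, C):
    """One block term: core S (L x M x N) multiplied by A, B, C along its modes."""
    return np.einsum("lmn,il,jm,kn->ijk", S, A, B, C)

def btd_reconstruct(cores, As, Bs, Cs):
    """Sum of R block terms, each a small Tucker model."""
    return sum(tucker_term(S, A, B, C)
               for S, A, B, C in zip(cores, As, Bs, Cs))

rng = np.random.default_rng(0)
I, J, K = 5, 4, 3
L, M, N, R = 2, 2, 2, 3
cores = [rng.random((L, M, N)) for _ in range(R)]
As = [rng.random((I, L)) for _ in range(R)]
Bs = [rng.random((J, M)) for _ in range(R)]
Cs = [rng.random((K, N)) for _ in range(R)]

D_hat = btd_reconstruct(cores, As, Bs, Cs)   # shape (5, 4, 3)
```

Two special cases show how the decomposition relates to the classic ones: with R = 1 it is a single Tucker model, and with L = M = N = 1 every term is a rank-1 tensor, which gives the CP form.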

Based on the BTD paradigm, our context-aware recommendation model can be expressed as an optimization problem. We call the model in this article MCA-BTD, and the optimized objective function is: $\min L = \frac{1}{2} \sum_{(i, j, t, k) \in D} I (i, j, t, k) \left( D_{ijtk} - \left[ \sum_{r = 1}^{R} S_{r} \times_{A} A_{r} \times_{B} B_{r} \times_{C} C_{r} \times_{E} E_{r} \right]_{ijtk} \right)^{2} + \frac{λ}{2} \sum_{r = 1}^{R} \left( {∥ S_{r} ∥}^{2} + {∥ A_{r} ∥}^{2} + {∥ B_{r} ∥}^{2} + {∥ C_{r} ∥}^{2} + {∥ E_{r} ∥}^{2} \right)$

where $D \in ℝ^{I \times J \times T \times K}$; $A_{r} \in ℝ^{I \times L}$, $B_{r} \in ℝ^{J \times M}$, $C_{r} \in ℝ^{T \times N}$, and $E_{r} \in ℝ^{K \times O}$ are all latent factor matrices; and L, M, N, and O are the numbers of latent factors, which determine the size of the core tensor. $D \approx \sum_{r = 1}^{R} S_{r} \times_{A} A_{r} \times_{B} B_{r} \times_{C} C_{r} \times_{E} E_{r}$ is used for recovery. In the formula, ∥S_r∥², ∥A_r∥², ∥B_r∥², ∥C_r∥² and ∥E_r∥² are regularization terms, which prevent overfitting in the model. ∥·∥ is the L2 norm, which is defined as $∥ X ∥ = \sqrt{{| x_{1} |}^{2} + {| x_{2} |}^{2} + \dots + {| x_{n} |}^{2}}$. λ is the weight parameter. The indicator I (i, j, t, k) selects the observed entries and is defined as follows: $I (i, j, t, k) = \begin{cases} 1 & \text{if } D (i, j, t, k) \neq \varnothing \\ 0 & \text{otherwise} \end{cases}$
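The objective can be evaluated with a short NumPy sketch. It assumes the indicator is supplied as a 0/1 array `mask` of the same shape as D and that the squared norms are sums of squared entries; the function names, the sizes, and the checkerboard mask are hypothetical and serve only to illustrate the formula, not the optimization procedure of Fig. 5.

```python
import numpy as np

def btd4_reconstruct(cores, As, Bs, Cs, Es):
    """Sum over r of the core S_r multiplied by A_r, B_r, C_r, E_r along its four modes."""
    return sum(
        np.einsum("lmno,il,jm,tn,ko->ijtk", S, A, B, C, E)
        for S, A, B, C, E in zip(cores, As, Bs, Cs, Es)
    )

def objective(D, mask, cores, As, Bs, Cs, Es, lam):
    """Masked squared error plus L2 regularization of all factors."""
    residual = mask * (D - btd4_reconstruct(cores, As, Bs, Cs, Es))
    fit = 0.5 * np.sum(residual ** 2)
    reg = 0.5 * lam * sum(
        np.sum(X ** 2) for group in (cores, As, Bs, Cs, Es) for X in group
    )
    return fit + reg

rng = np.random.default_rng(0)
I, J, T, K = 3, 4, 2, 2
L, M, N, O, R = 2, 2, 1, 1, 2
cores = [rng.random((L, M, N, O)) for _ in range(R)]
As = [rng.random((I, L)) for _ in range(R)]
Bs = [rng.random((J, M)) for _ in range(R)]
Cs = [rng.random((T, N)) for _ in range(R)]
Es = [rng.random((K, O)) for _ in range(R)]

# Toy data generated from the factors themselves, so the fit error is zero.
D = btd4_reconstruct(cores, As, Bs, Cs, Es)
# Hypothetical observation pattern: a checkerboard of observed entries.
mask = (np.indices(D.shape).sum(axis=0) % 2 == 0).astype(float)

loss_fit_only = objective(D, mask, cores, As, Bs, Cs, Es, lam=0.0)
loss_with_reg = objective(D, mask, cores, As, Bs, Cs, Es, lam=0.1)
```

Because of the mask, entries that were never observed do not contribute to the error term, so only the regularization term remains when the factors reproduce the observed entries exactly.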

The algorithm is shown in Fig. 5.

Fig. 5

Block term decomposition (BTD) algorithm.

4 Experiments

4.1 Dataset

Our article uses the public Ciao and MovieLens data sets. The Ciao data set includes trust relationships between users; the MovieLens data set does not, so we need to calculate them [33, 43].

The Ciao data set was released in 2011 and captures users’ social and rating information on the Ciao website. Registered users can rate and comment on items and browse other users’ comments and ratings, which can help them make appropriate choices. Users can also establish trust relationships with one another on the website. The detailed descriptive information is shown in Table 1.

Table 1
Statistical information of the Ciao and MovieLens datasets

DATASET	CIAO	MOVIELENS-100K
USERS	7375	1000
ITEMS	99,746	1700
USER-ITEM RATINGS	278,483	100,000
RATINGS SCALE	1–5	1–5
DENSITY	0.038%	6.3%
USER TRUST	111,781	51,959

The MovieLens dataset is a publicly available rating dataset collected by GroupLens from the MovieLens website. It comes in different sizes collected over different time periods. We selected the 100K dataset, in which 1000 users rated 1700 movies 100,000 times on a scale of 1–5. Each of these users has rated at least 20 movies, and user information such as age, gender, and occupation is included. The detailed descriptive information is shown in Table 1.

4.2 Evaluation index

The performance evaluation of an algorithm is usually based on the accuracy of its recommendations. For general classification models, the main indicators for evaluating recommendations are the recall, the accuracy, and the coverage. For rating prediction models, the root mean square error (RMSE) and the mean absolute error (MAE) are generally used as accuracy indicators. Our model is mainly used to predict user ratings of items, so we use the MAE and RMSE to evaluate the performance of the algorithms. These two indicators are commonly used in a variety of scenarios. The smaller the value, the higher the accuracy and the better the algorithm performance. The specific definitions are shown in Table 2, where N is the total number of predicted ratings, T is the test set, and $D_{ijk}$ and ${\hat{D}}_{ijk}$ are the actual and predicted ratings, respectively.

Table 2
Validation metrics

Metrics	Definition	Interpretation
MAE	$MAE = \frac{1}{N} \sum_{(i, j, k) \in T} | D_{ijk} - {\hat{D}}_{ijk} |$	The smaller, the better
RMSE	$RMSE = \sqrt{\frac{1}{N} \sum_{(i, j, k) \in T} {(D_{ijk} - {\hat{D}}_{ijk})}^{2}}$	The smaller, the better
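The two metrics can be computed with a few lines of plain Python. The function names and the four example ratings are hypothetical; the lists stand for the actual and predicted ratings of the test entries.

```python
import math

def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error over paired ratings."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical test ratings and predictions.
actual = [5, 3, 4, 1]
predicted = [4.5, 3.5, 4.0, 2.0]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Because the RMSE squares each error before averaging, it penalizes large individual errors more heavily than the MAE, so it is never smaller than the MAE on the same data.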

4.3 Experimental results and analysis

Comparison of innovative points

We verify the advantages of our model from the following three aspects.

The low-rank tensor decomposition of trust relations (LTF-ISTR) [32]. This method combines user trust values to form a three-dimensional tensor model to predict user ratings and achieves good results. That work uses the Tucker decomposition to make predictions for the entire model, but it does not include the time factor.

Verify the effect of the time factor in personalized recommendation. The time factor reflects the changes in user preferences and interests. Each person’s general preferences will be affected by many factors, and clothing and diet will be affected by seasonal changes. Therefore, we experiment to compare the effects of this information.

The effect of matrix filling on the experimental results. The experimental four-dimensional tensor will have the problem of sparseness, so we analyze this problem. We compare the improvement effect of the tensor model before and after filling on the final experimental results.

The purpose of the comparison with the trust relationship tensor decomposition is to compare the BTD and Tucker methods and show the superiority of the BTD algorithm to a certain extent. The comparison is carried out on the Ciao data set so that the experimental results are not affected by other features. The time dimension of our model is removed, and the same features as in that method are used for comparison. The experimental results are shown in Table 3, where LTF-ISTR is the trust relationship tensor Tucker decomposition model and CA-BTD is our trust BTD model. The results show that the BTD method improves the accuracy to a certain extent and is superior to the traditional Tucker method.

Table 3
Comparison of the Tucker decomposition and BTD under the same characteristics

	RMSE	MAE
LTF-ISTR	0.8450	0.6511
CA-BTD	0.8284	0.6328

We then determine the impact of time features on the prediction accuracy. We add time features on the basis of user-item-trust and analyze how much they improve the prediction model. The experimental results are shown in Table 4. Among the models, CA-BTD-3 is a three-dimensional user-item-trust tensor model, and CA-BTD-4 is a four-dimensional model with the time factor added. Adding time reduces the RMSE from 0.8384 to 0.8028 and the MAE from 0.6228 to 0.5872. This reflects that users’ preferences vary over time, so time is a very important contextual feature in a personalized recommendation system.

To address the sparsity of the high-dimensional tensor space, we use traditional matrix factorization to fill the matrices sliced from the tensor, which alleviates the sparsity of the model and further improves its prediction accuracy. We compare the model without matrix filling, named CA-BTD, with the model with matrix filling, named MCA-BTD. The experimental results are shown in Table 5. Matrix filling reduces the RMSE from 0.8026 to 0.7861 and the MAE from 0.5925 to 0.5635, which shows that it is effective: it reduces the sparsity of the model and improves the accuracy of the algorithm.

Table 4

Comparison of time characteristics

	RMSE	MAE
CA-BTD-3	0.8384	0.6228
CA-BTD-4	0.8028	0.5872

Table 5

Comparison of matrix filling results

	RMSE	MAE
CA-BTD	0.8026	0.5925
MCA-BTD	0.7861	0.5635

Through the above comparative experiments, we have separately examined the three innovations of this article. The addition of the time dimension yields the largest improvement in model accuracy, reflecting that context information is very important for the personalized recommendation model. Such information corresponds to the characteristics of different users, and we can reasonably use it to make more accurate predictions of user behavior.

Algorithm comparison

We compare our model, which combines matrix decomposition with the time-trust context tensor decomposition, against other recommendation algorithms to test its performance. The main comparison algorithms are as follows:

The collaborative filtering algorithm based on nonnegative matrix factorization (NMF) [35] is a privacy-preserving collaborative filtering algorithm combining random perturbation technology with nonnegative matrix factorization, which can protect user privacy while also generating recommended results.

This method is an evolution of the KNN algorithm [44]. It uses the information close to the recommended date to help improve the recommendation results and reduce the information overload of the recommendation engine.

User preference prediction recommendation based on SVD++ [36]. This method first uses a logistic regression to preprocess the original data, including click-through rates, shopping carts, purchase items, etc.; and then it uses the SVD++ model to analyze the processed data. This conducts the predictive analysis of data.

The pairwise interaction tensor factorization (TF) [45] method is a special case of the traditional tensor decomposition model. The running time is linear, and it conducts pairwise interaction modeling between user items and related information.

This article propose a new rating prediction model named the Rating-Trust-based Recommendation Model (RTRM) [40] to explore the influence of internal factors among the users.

We compared the above algorithms with the algorithm in this paper on the Ciao data set; the comparison results are shown in Fig. 6. On both the RMSE and the MAE, the algorithm in this paper achieves the best prediction accuracy. Compared with RTRM, our method adds time information to reflect changes in the user’s interests over time; this additional contextual information improves the prediction results to a certain extent. The BTD method extends the TF method by decomposing the tensor into a sum of low multilinear rank terms, and it is more accurate than the TF method. Benefiting from the pretreatment of the original tensor by matrix decomposition and from the iteration initialization preprocessing, our model is more accurate than the above methods. Figure 7 shows the comparative experimental results of the algorithms on the MovieLens data set. Each algorithm performs better than on the Ciao data set, because the MovieLens data set is much denser than the Ciao data set (6.3% versus 0.038%; see Table 1). This indicates that data sparsity has a certain impact on the prediction accuracy.

Fig. 6

Algorithm comparison on the Ciao data set.

Fig. 7

Algorithm comparison on the MovieLens data set.

To test the performance of each algorithm at different levels of sparsity, we run comparative experiments on the Ciao data set at different data densities. The experimental results are shown in Table 6.

Table 6

Results of the MAE and RMSE for each algorithm with Ciao dataset densities of 90%, 80% and 70%

	90%
	MCA-BTD	RTRM	TF	SVD++	KNN-Baseline	NMF
MAE	0.578	0.598	0.623	0.728	0.746	0.845
RMSE	0.779	0.808	0.832	0.940	0.955	1.145
	80%
	MCA-BTD	RTRM	TF	SVD++	KNN-Baseline	NMF
MAE	0.592	0.617	0.653	0.759	0.811	0.864
RMSE	0.796	0.831	0.868	0.952	1.021	1.135
	70%
	MCA-BTD	RTRM	TF	SVD++	KNN-Baseline	NMF
MAE	0.607	0.645	0.692	0.782	0.844	0.884
RMSE	0.810	0.842	0.911	0.992	1.058	1.152
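One way such reduced-density variants of a data set could be produced is sketched below. It rests on an assumption that is not stated in the text: that a density of 90% means keeping a random 90% of the observed ratings. The function name `subsample_ratings` and the (user, item, rating) row layout are hypothetical.

```python
import numpy as np

def subsample_ratings(ratings, keep_fraction, seed=0):
    """Keep a random fraction of the rating records (rows).

    ratings       : array with one (user, item, rating) record per row
    keep_fraction : share of records to retain, e.g. 0.9, 0.8 or 0.7
    """
    rng = np.random.default_rng(seed)
    ratings = np.asarray(ratings)
    n_keep = int(round(keep_fraction * len(ratings)))
    idx = rng.choice(len(ratings), size=n_keep, replace=False)
    return ratings[np.sort(idx)]  # preserve the original record order
```

Fixing the seed keeps the retained subset identical across algorithms, so that every method is evaluated on the same sparsified data.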

The comparison shows that our algorithm is relatively less affected by data sparseness: MCA-BTD achieves the lowest MAE and RMSE at every density, and its MAE rises only from 0.578 to 0.607 as the density falls from 90% to 70%, the smallest MAE increase among the compared algorithms. Compared with the abovementioned traditional algorithms, the MCA-BTD algorithm performs better under sparse conditions. In practice, user data are usually very sparse, so MCA-BTD has an advantage in practical applications.
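The sensitivity to sparsity can be checked directly from the values in Table 6. The short script below computes, for each algorithm, the increase in error when the density drops from 90% to 70%; the variable and function names are illustrative only, and the numbers are copied from Table 6.

```python
# Errors at densities (90%, 80%, 70%), copied from Table 6.
mae_table = {
    "MCA-BTD": (0.578, 0.592, 0.607),
    "RTRM": (0.598, 0.617, 0.645),
    "TF": (0.623, 0.653, 0.692),
    "SVD++": (0.728, 0.759, 0.782),
    "KNN-Baseline": (0.746, 0.811, 0.844),
    "NMF": (0.845, 0.864, 0.884),
}
rmse_table = {
    "MCA-BTD": (0.779, 0.796, 0.810),
    "RTRM": (0.808, 0.831, 0.842),
    "TF": (0.832, 0.868, 0.911),
    "SVD++": (0.940, 0.952, 0.992),
    "KNN-Baseline": (0.955, 1.021, 1.058),
    "NMF": (1.145, 1.135, 1.152),
}

def degradation(scores):
    """Error increase from the densest (90%) to the sparsest (70%) setting."""
    return {name: round(v[-1] - v[0], 3) for name, v in scores.items()}

for label, table in (("MAE", mae_table), ("RMSE", rmse_table)):
    print(label, degradation(table))
```

On the MAE, MCA-BTD shows the smallest increase (0.029, against 0.039 to 0.098 for the others). On the RMSE, NMF changes least in absolute terms (0.007) but from a much higher error level, with MCA-BTD next (0.031); MCA-BTD keeps the lowest error on both metrics at every density.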

5 Conclusions

This paper proposes a method that uses matrix factorization to fill in the missing values in the tensor and combines user trust and time information to predict user ratings. The trust relationship between users is first established on the existing data set, and the processed data set is then represented as a fourth-order tensor of users, items, trust relationships and time information. The tensor is sliced, and matrix decomposition is used to fill in the missing entries to reduce data sparsity. The BTD method is then used to decompose the filled tensor. Our experimental results show that the method improves the prediction of user preferences.

For future work, we plan to (1) extend the single-level user trust relationship to multi-level user trust to improve the accuracy of user preference prediction, and (2) add more user information, such as the user’s age and gender. Both are potential research directions for future prediction work.

Acknowledgments

This work was supported by the National Science Foundation of China under Grant No. 61867006, the Education Reform Project of Higher Education of Xinjiang Uygur Autonomous Region under a Study on the Application of Teaching Methods Combining MOOCs with a Flipping Classroom (No. 2018JG40), the Innovation Project of Sichuan Regional under Grant No. 2020YFQ2018, the Key Laboratory Open Project of Science & Technology Department of Xinjiang Uygur Autonomous Region under Research on Video Information Intelligent Processing Technology for Xinjiang Regional Security, and the Major Science and Technology Project of Xinjiang Uygur Autonomous Region under Grant No. 2020A03001.

References

1. Bobadilla, Ortega, Hernando and Gutiérrez, Recommender systems survey, Knowl Based Syst 46 (2013), 109–132.

2. Ren, Liang, Meij and de Rijke, Personalized time-aware tweets summarization, In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 513–522. ACM (2013).

3. Ren, Peetz M.-H., Liang, van Dolen and de Rijke, Hierarchical multilabel classification of social text streams, In: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 213–222. ACM (2014).

4. Goldberg, Nichols, Oki B.M. and Terry, Using collaborative filtering to weave an information tapestry, Communications of the ACM 35(12) (1992), 61–70.

5. Linden, Smith B.R. and York J.C., Amazon.com recommendations: item-to-item collaborative filtering, IEEE Internet Computing (2003), 76–80.

6. […], King and Lyu M.R., Mining web graphs for recommendations, IEEE Trans Knowl Data Eng 24 (2011), 1051–1064.

7. Zheng V.W., Zheng, Xie and Yang, Towards mobile intelligence: Learning from GPS history data for collaborative recommendation, Artif Intell 184 (2012), 17–37.

8. Prims O.T., Castrillo, Acosta M.C., Oriol M.V., Lorente A.S., Serradell, Cortés and Francisco J.D. R., Finding, analysing and solving MPI communication bottlenecks in Earth System models, J Comput Sci 36 (2019), 100864.

9. Jayaswal D.J., Context Relevancy Assessment in Tensor Factorization-based Recommender Systems, IEEE-ACCESS, 2020, pp. 141–145.

10. Koren, Collaborative filtering with temporal dynamics, In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 447–456, New York, NY (2009).

11. Tang, Chen and Khattak A.J., Personalized travel time estimation for urban road networks: A tensor-based context-aware approach, Expert Systems With Applications 103 (2018), 118–132.

12. […] H.-F., Rao and Dhillon I.S., Temporal regularized matrix factorization for high-dimensional time series prediction, In: Lee D.D., Sugiyama M., Luxburg U.V., Guyon I. and Garnett R. (eds.), Advances in Neural Information Processing Systems 29, pp. 847–855. Curran Associates, Inc (2016).

13. Rafailidis and Daras, The TFC model: Tensor factorization and tag clustering for item recommendation in social tagging systems, IEEE Trans Syst Man Cybern 43 (2012), 673–688.

14. Zhu, Hu and Caverlee, Fairness-Aware Tensor Based Recommendation, Conference on Information and Knowledge Management, 2018, pp. 1153–1162.

15. De Lathauwer, Decompositions of a higher-order tensor in block terms, Part II: Definitions and uniqueness, SIAM J Matrix Anal Appl (2008), 1033–1066.

16. Tang, Chen and Khattak A.J., Personalized travel time estimation for urban road networks: A tensor-based context-aware approach, Expert Systems With Applications 103 (2018), 118–132.

17. Sorber, Van Barel and De Lathauwer L., Structured data fusion, IEEE Journal of Selected Topics in Signal Processing 9(4) (2015), 586–600.

18. Austin, We Recommend a Singular Value Decomposition, Feature Column from the AMS. Posted August 2009.

19. Hidasi and Tikk, Fast ALS-based tensor factorization for context-aware recommendation from implicit feedback, In: ECML PKDD 2012, pp. 67–82. Springer, 2012.

20. Kolda T.G., Multilinear operators for higher-order decompositions, United States, Department of Energy, 2006.

21. Kolda T.G. and Bader B.W., Tensor decompositions and applications, SIAM Review 51(3) (2009), 455–500.

22. Rafailidis and Daras, The TFC model: Tensor factorization and tag clustering for item recommendation in social tagging systems, IEEE Trans Syst Man Cybern Syst 43 (2012), 673–688.

23. Symeonidis, Papadimitriou and Manolopoulos, Geo-social recommendations based on incremental tensor reduction and local path traversal, In: Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Location-Based Social Networks, Chicago, IL, USA, 27 Oct. 2011, pp. 89–96.

24. Ying, Chen and Chen, A temporal-aware POI recommendation system using context-aware tensor decomposition and weighted HITS, Neurocomputing 242 (2017), 195–205.

25. Ifada and Nayak, Tensor-based item recommendation using probabilistic ranking in social tagging systems, In: Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 Apr. 2014, pp. 805–810.

26. Kannan, Vempalat and Vetta, On clusterings – good, bad and spectral, In: IEEE Symposium on Foundations of Computer Science, 2000, pp. 367–377.

27. […], Cai and Niyogi, Tensor subspace analysis, In: Advances in Neural Information Processing Systems, MIT Press, 2005, pp. 499–506.

28. Harshman R.A., Foundations of the PARAFAC Procedure: Model and Conditions for an “explanatory” Multi-mode Factor Analysis, UCLA Working Papers in Phonetics 16 (1970), 1–84.

29. Kolda and Sun, Scalable tensor decompositions for multi-aspect data mining, In: Eighth IEEE International Conference on Data Mining (ICDM’08), 2008, pp. 363–372.

30. Aggarwal C.C., Recommender Systems, Cham: Springer International Publishing, 2016.

31. […], Wang and Qin, A Low-Rank Tensor Factorization Using Implicit Similarity in Trust Relationships, Symmetry 12 (2020), 439.

32. Jia and Qian, Constrained nonnegative matrix factorization for hyperspectral unmixing, IEEE Trans Geosci Remote Sens 47(1) (2009), 161–173.

33. Yadav, Kumar, Sinha and Nagpal, Trust aware recommender system using swarm intelligence, J Comput Sci 28 (2018), 180–192.

34. […], Gao and Du, NMF-based privacy-preserving recommendation algorithm, In: Proceedings of the 2009 First International Conference on Information Science and Engineering, Nanjing, China, 26–28 December 2009, pp. 754–757.

35. Jia, Zhang, Lu and Wang, Users’ brands preference based on SVD++ in recommender systems, In: Proceedings of the 2014 IEEE Workshop on Advanced Research and Technology in Industry Applications (WARTIA), Ottawa, ON, Canada, 29–30 September 2014.

36. Chen, Zhang, Lu and Chen K.L., SVDFeature: A toolkit for feature-based collaborative filtering, J Mach Learn Res 13 (2012), 3619–3622.

37. Maroulis, Boutsis and Kalogeraki, Context-Aware Point of Interest Recommendation using Tensor Factorization, IEEE-ACCESS (2016), pp. 963–968.

38. Wen, Jin and Yang, Research on personalized recommendation algorithm combined with time factor, IEEE-ACCESS (2017), pp. 1911–1915.

39. Shi, Wang and Qin, Extracting User Influence from Ratings and Trust for Rating Prediction in Recommendations, Scientific Reports 10(1) (2020), 13592.

40. Natarajan, Vairavasundaram, Natarajan and Gandomi A.H., Resolving data sparsity and cold start problem in collaborative filtering recommender system using linked open data, Expert Systems with Applications 149 (2020), 113248.

41. Mehta and Rana, A review on matrix factorization techniques in recommender systems, In: 2017 2nd International Conference on Communication Systems, Computing and IT Applications (CSCITA), pp. 269–274. IEEE, 2017.

42. Kolda T.G. and Bader B.W., Tensor Decompositions and Applications, SIAM Rev 51(3) (2009), 455–500.

43. Hwang W.S., Lee H.J., Kim S.W., Won and Lee M.S., Efficient recommendation methods using category experts for a large dataset, Inf Fusion 28 (2016), 75–82.

44. Campos P.G., Bellogín, Díez and Chavarriaga J.E., Simple time-biased KNN-based recommendations, In: Proceedings of the Workshop on Context-Aware Movie Recommendation, Barcelona, Spain, 30 Sep. 2010, pp. 20–23.

45. Karatzoglou, Amatriain, Baltrunas and Oliver, Multiverse recommendation: N-dimensional tensor factorization for context-aware collaborative filtering, In: Proceedings of the Fourth ACM Conference on Recommender Systems, Barcelona, Spain, 26–30 September 2010.

46. Symeonidis, User Recommendations based on Tensor Dimensionality Reduction, IFIP International Conference on Artificial Intelligence Applications & Innovations, ACM, 2008.

Table 1

Statistical information of the Ciao and MovieLens datasets

Dataset	Ciao	MovieLens-100K
Users	7,375	1,000
Items	99,746	1,700
User-item ratings	278,483	100,000
Rating scale	1–5	1–5
Density	0.038%	6.3%
User trust	111,781	51,959

Table 2

Validation metrics

Metric	Definition	Interpretation
MAE	$\mathrm{MAE} = \frac{1}{N} \sum_{(i, j, k) \in T} | D_{ijk} - \hat{D}_{ijk} |$	The smaller, the better
RMSE	$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{(i, j, k) \in T} (D_{ijk} - \hat{D}_{ijk})^{2}}$	The smaller, the better
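The two metrics in Table 2 can be computed as in the following minimal sketch. The function names `mae` and `rmse` are illustrative; the inputs are assumed to be the true and predicted ratings over the N test entries, flattened into one-dimensional sequences of equal length.

```python
import numpy as np

def mae(d_true, d_pred):
    """Mean absolute error over the test entries."""
    d_true = np.asarray(d_true, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    return float(np.mean(np.abs(d_true - d_pred)))

def rmse(d_true, d_pred):
    """Root mean squared error over the test entries."""
    d_true = np.asarray(d_true, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    return float(np.sqrt(np.mean((d_true - d_pred) ** 2)))

# Toy check: absolute errors are 0.5, 0, 1, 1.
print(mae([4, 3, 5, 2], [3.5, 3, 4, 3]))   # 0.625
print(rmse([4, 3, 5, 2], [3.5, 3, 4, 3]))  # 0.75
```

Because the RMSE squares each error before averaging, it penalizes large individual errors more heavily than the MAE, which is why the two metrics are reported together.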

Table 3

Comparison of the Tucker decomposition and BTD under the same characteristics

	RMSE	MAE
LTF-ISTR	0.8450	0.6511
CA-BTD	0.8284	0.6328