Abstract
With the rapid development and popularization of smart mobile devices, users tend to share the points-of-interest (POIs) they have visited online together with location information, which forms a location-based social network (LBSN). LBSNs contain a wealth of valuable information, including the geographical coordinates of POIs and the social connections among users. Many trust-enhanced approaches fuse users’ trust relationships with other auxiliary information to provide more accurate recommendations. However, in traditional trust-aware approaches, the embedding processes on graphs with different properties (e.g., the user-user graph is a homogeneous graph, whereas the user-POI graph is a heterogeneous graph) are independent of each other, and the resulting embeddings are fused directly without guidance, which limits performance. More effective information fusion strategies are therefore needed to improve trust-enhanced recommendation. To this end, we propose a
Keywords
Introduction
Owing to recent breakthroughs in mobile location capture technologies and the rapid development of Web 2.0, mobile users can share their travel experiences with friends on social platforms in real time. Users’ sharing activities and location data connect the social network with the physical world, and the system formed by these connections is usually called a location-based social network (LBSN). LBSNs integrate people’s offline visits with online check-in records and have become an important tool for users to post and share those records [1,2,3]. Users can share their feelings about the points-of-interest (POIs) they have visited (e.g., bars, hotels, shopping malls, and tourist attractions) with their friends on LBSNs such as Gowalla, Brightkite, and Foursquare [4]. A positive shared experience attracts more users to visit the corresponding POI. By utilizing past experiences, POI recommendation helps users find new and interesting locations and guides advertisers toward ad placements that reach their target audience. POI recommendation is therefore of great commercial and research value and is attracting increasing attention from scholars [5,6,7].
POI recommendation generates, for each target user, a list of POIs that the user is most likely to be interested in and visit. The recommendation is typically based on the user’s check-in records, POI location data, and social connections. Although LBSNs offer abundant information, this abundance also makes POI recommendation more challenging, especially when users’ social relationships are complex. Users’ social relationships represent the implicit trust between users and provide additional information that can improve the recommendation results [8]. Many studies have applied the concept of trust to POI recommendation. Researchers have proposed various trust-enhanced recommendation strategies on the premise that a user’s behavior is guided by trustworthy friends [9,10,11,12], and they show that utilizing trust relationships can improve performance. However, trust relationships are difficult to integrate because the social connections in real-world datasets are very sparse. To alleviate this sparsity, many methods fuse trust with other information, such as the location information of POIs, which improves the accuracy of trust-aware POI recommendation systems to a certain extent. Canturk et al. [13] created a trust-aware location recommendation framework that uses random walks to extract trust relationships and calculates a user’s overall trust score based on their check-in history. To exploit the rich contextual information in LBSNs, Zhang et al. [4] proposed a probabilistic diffusion-based context-enhanced approach that extracts the links of explicit and implicit trust between users and adds temporal influence to them.
Although these studies have improved recommendation performance, researchers are still exploring more efficient ways to fuse information. In [14], the authors pointed out that methods for mixing rating information with other auxiliary information fall into two categories, loosely coupled and tightly coupled, depending on whether there is a bidirectional interaction between the rating information and the other information. A loosely coupled approach processes the auxiliary information only once. The trust-aware methods mentioned above all integrate trust relationships with other information in a loosely coupled way, without using auxiliary information to guide the extraction of trust relationships. For example, Zhang et al. [4] applied probabilistic diffusion to calculate trust influence and geographic influence, and these two computations are independent of each other. The two kinds of information are then fused by a simple multiplication that does not account for their different degrees of influence. These two approaches obtain the embeddings of social relationships and other latent representations through certain techniques (such as probabilistic diffusion) and then fuse them together. For example, the user’s social relationships are embedded on the user-user graph, which is a homogeneous graph, whereas the user-POI relationships are embedded on the user-POI graph, which is a heterogeneous graph. The information embedded independently on graphs with different properties is fused directly in some way (such as concatenation) without guidance, which limits performance. Therefore, we want to develop a way to guide both the extraction of embeddings and their fusion more accurately.
Tightly coupled approaches can naturally balance the influence of different information by automatically learning features from the auxiliary information. Wang et al. [15] pointed out that tightly coupled approaches often perform better than loosely coupled approaches. Yin et al. [16] combined different types of auxiliary POI information with users’ personal preferences through two-way interaction of information and achieved good performance. This shows that a collaborative learning strategy is attractive for POI recommendation because it integrates different information more efficiently than traditional methods. Thanks to this two-way interaction, the collaborative learning strategy may help to integrate information from graphs with different properties in trust-enhanced POI recommendation and thereby improve recommendation performance.
To this end, we propose a novel
The contributions of our paper can be summarized as follows:
We propose the TECL model, which makes the first attempt to apply a collaborative learning strategy to trust-aware POI recommendation so that trust information is fused with other information more thoroughly, in a tightly coupled way. The model integrates two modules, a GAT-based graph autoencoder as the trust relationships embedding module and a multi-layer deep neural network as the user-POI graph learning module, which learn collaboratively through backpropagation of discrimination errors.
We use a Graph Attention Network (GAT) instead of a Graph Isomorphism Network (GIN) as the graph autoencoder to embed trust relationships. GAT is able to better aggregate neighborhood information and allows the model to dynamically determine the influence between nodes: each node attends to its neighbor nodes to update its latent representation and selects the most trustworthy neighbors when embedding trust relationships.
We demonstrate the performance of the proposed TECL method by comparing it with other state-of-the-art trust-enhanced approaches through various experiments on two large-scale real-world datasets.
The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 explains in detail how our proposed POI recommendation approach works. Section 4 presents the experimental results along with the corresponding analysis. Finally, Section 5 concludes the paper.
In this section, we review earlier studies on POI recommendation, collaborative deep learning based POI recommendation approaches, and trust-enhanced models in recommendation systems.
POI recommendation
The majority of POI recommendation approaches in the literature are based on collaborative filtering (CF) [18,28]. Because deep neural networks and graph neural networks have developed quickly, numerous studies have begun to use deep learning-based models for POI recommendation [11,16]. Chen et al. [29] designed a Convolutional Embedding Model (CEM) that uses traffic trajectory information, models the relative bearing of locations, and predicts next locations. Yang et al. [30] conducted flashback operations in recurrent neural networks and leveraged spatiotemporal information to explore past hidden states, achieving excellent performance in location prediction. Zhao et al. [31] introduced a next-POI recommendation approach that contains a state-based stacked RNN and a power-law attention mechanism, in which the semantic subsequences of POIs are automatically identified and the sequential patterns of POIs are discovered. Lian et al. [32] created a geography-aware recommendation model that uses a geography encoder based on the self-attention mechanism to represent the hierarchical gridding of GPS points, emphasizing the importance of informative negative samples and improving the use of geographic data. Zhang et al. [33] made use of regional information and proposed a session-based graph attention network (SGANet). In addition, knowledge graph, random walk, and matrix factorization methods have also been introduced into POI recommendation [34,20,21].
Collaborative deep learning POI recommendation
Wang et al. [14] proposed a collaborative deep learning strategy to address the data sparsity and cold-start problems that exist in CF-based POI recommendation approaches. The strategy extracts a pertinent deep representation from each instance. In POI recommendation, the method can additionally mine the latent features and implicit relationships between users (and locations) and merge them efficiently. To train hierarchically additive representations while incorporating heterogeneous features for deep feature representation learning, Yin et al. [16] introduced a collaborative deep learning model that mines POIs’ latent representations from heterogeneous features and users’ spatial-aware personal preferences from hierarchically additive representations. Molaei et al. [35] developed an end-to-end deep learning structure by incorporating latent social features into a collaborative filtering approach. The approach employs representation learning on the rating matrix to extract the hidden social features and utilizes a cascade tree forest based deep learning approach.
Trust-enhanced POI recommendation
As described in the Introduction, users’ trust relationships can enhance the effectiveness and reliability of a POI recommendation system. To boost the accuracy of recommendation results, numerous studies have employed trust relationships in POI recommendation in recent years. Some scholars proposed using explicit trust scores [10,12,36] and implicit trust scores [9,10,11] to quantify trust relationships. With the advancement of deep neural networks and graph neural networks, many scholars have combined them with trust-enhanced POI recommendation. Yang et al. [37] introduced two deep neural network modules, based on structural deep network and deep walk respectively, that combine embedded representations from social and geographic graph information to provide more accurate recommendations. To determine user similarity, Gao et al. [38] developed an integrated time-aware similarity calculation strategy based on collaborative filtering that takes into account three factors: vector space similarity, time-aware matrix factorization, and propagated trust. In addition, Zang et al. [39] proposed an LSTM that considers spatial-temporal decay together with a periodic attention mechanism that applies a discrete Fourier series to capture customized behavioral patterns by examining the vector representations of places and taking into account the variety in human behaviors.
As the work reviewed above shows, both collaborative deep learning and trust relationships can improve POI recommendation models. However, no study has used collaborative deep learning methods, which integrate information efficiently, in trust-enhanced POI recommendation. In our proposed work, we apply the CLIF [23] method to POI recommendation and improve the structure and parameter settings of the original model so that we can capture the trust relationships between users more deeply. The result is a collaborative deep learning recommendation model that efficiently integrates the rich information in both the user-POI graph and the users’ social graph.
Trust enhanced POI recommendation model with collaborative learning
Overview
In this section, we detail our proposed model and introduce the corresponding techniques. The entire architecture of our proposed model is shown in Fig. 1. In the trust relationships embedding module, users’ social graph is fed into a GAT-based graph encoder to generate trust embedding. Then, the latent representation of users is chosen as trust-enhanced latent features with the guidance of users’ trust embedding. By training the entire model from beginning to finish, discrimination errors are backpropagated into the trust relationships embedding module, allowing trust embeddings to be updated to accurately reflect each user’s unique discriminative features.

Demonstration of our proposed model. On one hand, User-POI graph learning module can extract the user’s deep features and select the latent representations under the direction of trust embedding generated by the trust relationships embedding module. On the other hand, the embedding process is enriched with trust-enhanced properties through backpropagation of discrimination errors.
Table 1 presents the notations used in this paper. Additionally, we introduce two key definitions as follows.
User Relationships Matrix is a matrix
POI Region Matrix is a matrix
Definitions of notations.
In LBSNs, users usually have rich personal information that can serve as input features. However, the Gowalla and Brightkite datasets [18] we used do not directly provide user characteristics. Therefore, we preprocess the check-in history to obtain the users’ feature matrix. First, we partition the POIs into M regions (the value of M is tested in Section 4) according to their geographical locations via K-means clustering. Then, we obtain the number of times each user visited each region from the check-in records and the POI region matrix L. Finally, the feature matrix
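The preprocessing steps described above can be illustrated with a minimal sketch. It is not the authors’ implementation: the function names `kmeans` and `user_feature_matrix`, the use of plain Euclidean distance on (latitude, longitude) pairs, and the toy data are all assumptions made for illustration. Because the final step of the construction is truncated in this text, the sketch stops at the raw per-region visit counts and applies no normalization.

```python
import random


def kmeans(points, m, iters=20, seed=0):
    """Plain Lloyd's k-means on (lat, lon) pairs; returns one region id per POI."""
    rng = random.Random(seed)
    centers = rng.sample(points, m)  # assumption: initialise from m random POIs
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(
                range(m),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(m):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels


def user_feature_matrix(checkins, poi_region, num_users, m):
    """Count, for every user, how many check-ins fall in each of the m regions."""
    features = [[0] * m for _ in range(num_users)]
    for user, poi in checkins:
        features[user][poi_region[poi]] += 1
    return features


# Toy example (hypothetical data): four POIs forming two obvious spatial groups.
poi_coords = [(40.70, -74.00), (40.71, -74.01), (40.80, -73.95), (40.81, -73.96)]
poi_region = kmeans(poi_coords, m=2)
checkins = [(0, 0), (0, 1), (0, 2), (1, 3), (1, 3)]  # (user id, POI id) pairs
features = user_feature_matrix(checkins, poi_region, num_users=2, m=2)
```

Each row of `features` has M entries, so the choice of M directly sets the dimensionality of the user features that the rest of the model consumes.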
Trust relationships embedding module
With the development of GNNs, GNN-based methods have been proposed for social recommendation and have been shown to achieve good performance thanks to the representation learning ability of GNNs [19]. Among the various GNN architectures, GAT (Graph Attention Network) [17] is widely used in recommendation systems, including POI recommendation [20,21,22], and helps to improve recommendation performance because it captures the information of each node’s neighbors well. Based on this ability to aggregate information efficiently along the edges of the graph, this paper uses a graph neural network with a multi-head attention mechanism [17] as an encoder for deeper extraction of trust relationships.
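To make the attention-based aggregation concrete, the following sketch shows one attention head in the standard GAT formulation of [17], followed by the concatenation of several heads. It is only an illustration under stated assumptions: the names `gat_head` and `multi_head`, the added self-loop, and the omission of the output nonlinearity are choices made here, and the exact update equations of this paper may differ. The LeakyReLU slope of 0.2 matches the value reported for the GAT layer in the parameter settings.

```python
import math


def leaky_relu(x, alpha=0.2):
    return x if x > 0 else alpha * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_head(H, neighbors, W, a, alpha=0.2):
    """One attention head. H[i] is user i's input vector; neighbors[i] lists i's friends."""
    Z = [matvec(W, h) for h in H]  # shared linear transform of every node
    out = []
    for i, nbrs in enumerate(neighbors):
        cand = [i] + [j for j in nbrs if j != i]  # assumption: include a self-loop
        # Unnormalised score e_ij = LeakyReLU(a^T [z_i || z_j]).
        scores = [
            leaky_relu(sum(c * v for c, v in zip(a, Z[i] + Z[j])), alpha) for j in cand
        ]
        # Softmax over the neighbourhood (shifted by the max for numerical stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        att = [e / total for e in exps]
        # Attention-weighted sum of the transformed neighbour vectors.
        out.append([sum(w * Z[j][d] for w, j in zip(att, cand)) for d in range(len(Z[i]))])
    return out


def multi_head(H, neighbors, heads):
    """Concatenate the outputs of several heads; heads is a list of (W, a) pairs."""
    per_head = [gat_head(H, neighbors, W, a) for W, a in heads]
    return [sum((h[i] for h in per_head), []) for i in range(len(H))]


# Toy example (hypothetical): three users, user 0 is friends with users 1 and 2.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1, 2], [0], [0]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, 0.0, 0.0]  # zero attention vector gives uniform weights
single = gat_head(H, neighbors, W, a)
double = multi_head(H, neighbors, [(W, a), (W, a)])
```

With a zero attention vector every neighbor receives the same weight, so the head reduces to mean aggregation; a learned attention vector lets each user weight friends unequally, which is the property the trust embedding relies on.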
As an initial step, we construct an initial embedding matrix of users
In Eq. (2),
Therefore, the update process can be described as:
After encoding, we need to apply a pairwise decoder which guarantees that the trust embeddings can well capture users’ trust relationships. The following is the objective function of this decoder:
In this module, we introduce trust embeddings to provide guidance and involve users’ associations into the learning and generating process of trust-enhanced latent features. Additionally, this module presents a link for the propagation of the discrimination error between the two modules. The purpose of the module is to create a specific mapping from the users’ feature matrix
We initially transform users’ feature matrix
For each user
In this section, we first introduce the experimental settings, including the datasets and evaluation metrics. Then, we select suitable parameters for our TECL approach. After that, we conduct ablation experiments to analyze whether the GAT autoencoder improves the recommendation performance of the proposed approach compared with the original GIN layer. Finally, we carry out comparative studies on the same datasets to compare TECL’s performance with state-of-the-art POI recommendation approaches.
Experimental setup
Datasets.
The Gowalla and Brightkite datasets, two well-known LBSN datasets, were employed in our experiments to evaluate the effectiveness of our proposed method. Both datasets were collected from real-world services, are distributed through the Stanford Large Network Dataset Collection [18], and are openly accessible. Along with users’ check-in information, the two datasets also include the friend links between users and the geographic information (latitude and longitude) of POIs. We conduct experiments on two typical and popular cities, i.e., New York and Los Angeles, to make more efficient use of the users’ social relationships and the geographic grouping information in the datasets. 70% of the data is used as the training set, and the remaining data is used as the test set. Table 2 shows the detailed statistics, where the degree of sparsity is derived from the corresponding check-in records.
We adopt three metrics commonly used for top-k recommendation to assess the effectiveness of our proposed model, namely precision, recall, and RMSE. Precision is the fraction of the POIs recommended by the approach that the target user actually visited, and recall is the fraction of the POIs the target user actually visited that appear among the recommended POIs. The greater the values of these two measures, the better the approach performs. The RMSE (Root Mean Square Error) measures the difference between the predicted value and the true value; the lower the value, the better the performance of the approach. Let
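The three metrics can be sketched as follows for a single user. The paper’s own formulas are not available in this text, so the code uses the common definitions of precision@k, recall@k, and RMSE; the function names and the toy inputs are hypothetical, and how the per-user values are aggregated over all users is left open.

```python
import math


def precision_recall_at_k(recommended, visited, k):
    """Precision@k and recall@k for one user.

    recommended: ranked list of POI ids produced by the model.
    visited: POI ids the user actually checked in at in the test set.
    """
    top_k = recommended[:k]
    hits = len(set(top_k) & set(visited))
    precision = hits / k
    recall = hits / len(visited) if visited else 0.0
    return precision, recall


def rmse(predicted, actual):
    """Root mean square error between predicted and true values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Toy example: 2 of the 4 recommended POIs were visited; the user visited 3 POIs.
p, r = precision_recall_at_k([1, 2, 3, 4], [2, 4, 9], k=4)
err = rmse([1.0, 2.0], [1.0, 4.0])
```

Because precision divides by k while recall divides by the number of visited POIs, increasing k tends to lower precision and raise recall.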
For the parameter settings, we refer to state-of-the-art deep learning models for POI recommendation. The dimensionality of the users’ feature matrix, namely the number of divided regions M, has a significant impact on how well our proposed approach performs. Here, the number of recommended POIs K is set to 10, and M is varied from 20 to 240 in steps of 20. Moreover, the dimension of the trust-enhanced latent features N is set to 512. The LeakyReLU functions in the GAT layer have an initial alpha of 0.2, and the other LeakyReLU activation functions in the user-POI graph learning module have a negative slope of 0.1. In addition, the trade-off parameter
Figure 2 shows the results under various M on the two datasets. Both precision and recall keep rising when M is relatively small. In the Gowalla dataset, both values continue to increase slowly until M = 160 and reach their highest point at M = 200. In the Brightkite dataset, both values decrease between M = 100 and M = 140, reach their maximum at M = 160, and decline after that. There are sudden decreases at M = 180 (Gowalla dataset) and M = 140 (Brightkite dataset). A possible explanation is that, for these particular numbers of regions, the frequencies with which users visit each region, that is, the users’ features, become less discriminative. For example, most users may visit the same regions with the same frequency, while some regions are not visited by most users. Beyond the highest point, the visit frequencies become even less discriminative as the number of regions increases. Therefore, we chose M = 200 and M = 160 for the Gowalla and Brightkite datasets, respectively.

Results under different M on New York in (a) Gowalla dataset and (b) Brightkite dataset.

Different embedding encoders on (a) New York and (b) Los Angeles in Gowalla dataset and (c) New York and (d) Los Angeles in Brightkite dataset.
As mentioned in Section 1, we replaced the original GIN layer with a GAT layer as the autoencoder that embeds trust relationships. We did so because Veličković et al. [17] point out that GAT is able to better aggregate neighborhood information, which allows the model to dynamically determine the influence between nodes. Other studies [26,27] have also pointed out the advantages of GAT over GIN. In GAT, each node attends to its neighbor nodes to update its latent representation, which is said to generalize the standard average pooling over neighbors used in GIN [26,27]. To analyze whether the GAT autoencoder in our proposed approach outperforms the original GIN layer in [23], we run the corresponding experiments on the two datasets. As shown in Fig. 3, when GAT is used to embed trust relationships, both precision and recall improve clearly over GIN for every number of recommendations on all datasets. On Los Angeles in the Gowalla dataset, the improvement in precision and recall is smaller than on New York, which may be because the social relationship data for Los Angeles is sparser than that for New York. On New York in the Brightkite dataset, introducing GAT yields a greater performance improvement than on Los Angeles, possibly because there is more social data per user in New York, as shown in Table 2. GAT’s ability to aggregate neighbor information brings greater improvement when data is sufficient and still performs well when data is sparse. This indicates that GAT extracts trust relationships from social connections more effectively than GIN in our proposed TECL approach.
Performance analysis
We compare our proposed TECL with several baselines, which include both classical approaches and recent, powerful approaches. The results demonstrate that our proposed model extracts and integrates trust relationships into POI recommendation more efficiently.

Comparisons on (a) New York and (b) Los Angeles in Gowalla dataset.

Comparisons on (a) New York and (b) Los Angeles in Brightkite dataset.

Comparisons of RMSE on different datasets.
UBCF (user-based collaborative filtering) [40]. UBCF computes the similarity for each pair of users and selects a neighborhood; the rating information from users in the neighborhood then becomes the basis for recommendations.
NBI (network-based inference) [41]. NBI generates personalized recommendations by introducing a novel weighting method and producing resource distribution via network-based resource-allocation dynamics to extract hidden information.
RTMF (real-time matrix factorization) [42]. RTMF utilizes matrix factorization to blend temporal influence with spatial influence and collect real-time information about POIs. It then mines the latent representations to produce suggestions.
GeoIE (geographical influence) [43]. GeoIE focuses on geographical location and models the unique geographical influence of POIs treating three factors to capture the asymmetry and high modifiability of geographical influence among different POIs.
TECF (trust-enhanced collaborative filtering) [11]. TECF merges similarity scores with users’ trust relationships via network representation learning based collaborative filtering, and integrates temporal and spatial effects into POI recommendation by a hybrid model. Furthermore, it generates recommendations by merging these factors.
CEPD (context-enhanced probabilistic diffusion) [4]. CEPD extracts social explicit as well as implicit trusts and integrates them with time influence via probabilistic diffusion, and then merges time-enhanced geographical diffusion to generate POI recommendations.
ImNext (irregular interval attention and multi-task learning) [22]. ImNext aims to solve the problem of irregular intervals in check-in sequences with a new irregular-interval attention module. In addition, it uses a graph attention network with integrated edge attention to handle hidden features and establishes multiple subtasks for joint learning.
To ensure the accuracy and fairness of the comparison experiments, we set the optimal parameters for each approach. RTMF performs best when the learning rate is set to 0.01 and the random fraction is set to 0.5. For GeoIE, the scaling parameter is set to 10 and the regularization coefficient is set to 0.02. According to [11], TECF gets the most accurate recommendation results when the weighting parameters of geographical and temporal influence are set to 0.2 and 0.1, respectively. For CEPD, the dimension of the latent factor is set to 50 for Gowalla and 70 for Brightkite. For ImNext, the size of the sliding window is set to 5, and the number of IrrAttention layers is set to 4.
Figures 4 and 5 show the comparison results on the various datasets. Precision falls as the number of recommendations grows, while recall shows the opposite trend. Traditional recommendation approaches that do not consider any auxiliary information, such as UBCF and NBI, have the lowest precision. GeoIE introduces geographic information into POI recommendation and thus achieves better accuracy. TECF and CEPD improve performance further because they incorporate trust relationships into recommendation generation while preserving geographic information, and ImNext gains additional performance from its attention mechanism. Nevertheless, our proposed TECL method has the best precision regardless of the number of recommendations, even when compared with ImNext, the latest and strongest baseline. For recall, the traditional UBCF and NBI approaches score lower than the other approaches and increase slowly. ImNext is again the strongest competitor of TECL; even so, TECL obtains the highest recall among all approaches.
The gap in recall and precision between the proposed model and the state-of-the-art methods (such as CEPD and ImNext) narrows as the number of recommendations increases. A possible reason is that, when the number of recommendations becomes larger, some users have visited fewer POIs than the number of recommended POIs, so the performance is mainly determined by the users who have actually visited more POIs than are currently recommended. Because the datasets are highly sparse, many users have visited only a few POIs. Therefore, the performance improvement of our proposed model over CEPD and ImNext becomes less obvious as the number of recommended POIs grows. Even at K = 12, where the precision and recall of ImNext are very close to ours, our proposed model retains a slight advantage. TECL achieves this performance because it applies collaborative deep learning to learn users’ latent representations efficiently and integrates a graph attention network to mine users’ trust relationships deeply. Trained end to end, our approach embeds trust relationships to guide the recommendation and optimizes the trust relationship mining process through error propagation, and it thereby generates more satisfying POI recommendations for target users.
Fig. 6 presents the RMSE comparison between our approach and the baselines. The RMSE decreases as the number of recommendations increases. Our proposed approach has the lowest RMSE for every number of recommendations, even compared with the state-of-the-art methods. The difference in RMSE between ImNext and CEPD is somewhat larger than the differences between the earlier methods, probably due to the introduction of graph attention networks: ImNext designed an irregular-interval attention module and used a graph attention network to learn hidden features of users and POIs. Our approach performs better than ImNext because TECL not only uses a graph attention network but also adopts an MLP network to extract users’ latent features. In addition, the collaborative learning strategy ensures that the learning processes of the two modules complement each other.

The objective convergence study on four datasets.
Figure 7 shows the values of the loss function while our proposed TECL model is trained with the number of recommendations K set to 10. In the first few epochs, the loss decreases rapidly; the decrease then slows, and the loss finally stabilizes. On the Gowalla – Los Angeles dataset, the loss declines more slowly than on the Gowalla – New York dataset, possibly because the user features on the New York dataset are more discriminative than those on the Los Angeles dataset, which is important for deep learning.
In this paper, to recommend POIs more effectively, we present a trust-enhanced multi-graph recommendation model that efficiently extracts and fuses trust relationships to strengthen POI recommendation via a collaborative learning strategy. We employ a graph attention network as a graph autoencoder to embed users’ trust relationships and generate users’ trust embeddings. Then, we merge users’ latent representations with the trust embeddings in a bidirectional way through backpropagation of discrimination errors. Finally, we use a classifier to generate the K recommended POIs at which a target user is most likely to check in. Experiments on different real-world datasets show that our proposed TECL method outperforms state-of-the-art approaches.
In the future, we will explore how to obtain and integrate the travelling distance between POIs, which is not contained in the original datasets. The travelling distance can reflect the true geographical distribution of POIs and helps to apply more suitable clustering methods (such as fuzzy c-means). We will also focus on extending our approach to handle distrust relationships, which are attracting growing interest in recommendation systems [24,25]. Besides, integrating temporal influence will be our next goal, as it is an active research topic and has an impact on performance.
Acknowledgments
This work was supported by the Natural Science Foundation of Jiangsu Province (No. BK20242084) and the National Natural Science Foundation of China (No. 62402496).
