Guided node graph convolutional networks for repository recommendation

Abstract

Knowledge graphs (KGs) have been widely used in recommender systems. Some nodes in a KG guide the occurrence of interaction behaviors; we call them guided nodes. However, existing KG-based approaches do not take guided nodes into account. We explore the utility of guided nodes in KGs and apply them to repository recommendation. In this paper, we propose an end-to-end framework, namely Guided Node Graph Convolutional Network (GNGCN), which effectively captures the connections between entities by mining the influence of related nodes. We sample neighbors of each entity in the KG as its guided nodes and then combine the information and bias of the guided nodes when computing the representation of a given entity. The guided nodes can be extended to multiple hops. We evaluate our model on a real-world GitHub dataset named Github-SKG and on a music recommendation dataset, and the experimental results show that the method outperforms the recommendation baselines and that our model is much lighter than the others.

Keywords

Repository recommendation, knowledge graphs, guided nodes, graph convolutional network, graph attention network

1. Introduction

Currently, users are overwhelmed by the overload of online information in Web applications ranging from search engines and e-commerce to social media sites and news portals. To address the information overload problem, recommender systems (RS) [32] are widely used to guide users to discover products or services of interest in a personalized manner from a large number of possible alternatives. Due to their importance in practice, recommender systems have been receiving attention from both industry and academic research communities. It is no exaggeration to say that almost all services that provide content to users are equipped with recommender systems. User preferences in RS are predicted primarily from widely available sources of user behavior data. Recommender systems have many application areas, such as movies, music, books, and news. Recently, recommendation research has expanded in new directions, including the recommendation of papers, algorithms, and APIs. Recommendation for GitHub repositories is one of these directions, but it has received little attention. The most recent work we found is [25], which investigates the automatic classification of GitHub repositories; the conceived approach is able to recommend GitHub topics.

The traditional recommendation technique is collaborative filtering (CF) [13, 14, 31], which is based on the assumption that people who have purchased similar goods in the past tend to make similar choices in the future [20]. However, CF-based approaches are usually limited in that they all assume that users have the same motivation to purchase goods, ignoring the fact that interactions in a real recommender system usually arise from a complex, heterogeneous process driven by the interplay of multiple underlying components. Researchers therefore often consider the relationships between entities to build feature-rich scenarios. User and item attributes are used to compensate for the sparse user-item interactions and cold-start problems of CF-based approaches and to improve the performance of recommendations [27, 8].

The user-item interaction is uniformly represented by the edges in the user-item bipartite graph, but users may have a variety of motivations for purchasing items. For example, some people like cost-effective goods, and others like an eye-catching appearance. The user-item interaction is not controlled by only one motivation, so treating all motivations indiscriminately inevitably loses valuable information. To account for the differences in motivations, recent studies [17, 29, 36, 37, 39] often use knowledge graphs (KGs) [7, 6] consisting of attributes and entities, where nodes correspond to entities (items or item attributes) and edges correspond to relationships. The knowledge graph offers rich semantic relatedness among nodes, its various relationships help to mine motivations rationally, and it also brings interpretability to the recommender system. It can thus capture more complex interaction features, reflect user preferences comprehensively, and provide more accurate recommendation cues.

Despite the advantages mentioned above, there is still considerable potential for utilizing KGs in RS. Inspired by real-world scenarios where users’ purchases of goods may be guided by salespeople, we consider whether such a guidance effect exists in knowledge graphs. Guidance means that users, because they are guided, are influenced to ignore the negative attributes of a product and focus on its positive attributes.

Our work is to make repository recommendations based on guided nodes. On GitHub today, most user actions consist of finding projects and asking questions under their favorite projects. Repository recommendation can facilitate the spread of projects and lets users acquire knowledge based on interest and not just on demand. In repository recommendation, guided nodes mean that a beginner will be guided by star users and star projects. In this regard, recommending repositories to beginners based on guided nodes is better than ordinary recommendation. The goal of our design is to tap the influence of user-related and item-related attributes and nodes to effectively capture the connections between entities, and we construct the model to explore the impact of this influence on recommendation effectiveness. In this paper, we propose the Guided Node Graph Convolutional Network (GNGCN) for repository recommendation. The core idea of GNGCN is to compute the representation of a given entity in the knowledge graph by non-uniformly sampling (importance sampling) its neighboring nodes and treating these nodes as guided nodes. This design has two advantages: (1) we consider the importance of nodes in the knowledge graph for guiding entities; (2) when constructing the knowledge graph for guided nodes, we add user-side knowledge graph attributes to enrich the information for recommendations. In a real knowledge graph, the number of neighbors varies from entity to entity and may be very large in some cases. Therefore, we set a fixed number of guided nodes for each node, which keeps the cost of GNGCN manageable. The set of guided nodes for a given entity can also be extended to multiple hops layer by layer. In this way, we build a higher-order entity dependency model that captures the potential long-distance interests and influence of users. Empirically, we evaluate GNGCN on a real-world dataset, Github-SKG. To show that our model applies more broadly, we also verify its effectiveness on the Last.FM (music) dataset. Experimental results show that our model outperforms the state-of-the-art baselines.

Our contributions in this paper are summarized as follows:

•
We propose the Guided Node Graph Convolutional Network (GNGCN), an end-to-end framework to better model real-world scenarios and explore the impact of guided nodes on repository recommendation. GNGCN captures the higher-order personalized interests of users by learning the graph representation of each entity in the KG.
•
We demonstrate the superiority of our model through extensive experiments. The results show that GNGCN outperforms the state-of-the-art baselines.

Figure 1.
An example of the knowledge graph.

2. Related work

2.1 Collaborative knowledge graph

In recommendation scenarios, we usually have historical user-item interactions (e.g., purchases and clicks). Here we represent the interaction data as a user-item bipartite graph. In addition to interactions, we have additional information about items (e.g., item attributes and external knowledge). Typically, such auxiliary data is organized as a knowledge graph of real-world entities. Conceptually, our approach is influenced by the collaborative knowledge graph (CKG) [30], which is defined by Wang et al. and encodes user behavior and item knowledge into a unified relational graph. CKG first represents each user behavior as a triple, where the edge of the triple represents the additional interaction relation between a user and an item, and then seamlessly integrates the user-item graph with the knowledge graph into a unified graph based on the alignment of item-entity pairs.

CKG adds the user-item bipartite graph to the item-side knowledge graph to build a new knowledge graph, but it does not build a knowledge graph on the user side, which means that a lot of valuable user-side information is not utilized. The high-order connectivity between some users and items in the topology also becomes harder to obtain, and part of the similarity and preference information between user vectors and item vectors is lost. We therefore extend the CKG with a user-side knowledge graph. Our knowledge graph is a unified graph that includes a user-side knowledge graph, an item-side knowledge graph, and the user-item bipartite graph.

We show part of the structure of the knowledge graph we built in Fig. 1. Many real-world websites and platforms have the network structure shown in Fig. 1; the most obvious are social networks, and others include shopping sites and knowledge-sharing platforms.

2.2 Graph convolutional network and graph attention network

Our method is conceptually inspired by GCN. Graph convolutional networks [4, 16] have been attracting considerable attention lately because of their remarkable success in various graph analysis tasks. The early attempts [4, 16] to derive a graph convolutional layer are based on graph spectral theory, graph Fourier transformation [24] in particular. Polynomial spectral filters were then used to greatly reduce the computational cost [9], and the use of a linear filter brought further simplification [18]. Along with spectral graph convolution, directly performing graph convolution in the spatial domain has also been investigated [10, 1]. Later, the attention mechanism [2] was employed to adaptively assign weights to the neighbors of a node when performing spatial convolution. DisenGCN [22] was proposed to learn disentangled node representations; it employs a novel neighborhood routing mechanism to find the factor that may have caused the edge from a given node to one of its neighbors. However, DisenGCN is a homogeneous graph representation learning method and does not distinguish the different importance of the latent components.

The traditional graph convolutional network does not pay attention to the interaction between nodes when it aggregates vector representations and captures the high-order connectivity of the topology. This interaction is valuable for improving the effectiveness of recommendations.

The GCN used in our model is to biasedly aggregate and merge neighborhood information when computing the representation of a given entity in KG. This design has two advantages over traditional GCNs: (1) The local neighborhood structure is successfully captured and stored in each entity by the neighborhood aggregation operation. (2) Neighborhoods are weighted according to connected relationships and user-specific ratings, which reflect both the semantic information of the KG and the user’s personalized interest in the relationship.

Graph Attention Network (GAT) [26] can effectively improve the effectiveness of recommendations and remains an active research direction; many papers [30, 33, 34, 38, 11] build on GAT.

Our method also connects to GAT. GAT uses an attention mechanism to reweight the existing edges of the given graph. Since the topological structure of the graph is not changed, the model is prone to be affected by noisy data when edges are sparse. While GAT reduces immediate neighbors iteratively to explore the graph structure in a breadth-first approach that processes all nodes and edges at each step, our method uses multi-step neighborhood samples to explore the graph structure.

2.3 Repository recommendation

There are few recommendation methods for GitHub repositories, but other work on GitHub also brings us a lot of inspiration, for example, sentiment analysis of GitHub repositories [35], natural language processing [21], prediction [40], and exploring developer influence [3]. The most recent model for GitHub repository recommendation we found is the Multinomial Naïve Bayesian Network (MNB) [25], but our focus is on a different entry point. That work investigates the application of MNB to automatically classify GitHub repositories. By analyzing the README file(s) of the repository to be classified and the source code implementing it, the conceived approach is able to recommend GitHub topics.

In terms of recommendation, the MNB method is a kind of fuzzy recommendation. It uses text as its main source of information, and because information is lost during natural language processing, it is difficult to fully express the user’s preference. It can therefore only recommend topics to users in a vague manner, whereas our model recommends specific repositories.

We use the knowledge graph to construct a representation of part of the GitHub community and make repository recommendations based on it. Compared with MNB’s approach, our recommendation model uses much richer information. Our recommendations are based on user behavior and interests, which is more in line with personalization.

Figure 2.

Illustration of the proposed GNGCN model. (a) The framework of GNGCN. (b) The example of two nodes, user node $u_{1}$ and item node $i_{3}$ , is given to illustrate the multi-hop propagation of GNGCN between neighbors.

3. The proposed model

In this section, we introduce the proposed Guided Node Graph Convolutional Network model. We first formulate the knowledge graph-based recommendation problem. We then give the design of a single GNGCN layer. Finally, we present the complete learning algorithm of GNGCN. The overall framework of our model is given in Fig. 2a, and Fig. 2b illustrates the extension of the model to multiple hops.

Figure 2 shows the structure of a complete recommendation system, which can be used not only for our repository recommendations but also for areas with a similar knowledge graph structure, such as social networks. In social networks, users are likely to be influenced by key users, and the structure of social networks matches our knowledge graph very well, so such communities are also an ideal environment for applying our model.

3.1 Task description

We formulate the knowledge-graph-aware recommendation problem as follows. In a typical recommendation scenario, we have a set of users $\mathcal{U}$ of size $M$ , $\mathcal{U}=\{u_{1},u_{2},\ldots,u_{M}\}$ , and a set of items $\mathcal{I}$ of size $N$ , $\mathcal{I}=\{i_{1},i_{2},\ldots,i_{N}\}$ . The user-item interaction matrix $Y\in\mathbb{R}^{M\times N}$ is defined according to users’ implicit feedback, where $y_{ui}=1$ indicates that user $u$ engages with item $i$ , such as clicking, browsing, or purchasing; otherwise $y_{ui}=0$ . Additionally, we have a knowledge graph $\mathcal{G}$ , which is comprised of entity-relation-entity triples $(h,r,t)$ . Here $h\in\mathcal{E}$ , $r\in\mathcal{R}$ , and $t\in\mathcal{E}$ denote the head, relation, and tail of a knowledge triple, and $\mathcal{E}$ and $\mathcal{R}$ are the sets of entities and relations in the knowledge graph, respectively. Our knowledge graph incorporates the user-item bipartite graph and a collection of user-side attribute entities and relationships, i.e. $u\in\mathcal{E}$ and $i\in\mathcal{E}$ , which helps to better explore the hidden connectivity between users and items. For example, if user $u\in\mathcal{U}$ follows entity $e\in\mathcal{E}$ , so that $e$ is an attribute entity of user $u$ , and $e$ is also the producer of item $i\in\mathcal{I}$ , so that $e$ is at the same time an attribute entity of item $i$ , then a hidden connection between $u$ and $i$ is constructed.

Several examples of such hidden connections are as follows.

•
$u_{1}\to e_{2}\to\{i_{1},i_{3}\}$ ,
•
$u_{1}\to e_{2}\to u_{3}\to\{i_{4},i_{5}\}$ ,
•
$u_{1}\to i_{1}\to e_{1}\to\{i_{2},i_{6}\}$ ,
•
$u_{1}\to i_{1}\to e_{1}\to i_{2}\to\{u_{3},u_{4}\}$ .

For example, $u_{1}\to e_{2}\to\{i_{1},i_{3}\}$ means that user $u_{1}$ connects with entity $e_{2}$ and entity $e_{2}$ connects with items $i_{1}$ and $i_{3}$ . Thus, user $u_{1}$ can reach items $i_{1}$ and $i_{3}$ through entity $e_{2}$ .
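The path notation above can be made concrete with a small sketch. The toy edge list below is hypothetical (it is chosen only so that the first listed path can be reproduced), and the helper names `build_adjacency` and `items_via_entity` are ours, not part of the paper.

```python
from collections import defaultdict

# Hypothetical toy graph: undirected edges between users (u*),
# attribute entities (e*), and items (i*).
EDGES = [
    ("u1", "e2"), ("e2", "i1"), ("e2", "i3"),
    ("e2", "u3"), ("u3", "i4"), ("u3", "i5"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def items_via_entity(adj, user):
    """Items reachable along a path user -> attribute entity -> item."""
    found = set()
    for mid in adj[user]:
        if mid.startswith("e"):
            found.update(n for n in adj[mid] if n.startswith("i"))
    return found

adj = build_adjacency(EDGES)
hidden_items = sorted(items_via_entity(adj, "u1"))
print(hidden_items)  # ['i1', 'i3']
```

Longer paths such as $u_{1}\to e_{2}\to u_{3}\to\{i_{4},i_{5}\}$ follow the same pattern with one more hop.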

Given the user-item interaction matrix $Y$ as well as the knowledge graph $\mathcal{G}$ , our aim is to represent user $u$ and item $i$ by purposefully sampling the neighboring nodes of entities and representing them as such. We also aim to predict whether user $u$ has a potential interest in item $i$ with which he or she has had no interaction before. Our goal is to learn a prediction function $\hat{y}_{ui}=\mathcal{F}(u,i|\Theta,Y,\mathcal{G})$ , where $\hat{y}_{ui}$ denotes the probability that user $u$ will engage with item $i$ , and $\Theta$ denotes the model parameters.

Figure 3.
An example of a single-layer GNGCN structure. In this example a fixed sample of three neighbor nodes of item $i$ is taken.

3.2 GNGCN layer

We first describe a GNGCN layer in this subsection. The structure of a single GNGCN layer is shown in Fig. 3. A single GNGCN layer performs only one convolution operation and aggregates neighborhood information once. As a self-contained convolutional layer, this structure can also replace the graph convolutional operation in other models, and it can be used for representation learning on knowledge graphs with a similar structure, for example, to predict user behavior in social networks. Aggregating neighborhood information makes the representation vector more distinctive, so that it better expresses preference and better captures the high-order connectivity of the topology between two vectors. A single GNGCN layer first samples the neighbors of the incoming nodes by importance.

In the beginning, a user node $u$ and an item node $i$ enter the first GNGCN layer. We use $N^{e}$ to denote the set of entities directly connected to the entity node $u$ or $i$ that enters the GNGCN layer. Sampling yields the embedding representations of a specified number of neighbors of the incoming node. We then compute the attention score $\pi$ and multiply it with the embedding of each neighbor $e$ . Finally, the embedding representation of $i$ is aggregated with the embedding representations of its neighbors to update the embedding of the incoming node $i$ , giving $i^{\prime}$ .

In a real-world knowledge graph, the size of $N^{e}$ can be very large, and not all neighbor nodes are useful for user $u$ or item $i$ . Instead of using all the neighbors of an entity, we draw for each entity a fixed-size set of its highest-ranked neighbors, which keeps the computational pattern of each batch fixed and more efficient. After this importance sampling, we obtain the neighborhood of each entity node. We use $N_{\textit{sample}}^{e}$ to denote the set of sampled neighbors.

The sampling process is defined as follows:

$\displaystyle N_{\textit{sample}}^{e}=\{e|e\in N^{e},\textit{degree}(e)\in D\},$ (1)

where $e\in N^{e}$ is an entity directly connected to the node $u$ or $i$ , $\textit{degree}(e)$ denotes the degree of entity $e$ , and $D$ denotes the set of the $n$ highest degrees among these neighbors, where $n$ is a configurable parameter.
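As a minimal sketch of Equation (1), the function below keeps the $n$ neighbors with the highest degree. The function name `sample_guided_nodes`, the toy degrees, and the tie-breaking rule (by node id, so that exactly $n$ nodes are returned when degrees are equal) are assumptions made for illustration; the paper does not specify how ties are handled.

```python
def sample_guided_nodes(neighbors, degree, n):
    """Return the n neighbors with the highest degree.

    Ties are broken by node id (an assumption) so the result always
    has a fixed size of at most n.
    """
    ranked = sorted(neighbors, key=lambda e: (-degree[e], e))
    return ranked[:n]

# Hypothetical degrees for four neighbors of one node.
degree = {"e1": 5, "e2": 12, "e3": 1, "e4": 7}
guided = sample_guided_nodes(["e1", "e2", "e3", "e4"], degree, n=3)
print(guided)  # ['e2', 'e4', 'e1']
```

Fixing $n$ keeps the per-node cost constant regardless of how many neighbors an entity has in the full graph.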

We use $r_{e_{v},e_{j}}$ to denote the relation between entity $e_{v}$ and $e_{j}$ . Then we use an inner product function $g$ to calculate the attention score between a user $u$ and a relation $r$ . The attention score between an item $i$ and a relation $r$ is calculated in the same way.

$\displaystyle\pi_{r_{i,e}}^{u}=g(u,r_{i,e}),$ (2)

$\displaystyle\pi_{r_{u,e}}^{i}=g(i,r_{u,e}),$ (3)

where $\mathbf{u}\in\mathbb{R}^{d}$ , $\mathbf{i}\in\mathbb{R}^{d}$ and $\mathbf{r}\in\mathbb{R}^{d}$ are the representations of user $u$ , item $i$ and relation $r$ , respectively, $d$ is the dimension of representations. In general, $\pi_{r_{i,e}}^{u}$ characterizes the importance of relation $r$ to user $u$ and $r_{i,e}$ represents the relation between item $i$ and entity $e$ . For example, one user is concerned with the development language of a Github item, and another user is more concerned with the application area of a Github item. $\pi_{r_{u,e}}^{i}$ characterizes the importance of relation $r$ to item $i$ and $r_{u,e}$ represents the relation between user $u$ and entity $e$ . For example, a user follows a star user who is the author of this item, and another user’s favorite item is in the same direction as this item.

To characterize the topological proximity structure of item $i$ , we compute the linear combination of $i$ ’s neighborhood:

$\displaystyle i_{N_{\textit{sample}}^{e}}^{u}=\sum_{e\in N_{\textit{sample}}^{e}}\tilde{\pi}_{r_{i,e}}^{u}e,$ (4)

where $e$ is the representation of an entity and $\tilde{\pi}_{r_{i,e}}^{u}$ is the normalized user-relation score:

$\displaystyle\tilde{\pi}_{r_{i,e}}^{u}=\frac{\exp(\pi_{r_{i,e}}^{u})}{\sum_{e^{\prime}\in N_{\textit{sample}}^{e}}\exp(\pi_{r_{i,e^{\prime}}}^{u})}.$ (5)

User-relation scores act as personalized filters when computing an entity’s neighborhood representation since we aggregate the neighbors with bias with respect to these user-specific scores. The linear combination representation of the neighborhood of user $u$ is similar to the linear combination representation of the neighborhood of item $i$ , and the item-relation score is calculated in the same way as the user-relation score.
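Equations (2), (4), and (5) can be traced with a short sketch in plain Python. It assumes $g$ is the inner product (as stated later in the experimental setup); the function names and the two-dimensional toy vectors are illustrative only.

```python
import math

def dot(a, b):
    """Inner product, used here as the scoring function g."""
    return sum(x * y for x, y in zip(a, b))

def neighborhood_representation(user, relations, entities):
    """Weight each sampled neighbor by its normalized user-relation score.

    relations[k] is the embedding of the relation linking the item to
    entities[k]; all vectors share the same dimension d.
    """
    scores = [dot(user, r) for r in relations]            # Eq. (2)
    shift = max(scores)                                   # for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]                   # Eq. (5)
    d = len(user)
    combined = [sum(w * e[k] for w, e in zip(weights, entities))
                for k in range(d)]                        # Eq. (4)
    return weights, combined

# Toy example: the user vector aligns with the first relation only.
user = [1.0, 0.0]
relations = [[1.0, 0.0], [0.0, 1.0]]
entities = [[1.0, 0.0], [0.0, 1.0]]
weights, combined = neighborhood_representation(user, relations, entities)
print(weights, combined)
```

In this toy case the first neighbor receives the larger weight because its relation has the larger inner product with the user vector, which is the personalized-filter effect described above.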

The final step of a GNGCN layer is to aggregate the entity representation $i$ and its neighborhood representation $i_{N_{\textit{sample}}^{e}}^{u}$ into a single vector. In GNGCN we implement an aggregator $\textit{agg}:\mathbb{R}^{d}\times\mathbb{R}^{d}\to\mathbb{R}^{d}$ that takes the sum of the two representation vectors and then applies a nonlinear transformation:

$\displaystyle\textit{agg}=\sigma(W\cdot(i+i_{N_{\textit{sample}}^{e}}^{u})+b).$ (6)

Here $W$ and $b$ are the trainable weight and bias, and $\sigma$ is a nonlinear activation function. This aggregator is also applied to the aggregation of users with their neighboring nodes. Aggregation is a key step in GNGCN because the representation of an item or user in GNGCN is bound to its neighbors by means of aggregation.
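A minimal sketch of the sum aggregator in Equation (6) follows. The weight matrix is written as a list of rows, ReLU is used as $\sigma$, and the identity matrix and zero bias in the example are placeholder values rather than learned parameters.

```python
def relu(x):
    return max(0.0, x)

def sum_aggregate(item, neigh, W, b, act=relu):
    """Compute act(W . (item + neigh) + b) for plain Python lists."""
    summed = [a + c for a, c in zip(item, neigh)]
    return [act(sum(w * s for w, s in zip(row, summed)) + bias)
            for row, bias in zip(W, b)]

# Placeholder parameters: identity weight matrix and zero bias.
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
updated = sum_aggregate([1.0, -2.0], [0.5, 0.5], W, b)
print(updated)  # [1.5, 0.0]
```

The second component is clipped to zero by ReLU because the summed value there is negative.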

3.3 Learning algorithm

Through a single GNGCN layer, the final representation of an entity depends on itself and on the neighboring nodes directly connected to it; this is the first-order embedding representation of the entity. The 1st-order representation is obtained by aggregating the initial representation of each entity (the 0th-order representation) with the initial representations of its neighboring nodes. Repeating this process yields the 2nd-order representation, i.e., the 1st-order representations of an entity and of its neighboring nodes are aggregated to obtain the 2nd-order representation of that entity, as shown in Fig. 2b. In this way, we complete the propagation in the knowledge graph.

It is worth mentioning that when we calculate the attention score of item $i$ on relation $r$ , we use the embedding representation of the n-th order of item $i$ after propagation aggregation. Equation (3) is changed as follows:

$\displaystyle\pi_{r_{u,e}}^{i^{\prime}}=g(i^{\prime},r_{u,e}),$ (7)

where $i^{\prime}$ is the n-th order embedding representation of item $i$ .

Algorithm 3.3 Learning Algorithm of GNGCN

Input: interaction matrix $Y$ ; knowledge graph $\mathcal{G}(h,r,t)$ ; neighborhood set $N^{e}$ ; the set of $n$ highest degrees: $D_{n}$ ; degree of $e$ : $\textit{degree}(e)$ ; trainable parameters: $\{e\}_{e\in\mathcal{E}},\{r\}_{r\in\mathcal{R}},\{W_{i},b_{i}\}_{i=1}^{H}$ ; hyper-parameters: $H,d,g(\cdot),f(\cdot),\sigma(\cdot),\textit{agg}(\cdot)$
Output: Prediction function $\mathcal{F}(u,i;\Theta,Y,\mathcal{G})$

1: Initialize all parameters.
2: while GNGCN has not converged do
3:   for $(u,i)$ in $Y$ do
4:     Using Equation (1), obtain the receptive field $\{A[v]\}_{v=0}^{H}$ of $i$ and $\{B[v]\}_{v=0}^{H}$ of $u$ recursively $H$ times.
5:     for $h=1,\ldots,H$ do
6:       for $e\in A[h]$ do
7:         $e_{N_{\textit{sample}}^{e}}^{u}[h-1]\longleftarrow\sum_{e^{\prime}\in N_{\textit{sample}}^{e}}\tilde{\pi}_{r_{e,e^{\prime}}}^{u}e^{\prime u}[h-1]$ .
8:         $e^{u}[h]\longleftarrow\textit{agg}(e_{N_{\textit{sample}}^{e}}^{u}[h-1],e^{u}[h-1])$ .
9:       end for
10:     end for
11:     $i^{\prime}\longleftarrow e^{u}[H]$ .
12:     for $h=1,\ldots,H$ do
13:       for $e\in B[h]$ do
14:         $e_{N_{\textit{sample}}^{e}}^{i^{\prime}}[h-1]\longleftarrow\sum_{e^{\prime}\in N_{\textit{sample}}^{e}}\tilde{\pi}_{r_{e,e^{\prime}}}^{i^{\prime}}e^{\prime i^{\prime}}[h-1]$ .
15:         $e^{i^{\prime}}[h]\longleftarrow\textit{agg}(e_{N_{\textit{sample}}^{e}}^{i^{\prime}}[h-1],e^{i^{\prime}}[h-1])$ .
16:       end for
17:     end for
18:     $u^{\prime}\longleftarrow e^{i^{\prime}}[H]$ .
19:     Calculate predicted probability $\hat{y}_{ui}=f(u^{\prime},i^{\prime})$ .
20:     Update parameters by gradient descent.
21:   end for
22: end while
23: Return: Prediction function $\mathcal{F}(u,i;\Theta,Y,\mathcal{G})$

The formal description of the above steps is presented in Algorithm 3.3. $H$ denotes the maximum depth of the receptive field (or equivalently, the number of aggregation iterations), and a suffix $[h]$ attached to a representation vector denotes $h$ -order. The receptive field of a node is the set of its neighbors collected recursively up to $H$ hops. For a given user-item pair $(u,i)$ (line 3), we first calculate the receptive field $A$ of $i$ and $B$ of $u$ in an iterative layer-by-layer manner (line 4). Then the aggregation is repeated $H$ times (line 6): in iteration $h$ , we calculate the neighborhood representation of each entity $e\in A[h]$ (line 7) and $e\in B[h]$ (line 14), then aggregate them with their own representations $e^{u}[h-1]$ and $e^{i^{\prime}}[h-1]$ to obtain the ones to be used at the next iteration (lines 8, 15). The final $H$ -order entity representation is denoted as $i^{\prime}$ (line 11), which is fed into a function $f:\mathbb{R}^{d}\times\mathbb{R}^{d}\to\mathbb{R}$ together with the final $H$ -order representation of the user $u^{\prime}$ (line 18) for predicting the probability:

$\displaystyle\hat{y}_{ui}=f(u^{\prime},i^{\prime}).$ (8)
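The multi-hop propagation can be sketched recursively. This is a simplified illustration under several assumptions, not the authors' implementation: $W$ is taken as the identity and $b$ as zero, $g$ and $f$ are inner products, tanh is used at every layer, the `guided` map already contains the sampled guided nodes, and recursion replaces the layer-by-layer receptive-field loop of the algorithm. All names and toy embeddings are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def propagate(entity, query, h, emb, rel_emb, guided, act=math.tanh):
    """h-order representation of `entity`, personalized for `query`.

    guided[entity] lists (relation, neighbor) pairs restricted to the
    sampled guided nodes. Entities without guided nodes keep their
    initial embedding (a simplification).
    """
    if h == 0 or not guided.get(entity):
        return emb[entity]
    own = propagate(entity, query, h - 1, emb, rel_emb, guided, act)
    weights = softmax([dot(query, rel_emb[r]) for r, _ in guided[entity]])
    neigh = [0.0] * len(own)
    for w, (_, nb) in zip(weights, guided[entity]):
        nb_vec = propagate(nb, query, h - 1, emb, rel_emb, guided, act)
        neigh = [n + w * x for n, x in zip(neigh, nb_vec)]
    return [act(o + n) for o, n in zip(own, neigh)]

# Toy data: one user, one item, one shared attribute entity.
emb = {"u1": [0.1, 0.2], "i1": [0.3, -0.1], "e1": [0.5, 0.5]}
rel_emb = {"r": [1.0, 0.0]}
guided = {"i1": [("r", "e1")], "u1": [("r", "e1")]}

H = 1
i_prime = propagate("i1", emb["u1"], H, emb, rel_emb, guided)  # item side, scored by u
u_prime = propagate("u1", i_prime, H, emb, rel_emb, guided)    # user side, scored by i'
score = dot(u_prime, i_prime)                                  # f(u', i') as inner product
print(i_prime, u_prime, score)
```

The item side is propagated first with the user vector as the query, and the user side is then propagated with the updated item vector $i^{\prime}$ as the query, mirroring the order of the two loops in the algorithm.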

Note that Algorithm 3.3 traverses all possible user-item pairs (line 3). To make computation more efficient, we use a negative sampling strategy during training. The complete loss function is as follows:

$\displaystyle\mathcal{L}=\sum_{u\in\mathcal{U}}\left(\sum_{i:y_{ui}=1}\mathcal{J}(y_{ui},\hat{y}_{ui})-\sum_{v=1}^{T^{u}}\mathbb{E}_{i_{v}\sim P(i_{v})}\mathcal{J}(y_{ui_{v}},\hat{y}_{ui_{v}})\right)+\lambda\parallel\mathcal{F}\parallel_{2}^{2},$ (9)

where $\mathcal{J}$ is the cross-entropy loss, $P$ is a negative sampling distribution, and $T^{u}$ is the number of negative samples for user $u$ . In this paper, $T^{u}=|\{i:y_{ui}=1\}|$ and $P$ follows a uniform distribution. The last term is the L2-regularizer.
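The negative sampling rule and the per-sample loss $\mathcal{J}$ can be sketched as follows. The sketch assumes sampling without replacement from the items a user has not interacted with and clips predictions to avoid $\log 0$; both choices, and the function names, are assumptions for illustration. How the per-sample terms are combined into the full objective follows Equation (9) and is not reproduced here.

```python
import math
import random

def sample_negatives(positives, all_items, rng):
    """Draw T^u = |positives| items uniformly from the non-interacted items."""
    candidates = sorted(set(all_items) - set(positives))
    k = min(len(positives), len(candidates))
    return rng.sample(candidates, k)

def cross_entropy(y, y_hat, eps=1e-12):
    """Binary cross-entropy J(y, y_hat) for a single user-item pair."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))

rng = random.Random(0)  # fixed seed so the example is repeatable
positives = ["i1", "i3"]
negatives = sample_negatives(positives, ["i1", "i2", "i3", "i4", "i5"], rng)
print(negatives, cross_entropy(1, 0.5))
```

Setting the number of negatives equal to the number of positives gives each user a balanced set of labels.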

4. Experiments

In this section, we evaluate GNGCN on two real-world scenarios: Github-SKG and Last.FM.

4.1 Datasets construction

We utilize the following two datasets in our experiments for Github and music recommendation, respectively:

•
GitHub-SKG contains information about 2681 Github repositories and 4245 users who interact with these repositories.
•
Last.FM contains musician listening information from a set of 2 thousand users from Last.FM online music system.

Since both datasets consist of explicit feedback, we transform them into implicit feedback: each entry is marked with 1 to indicate that the user has rated the item positively, and for each user we sample a set of items without interaction and mark them as 0.

The construction of the Github-SKG dataset predates our GNGCN model; in other words, our model is inspired by the Github-SKG dataset we built. The dataset itself was motivated by our other work [5], which obtained information about more than 2000 algorithm papers. Using these papers as seeds, we obtained their relevant information on GitHub to build a knowledge base. This information includes the development language of each GitHub repository, its related domain, and its contributors, and it constitutes the knowledge graph on the item side. Moreover, we obtained information about the users who collected these repositories to construct the user-item bipartite graph and the user-side knowledge graph. The user-side information contains the famous users followed by the users who collected the repositories. We numbered the users, items, and entities in a unified way and added the user-item bipartite graph to get the final knowledge graph. We excluded the interaction data of users who appeared fewer than three times in the entire interaction set, since the behavioral value of these users is lower. This information was obtained mainly through the API provided by GitHub. Our dataset can still be improved: the information in the knowledge graph can continue to be enriched, although this may rely on external sources, and the size of the dataset can be further expanded.
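The filtering step for infrequent users can be illustrated with a short sketch. The pair layout `(user, item)` and the function name are assumptions; the threshold of three comes from the text.

```python
from collections import Counter

def drop_rare_users(interactions, min_count=3):
    """Keep only interactions of users who appear at least min_count times."""
    counts = Counter(user for user, _ in interactions)
    return [(u, i) for u, i in interactions if counts[u] >= min_count]

# Hypothetical interaction list: u1 appears three times, u2 once.
raw = [("u1", "i1"), ("u1", "i2"), ("u1", "i3"), ("u2", "i1")]
kept = drop_rare_users(raw)
print(kept)  # [('u1', 'i1'), ('u1', 'i2'), ('u1', 'i3')]
```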

For the Last.FM dataset, we use Microsoft Satori to construct the knowledge graph. We first select a subset of triples from the whole KG with a confidence level greater than 0.9. Given the sub-KG, we collect the Satori IDs of all valid musicians by matching their names with the tail of triples (head, type.object.name, tail). Items with multiple matched entities or no matched entity are excluded for simplicity. We then match the item IDs with the head of all triples and select all well-matched triples from the sub-KG. When building the knowledge graph of Last.FM, we added the user-item bipartite graph to it: we renumbered the users as entities and constructed the triples ( $i, r, u$ ) to join the knowledge graph.

Table 1
Basic statistics for Github-SKG and Last.FM

	Github-SKG	Last.FM
Users	4245	1872
Items	2681	3846
Interactions	83450	42346
Entities	204305	11238
Relations	8	61
Triples	374718	36691

The basic statistics of the Github-SKG and Last.FM datasets are presented in Table 1.

4.2 Baselines

We compare the proposed GNGCN with the following baselines, in which the first baseline is KG-free while the rest are all KG-aware methods. Hyper-parameter settings for baselines are introduced in the next subsection.

•
NCF [15]: NCF designs a CF model based on a neural network structure. The neural network structure is used to model the latent features of users and items, and the MLP is used to give NCF the ability to obtain higher-order nonlinear interactions.
•
FM [23]: This is a benchmark factorization model, which considers the second-order feature interactions between inputs. Here we treat the IDs of a user, an item, and its knowledge (i.e., entities connected to it) as input features.
•
NFM [12]: The method is a state-of-the-art factorization model, which subsumes FM under a neural network. Specifically, we employed one hidden layer on input features as suggested in [12].
•
KGCN [19]: Utilizes GCN to collect high-order neighborhood information from the KG. To find the neighborhood which the user may be more interested in, it uses the user representation to attend to different relations to calculate the weight of the neighborhood.
•
RippleNet [28]: A memory-network-like approach that represents the user by his or her related items. RippleNet uses all relevant entities in the KG to propagate the user’s representation for a recommendation.
•
KGAT [30]: A GNN-based recommendation model equipped with a graph attention network. It uses a hybrid structure of the knowledge graph and user-item graph as a collaborative knowledge graph. KGAT employs an attention mechanism to discriminate the importance of neighbors and outperforms several state-of-the-art methods.
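
To make the "second-order feature interactions" of the FM baseline concrete, the sketch below evaluates the standard factorization machine score $\hat{y}(x) = w_0 + \sum_i w_i x_i + \sum_{i<j} \langle v_i, v_j \rangle x_i x_j$ in plain Python, once by enumerating all pairs and once with the usual linear-time rewriting of the pairwise term. The toy weights and the feature layout (indicator features for a user, an item, and a connected entity) are illustrative assumptions, not the settings used in the experiments.

```python
def fm_score_naive(x, w0, w, V):
    """FM score with the pairwise term enumerated explicitly."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score


def fm_score(x, w0, w, V):
    """Same score, using 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        score += 0.5 * (s * s - s_sq)
    return score


# Toy input: features 0, 2, and 3 are active (e.g. user, item, entity IDs).
x = [1, 0, 1, 1]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0]
V = [[0.1, 0.2], [0.3, 0.1], [0.2, -0.1], [0.0, 0.4]]  # 2 latent factors

fast = fm_score(x, w0, w, V)
slow = fm_score_naive(x, w0, w, V)
print(fast, slow)
```

Both functions return the same value up to floating-point error; the second form avoids the quadratic loop over feature pairs.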

Table 2
Hyper-parameter settings for Github-SKG and Last.FM

Hyper-parameter	$K$	$d$	$H$	$\lambda$	$\eta$	Batch size
Github-SKG	4	8	1	$10^{-4}$	$10^{-2}$	512
Last.FM	2	16	1	$10^{-4}$	$10^{-2}$	32

4.3 Experiments setup

In GNGCN, we set the functions $g$ and $f$ to the inner product, and $\sigma$ to ReLU for non-last-layer aggregators and tanh for the last-layer aggregator. The other hyper-parameter settings are provided in Table 2, where $K$ is the neighbor sampling size, $d$ is the embedding dimension, $H$ is the number of propagation (diffusion) steps, $\lambda$ is the L2 regularization weight, and $\eta$ is the learning rate. The hyper-parameters are determined by optimizing AUC on a validation set. For each dataset, the ratio of the training set to the test set is 4:1. Each experiment is repeated 3 times, and the average performance is reported. In click-through rate (CTR) prediction, we apply the trained model to predict each interaction in the test set. We use AUC (area under the ROC curve), ACC (accuracy), and F1 (the balanced F-score) to evaluate CTR prediction. The implementation environment of GNGCN is as follows:

•
Operating System: Ubuntu 18.04.3 LTS
•
RAM: 128GB DDR4 @ 3200MHz
•
CPU: Intel (R) Core (TM) i9-9980XE CPU @ 3.00GHz
•
GPU: 2 $\times$ NVIDIA TITAN RTX (Core Clock: 1.77GHz, Memory Size: 2 $\times$ 24GB)
•
SSD: 2.0 TB (NVM Express, PCIe 3.0 x16)
•
Software: NVIDIA CUDA 10.2, Python 3.7.6, PyTorch 1.3.1, and NumPy 1.18.1.

For KGAT, we set the depth to 2 and the layer sizes to [16, 16]. For RippleNet, we set the number of hops to 2 and the sampling size to 64 for each dataset. For KGCN, we set the number of hops to 3 and the sampling size to 4 for GitHub-SKG. Other hyper-parameters are the same as reported in the original papers or as the defaults in the released code.
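
The evaluation protocol of this subsection (a 4:1 split into training and test sets, followed by AUC, ACC, and F1 on the predicted click probabilities) can be sketched in plain Python as follows. The 0.5 decision threshold, the random seed, the helper names, and the toy predictions are assumptions made for illustration; the text does not specify them.

```python
import random


def split_4_to_1(samples, seed=0):
    """Shuffle and split samples into training and test parts at a 4:1 ratio."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def acc_and_f1(labels, scores, threshold=0.5):
    """Accuracy and F1 after thresholding the scores (threshold is assumed)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    acc = correct / len(labels)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return acc, f1


# Toy interactions and toy model outputs (hypothetical values).
train, test = split_4_to_1(range(10))
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

auc_value = auc(labels, scores)
acc_value, f1_value = acc_and_f1(labels, scores)
print(len(train), len(test), auc_value, acc_value, f1_value)
```

The pairwise AUC used here is quadratic in the number of samples, which is fine for a sketch; a rank-based formulation would be preferable on full test sets.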
4.4 Performance comparison

We first compare the recommendation performance of all methods. The results of CTR prediction and model sizes are presented in Table 3. We show the ROC curves of GNGCN and baselines in Fig. 4.

Our observations are as follows:

•
GNGCN outperforms every baseline on both datasets and on all three metrics. Relative to the strongest baseline, KGAT, AUC rises from 0.894 to 0.990 on GitHub-SKG and from 0.811 to 0.922 on Last.FM. This demonstrates the effectiveness of GNGCN for recommendation.
•
The parameter counts (Parameters # in Table 3) show that GNGCN is much smaller than the other models: it has 0.23M parameters on GitHub-SKG and 0.14M on Last.FM, against 0.84M to 24.9M and 0.7M to 18.17M for the baselines. The model is therefore lighter and structurally simpler, which should also make it more efficient to run.
•
NCF, the only KG-free baseline, performs worse than the other models on Github-SKG, which shows that the introduction of knowledge graphs is effective for recommendation.

Table 3
The results of AUC, ACC, and F1 in CTR prediction

	GitHub-SKG				Last.FM
Model	AUC	ACC	F1	Parameters #	AUC	ACC	F1	Parameters #
NCF	0.658 ($-$33.2%)	0.626 ($-$34.3%)	0.553 ($-$41.7%)	24.9M	0.654 ($-$26.8%)	0.624 ($-$24.4%)	0.591 ($-$26.9%)	18.17M
FM	0.860 ($-$13.0%)	0.799 ($-$17.0%)	0.801 ($-$16.9%)	13.07M	0.778 ($-$14.4%)	0.727 ($-$14.1%)	0.725 ($-$13.5%)	0.7M
NFM	0.873 ($-$11.7%)	0.809 ($-$16.0%)	0.807 ($-$16.3%)	13.08M	0.787 ($-$13.5%)	0.747 ($-$12.1%)	0.726 ($-$13.4%)	0.7M
RippleNet	0.687 ($-$30.3%)	0.637 ($-$33.2%)	0.645 ($-$32.5%)	0.84M	0.810 ($-$11.2%)	0.745 ($-$12.3%)	0.731 ($-$12.9%)	3.37M
KGCN	0.717 ($-$27.3%)	0.662 ($-$30.7%)	0.685 ($-$28.5%)	1.95M	0.800 ($-$12.2%)	0.741 ($-$12.7%)	0.730 ($-$13.0%)	4.02M
KGAT	0.894 ($-$9.6%)	0.829 ($-$14.0%)	0.839 ($-$13.1%)	13.02M	0.811 ($-$11.1%)	0.727 ($-$14.1%)	0.736 ($-$12.4%)	1.24M
GNGCN	0.990	0.969	0.970	0.23M	0.922	0.868	0.860	0.14M
GNGCN ${}_{nd}$	0.980 ($-$1.0%)	0.966 ($-$0.3%)	0.967 ($-$0.3%)	0.23M	0.863 ($-$5.9%)	0.793 ($-$7.5%)	0.786 ($-$7.4%)	0.14M
GNGCN ${}_{nu}$	0.957 ($-$3.3%)	0.946 ($-$2.3%)	0.946 ($-$2.4%)	0.23M	0.873 ($-$4.9%)	0.809 ($-$5.9%)	0.800 ($-$6.0%)	0.14M
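
The parenthesized values in Table 3 are consistent with the absolute gap to the full GNGCN model expressed in percentage points (for example, $0.658 - 0.990 = -0.332$, reported as $-$33.2%), rather than with a relative change. The short check below recomputes the GitHub-SKG AUC column under that reading; the dictionary layout and the function name are illustrative.

```python
# AUC values on GitHub-SKG, copied from Table 3.
gngcn_auc = 0.990
baseline_auc = {
    "NCF": 0.658,
    "FM": 0.860,
    "NFM": 0.873,
    "RippleNet": 0.687,
    "KGCN": 0.717,
    "KGAT": 0.894,
}


def gap_in_points(score, reference):
    """Absolute difference to the reference, in percentage points."""
    return round((score - reference) * 100, 1)


gaps = {name: gap_in_points(value, gngcn_auc) for name, value in baseline_auc.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}")
```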

Figure 4.
The ROC curves of GNGCN and baselines.

•
RippleNet and KGCN did not perform as well as we expected on the GitHub-SKG dataset. Comparing their performance with that of KGAT, FM, and NFM on GitHub-SKG and Last.FM suggests a reasonable explanation. The FM and NFM models we use are those reproduced in the KGAT paper and, like KGAT, they use the collaborative knowledge graph. In the same way, user-side knowledge and the user-item bipartite graph are added to the knowledge graph of our GitHub-SKG dataset and renumbered. The RippleNet and KGCN models we use are reproduced from the authors' publicly available code, which does not use the collaborative knowledge graph. The user-side knowledge graph we added occupies most of the entire knowledge graph, and the renumbered users are added to the entity set. RippleNet and KGCN treat this information as a huge volume of noise. We conclude that, compared with NCF, the knowledge graph still helps these two models, but the noise keeps the improvement small.

Figure 5.
AUC results of GNGCN with different numbers of aggregated neighbors $K$.

Figure 6.
AUC results of GNGCN with different numbers of propagation steps $H$.

To study the effectiveness of the components of GNGCN, we carried out ablation experiments; the results are shown in the last two rows of Table 3. GNGCN ${}_{nd}$ is a variant of the GNGCN model without importance sampling. GNGCN ${}_{nu}$ is a variant of the GNGCN model that uses the initialized user embedding directly, without aggregating the user's neighbors into the user's embedding representation. From the results we find that:

•
Both GNGCN ${}_{nd}$ and GNGCN ${}_{nu}$ degrade, to different degrees, compared to GNGCN. This suggests that both the importance sampling of neighbors and the use of aggregated item embeddings to guide the aggregation and propagation of user embeddings are useful.

Figure 7.
AUC results of GNGCN with different dimensions of embedding $d$.

•
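The relative degradation of the two variants depends on the dataset: on Last.FM, GNGCN ${}_{nu}$ is less degraded than GNGCN ${}_{nd}$ (AUC $-$4.9% vs. $-$5.9%), whereas on GitHub-SKG the order is reversed ($-$3.3% vs. $-$1.0%). This indicates that our algorithm for importance sampling of the neighbors of entities needs further optimization, and that this direction is promising.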
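
The difference between sampling neighbors uniformly and sampling them by importance can be illustrated with a short sketch. Uniform sampling is presumably what a variant without importance sampling falls back to, but that is an assumption. The per-neighbor importance weights, the function names, and the toy neighborhood are also hypothetical; the importance measure actually used by GNGCN is defined earlier in the paper and is not reproduced here.

```python
import random
from collections import Counter


def sample_uniform(neighbors, k, rng):
    """Draw k neighbors uniformly at random, with replacement."""
    return [rng.choice(neighbors) for _ in range(k)]


def sample_by_importance(neighbors, weights, k, rng):
    """Draw k neighbors with probability proportional to their weight."""
    return rng.choices(neighbors, weights=weights, k=k)


rng = random.Random(0)
neighbors = ["repo_a", "topic_x", "lang_py", "user_9"]  # toy neighborhood
importance = [7, 2, 1, 0]                                # hypothetical weights
K = 4                                                    # sampling size

uniform_draw = sample_uniform(neighbors, K, rng)
weighted_draw = sample_by_importance(neighbors, importance, K, rng)

# Over many draws, a zero-weight neighbor is never selected.
counts = Counter(sample_by_importance(neighbors, importance, 1000, rng))
print(uniform_draw, weighted_draw, dict(counts))
```

With weighted sampling, weakly related neighbors contribute less often to the aggregated representation, which is the intended effect of guiding the aggregation.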

4.5 Parameter analysis

Knowledge graph propagation plays an important role in GNGCN. We investigate how two of its parameters, the number of aggregated neighbors $K$ and the number of propagation steps $H$, affect the performance of GNGCN, and we also analyze the effect of the embedding dimension $d$.

•
Impact of the number of aggregated neighbors $K$. We vary the neighbor sampling size $K$ to investigate how effectively the KG is used. From Fig. 5 we observe that GNGCN achieves the best performance when $K=4$ or $K=2$. This is because a $K$ that is too small does not have enough capacity to incorporate neighborhood information, while a $K$ that is too large is prone to being misled by noise.
•
Impact of the number of propagation steps $H$. We investigate the influence of $H$ in GNGCN by varying it from 1 to 4. The results in Fig. 6 show that GNGCN is more sensitive to $H$ than to $K$. We observe serious model collapse when $H=4$, as a larger $H$ brings considerable noise into the model. This accords with our intuition, since an overly long relation chain makes little sense when inferring inter-item similarities. According to the experimental results, $H=1$ is enough in practice.
•
Impact of the dimension of embedding $d$. Lastly, we examine the influence of the embedding dimension $d$, a key parameter that controls the complexity and capacity of GNGCN. The result in Fig. 7 is rather intuitive: increasing $d$ initially boosts performance, since a larger $d$ can encode more information about users and entities, but once $d$ exceeds its optimal value, performance degrades because the model overfits. We therefore choose the embedding dimension to balance performance against complexity.
•
Noise in the data. Noise arises mainly while the model aggregates neighborhood information to update the embedding representations. If the number of aggregated neighbors, the number of propagation (convolution) steps, or the embedding dimension is too large, the excess information dilutes the characteristics of the initial embedding of the item or user, so that the aggregated embedding vector can no longer represent the entity. RippleNet and KGCN aggregate randomly sampled neighbors, and given the complexity of our dataset they easily aggregate weakly related information, which also dilutes the characteristics of the entity. We consider this to be the main source of noise in the data.
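
The sensitivity to $H$ can be made concrete by counting how many sampled entities feed into a single representation. Assuming that every entity samples exactly $K$ neighbors at each hop (with repeated entities counted each time), the number of aggregated entities after $H$ hops is $\sum_{h=1}^{H} K^h$. The sketch below tabulates this count; the function name is illustrative.

```python
def receptive_field_size(k, h):
    """Number of sampled entities within h hops when each entity samples k neighbors."""
    return sum(k ** hop for hop in range(1, h + 1))


sizes = {h: receptive_field_size(4, h) for h in range(1, 5)}
for h, size in sizes.items():
    print(f"K=4, H={h}: {size} sampled entities")
```

With $K=4$, one hop aggregates 4 sampled neighbors, whereas four hops draw on 340 sampled entities. This growth is in line with the explanation that a larger $H$ brings more noise into the aggregated embedding.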

5. Conclusion and future work

In this paper, we propose a guided node knowledge graph convolutional network, GNGCN, for repository recommendation. GNGCN extends the non-spectral GCN approach to knowledge graphs by selectively and biasedly aggregating neighborhood information, so that it learns both the structural information of the KG and the personalized, latent interests of users. We also implement the proposed method in a mini-batch fashion, so that it can operate on large datasets and knowledge graphs. In experiments on two real-world datasets, GNGCN consistently outperforms the state-of-the-art baselines for GitHub repository and music recommendation. Moreover, compared with the baselines, our model is lighter in scale and better in running time and efficiency, properties that we will continue to prioritize in future work.

We point out two avenues for future work. (1) In this work, we construct the receptive field of an entity by importance sampling of its neighborhood nodes. Exploring personalized samplers (e.g., attention-based sampling) is an important direction for future work. (2) People's interests evolve over time, as does the influence or importance of nodes. It would be interesting to investigate whether modeling such time series can help improve recommendation performance.

Acknowledgments

This research was supported by the National Key Research and Development Plan of China (No. 2018YFB1003804).

References

1. Atwood and Towsley, Diffusion-convolutional neural networks, in: D.D. Lee, Sugiyama, von Luxburg, Guyon and Garnett, eds, Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016, Barcelona, Spain, 2016, pp. 1993–2001.

2. Bahdanau, Cho and Bengio, Neural machine translation by jointly learning to align and translate, in: Bengio and LeCun, eds, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7–9, 2015, Conference Track Proceedings, 2015.

3. Bana and Arora, Influence indexing of developers, repositories, technologies and programming languages on social coding community github, in: Aluru, Kalyanaraman, Bera, Kothapalli, Abramson, Altintas, Bhowmick, Govindaraju, S.R. Sarangi, S.K. Prasad, Bogaerts, Saxena and Goel, eds, 2018 Eleventh International Conference on Contemporary Computing, IC3 2018, Noida, India, August 2–4, 2018, IEEE Computer Society, 2018, pp. 1–6.

4. Bruna, Zaremba, Szlam and LeCun, Spectral networks and locally connected networks on graphs, in: Bengio and LeCun, eds, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Conference Track Proceedings, 2014.

5. Cao, Shi, Wang, Yan and Chen, DEKR: description enhanced knowledge graph for machine learning method recommendation, in: Diaz, Shah, Suel, Castells, Jones and Sakai, eds, SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11–15, 2021, ACM, 2021, pp. 203–212.

6. Cao, Hou and Liu, Neural collective entity linking, in: E.M. Bender, Derczynski and Isabelle, eds, Proceedings of the 27th International Conference on Computational Linguistics, COLING 2018, Santa Fe, New Mexico, USA, August 20–26, 2018, Association for Computational Linguistics, 2018, pp. 675–686.

7. Cao, Hou, Liu, Chen and Dong, Joint representation learning of cross-lingual words and entities via attentive distant supervision, in: Riloff, Chiang, Hockenmaier and Tsujii, eds, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31–November 4, 2018, Association for Computational Linguistics, 2018, pp. 227–237.

8. Cheng, Koc, Harmsen, Shaked, Chandra, Aradhye, Anderson, Corrado, Chai, Ispir, Anil, Haque, Hong, Jain, Liu and Shah, Wide & deep learning for recommender systems, in: Karatzoglou, Hidasi, Tikk, O.S. Shalom, Roitman, Shapira and Rokach, eds, Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, DLRS@RecSys 2016, Boston, MA, USA, September 15, 2016, ACM, 2016, pp. 7–10.

9. Defferrard, Bresson and Vandergheynst, Convolutional neural networks on graphs with fast localized spectral filtering, in: D.D. Lee, Sugiyama, von Luxburg, Guyon and Garnett, eds, Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5–10, 2016, Barcelona, Spain, 2016, pp. 3837–3845.

10. Duvenaud, Maclaurin, Aguilera-Iparraguirre, Gómez-Bombarelli, Hirzel, Aspuru-Guzik and R.P. Adams, Convolutional networks on graphs for learning molecular fingerprints, in: Cortes, N.D. Lawrence, D.D. Lee, Sugiyama and Garnett, eds, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7–12, 2015, Montreal, Quebec, Canada, 2015, pp. 2224–2232.

11. Guo and Yan, Collaborative filtering: Graph neural network with attention, in: Wang, Lin, J.A. Hendler, Song and Liu, eds, Web Information Systems and Applications – 17th International Conference, WISA 2020, Guangzhou, China, September 23–25, 2020, Proceedings, Vol. 12432 of Lecture Notes in Computer Science, Springer, 2020, pp. 428–438.

12. He and Chua, Neural factorization machines for sparse predictive analytics, in: Kando, Sakai, Joho, A.P. de Vries and R.W. White, eds, Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7–11, 2017, ACM, 2017, pp. 355–364.

13. Song, Liu, Jiang and Chua, NAIS: neural attentive item similarity model for recommendation, IEEE Trans. Knowl. Data Eng. 30(12) (2018), 2354–2366.

14. Liao, Zhang, Nie and Chua, Neural collaborative filtering, in: Barrett, Cummings, Agichtein and Gabrilovich, eds, Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3–7, 2017, ACM, 2017, pp. 173–182.

15. Liao, Zhang, Nie and Chua, Neural collaborative filtering, CoRR, abs/1708.05031, 2017.

16. Henaff, Bruna and LeCun, Deep convolutional networks on graph-structured data, CoRR, abs/1506.05163, 2015.

17. Huang, W.X. Zhao, Dou, Wen and E.Y. Chang, Improving sequential recommendation with knowledge-enhanced memory networks, in: Collins-Thompson, Mei, B.D. Davison, Liu and Yilmaz, eds, The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08–12, 2018, ACM, 2018, pp. 505–514.

18. T.N. Kipf and Welling, Semi-supervised classification with graph convolutional networks, in: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings, OpenReview.net, 2017.

19. Kojima, Ishida, Ohta, Iwata, Honma and Okuno, kgcn: A graph-based deep learning framework for chemical structures, J. Cheminformatics 12(1) (2020), 32.

20. Koren, R.M. Bell and Volinsky, Matrix factorization techniques for recommender systems, Computer 42(8) (2009), 30–37.

21. Kourtzanidis, Chatzigeorgiou and Ampatzoglou, Reposkillminer: Identifying software expertise from github repositories using natural language processing, in: 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020, Melbourne, Australia, September 21–25, 2020, IEEE, 2020, pp. 1353–1357.

22. Cui, Kuang, Wang and Zhu, Disentangled graph convolutional networks, in: Chaudhuri and Salakhutdinov, eds, Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9–15 June 2019, Long Beach, California, USA, Vol. 97 of Proceedings of Machine Learning Research, PMLR, 2019, pp. 4212–4221.

23. Rendle, Gantner, Freudenthaler and Schmidt-Thieme, Fast context-aware recommendations with factorization machines, in: Ma, Nie, Baeza-Yates, Chua and W.B. Croft, eds, Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25–29, 2011, ACM, 2011, pp. 635–644.

24. D.I. Shuman, S.K. Narang, Frossard, Ortega and Vandergheynst, The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains, IEEE Signal Process. Mag. 30(3) (2013), 83–98.

25. C.D. Sipio, Rubei, D.D. Ruscio and P.T. Nguyen, A multinomial naïve bayesian (MNB) network to automatically recommend topics for github repositories, in: Li, Jaccheri, Dingsøyr and Chitchyan, eds, EASE ’20: Evaluation and Assessment in Software Engineering, Trondheim, Norway, April 15–17, 2020, ACM, 2020, pp. 71–80.

26. Velickovic, Cucurull, Casanova, Romero, Liò and Bengio, Graph attention networks, CoRR, abs/1710.10903, 2017.

27. Wang, Zhang, Hou, Xie, Guo and Liu, SHINE: signed heterogeneous information network embedding for sentiment link prediction, in: Chang, Zhai, Liu and Maarek, eds, Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, Marina Del Rey, CA, USA, February 5–9, 2018, ACM, 2018, pp. 592–600.

28. Wang, Zhang, Wang, Zhao, Xie and Guo, Ripplenet: Propagating user preferences on the knowledge graph for recommender systems, in: Cuzzocrea, Allan, N.W. Paton, Srivastava, Agrawal, A.Z. Broder, M.J. Zaki, K.S. Candan, Labrinidis, Schuster and Wang, eds, Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM 2018, Torino, Italy, October 22–26, 2018, ACM, 2018, pp. 417–426.

29. Wang, Zhang, Xie and Guo, DKN: deep knowledge-aware network for news recommendation, in: Champin, F.L. Gandon, Lalmas and P.G. Ipeirotis, eds, Proceedings of the 2018 World Wide Web Conference on World Wide Web, WWW 2018, Lyon, France, April 23–27, 2018, ACM, 2018, pp. 1835–1844.

30. Wang, Cao, Liu and Chua, KGAT: knowledge graph attention network for recommendation, in: Teredesai, Kumar, Rosales, Terzi and Karypis, eds, Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, August 4–8, 2019, ACM, 2019, pp. 950–958.

31. Wang, Feng and Chua, Neural graph collaborative filtering, in: Piwowarski, Chevalier, É. Gaussier, Maarek, Nie and Scholer, eds, Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019, Paris, France, July 21–25, 2019, ACM, 2019, pp. 165–174.

32. Wang, Cao and Chua, Explainable reasoning over knowledge graphs for recommendation, in: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27–February 1, 2019, AAAI Press, 2019, pp. 5329–5336.

33. Xie, Liang, Duan and Chen, Net2: A graph attention network method customized for pre-placement net length estimation, in: ASPDAC ’21: 26th Asia and South Pacific Design Automation Conference, Tokyo, Japan, January 18–21, 2021, ACM, 2021, pp. 671–677.

34. Wang, Zhou, Dong, Huo and Ren, Graph attention networks for new product sales forecasting in e-commerce, in: C.S. Jensen, Lim, Yang, Lee, V.S. Tseng, Kalogeraki, Huang and Shen, eds, Database Systems for Advanced Applications – 26th International Conference, DASFAA 2021, Taipei, Taiwan, April 11–14, 2021, Proceedings, Part III, Vol. 12683 of Lecture Notes in Computer Science, Springer, 2021, pp. 553–565.

35. Yang, Wei and Liu, Sentiments analysis in github repositories: An empirical study, in: 24th Asia-Pacific Software Engineering Conference Workshops, APSEC Workshops 2017, Nanjing, China, December 4–8, 2017, IEEE, 2017, pp. 84–89.

36. Ren, Sun, Sturt, Khandelwal, Norick and Han, Personalized entity recommendation: a heterogeneous information network approach, in: Carterette, Diaz, Castillo and Metzler, eds, Seventh ACM International Conference on Web Search and Data Mining, WSDM 2014, New York, NY, USA, February 24–28, 2014, ACM, 2014, pp. 283–292.

37. Zhang, N.J. Yuan, Lian, Xie and Ma, Collaborative knowledge base embedding for recommender systems, in: Krishnapuram, Shah, A.J. Smola, C.C. Aggarwal, Shen and Rastogi, eds, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13–17, 2016, ACM, 2016, pp. 353–362.

38. Zhang, Zhuang, Zhu, Shi, Xiong and He, Relational graph neural network with hierarchical attention for knowledge graph completion, in: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7–12, 2020, AAAI Press, 2020, pp. 9612–9619.

39. Zhao, Yao, Song and D.L. Lee, Meta-graph based recommendation fusion over heterogeneous information networks, in: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13–17, 2017, ACM, 2017, pp. 635–644.

40. Zhou, Ravi, C.M. Muñiz, Azizi, Ness, de Melo and Kapadia, Gitevolve: Predicting the evolution of github repositories, CoRR, abs/2010.04366, 2020.