Abstract
Multi-behavior recommendation models excel in extracting abundant information from user-item interactions to enhance performance; however, they encounter challenges in accuracy due to noise disturbance and ambiguous weight allocation. In this paper, we propose cd-MBRec, a novel model designed to amplify commonality among various behaviors, thereby minimizing noise interference while preserving behavior diversity to highlight semantic variations in feedback across distinct scenarios. Specifically, the model begins by constructing behavior matrices that model separate behaviors, along with an interaction matrix offering a broad overview of user behaviors. It employs graph neural networks to extract higher-order semantic and structural information from input data. Concurrently, the model integrates principles of the Weber-Fechner Law for the adaptive allocation of initial weights to the multiple behaviors and utilizes matrix factorization techniques for efficient behavior embedding. Extensive experiments on two real-world datasets demonstrate that cd-MBRec surpasses existing state-of-the-art models in recommendation performance, achieving notable average improvements of 4.96% in HR@10 and 7.75% in NDCG@10.
Introduction
Recommender systems are essential in online services, including e-commerce [1], social media [2], and news platforms [3]. They provide personalized suggestions for products or services using a wide range of information. Multi-behavior recommendation modeling [4, 5, 6, 7], which has become increasingly important in modern recommendation tasks, is particularly efficacious under circumstances with complex user interaction behaviors. It considers users’ explicit actions and incorporates their implicit feedback as well, significantly improving the performance of recommendation services across various domains. Additionally, multi-behavior recommendation models offer new perspectives and methods to address key issues in recommender systems, such as the cold-start problem and data sparsity [8, 9].
While the incorporation of multiple behaviors in recommender systems offers additional information, it also introduces complex contexts that are often not adequately addressed in current research. Firstly, in real-world scenarios, although multiple behaviors reflect different user intentions, some may be disturbed by abnormal user actions or personal biases during data collection. Excessively emphasizing the differences in behaviors while overlooking their similarities can lead to challenges in prediction due to noise interference [10, 11]. Secondly, user behaviors correspond to diverse intentions. Existing work fails to explicitly explain the weight allocation process of multiple behaviors across various contexts, thereby impairing the adaptability and interpretability of recommendation models.

Examples of different user intentions through multiple behavioral combinations. Although User 1 and User 2 engage differently with the watch, they exhibit similar interests, highlighting commonalities across multiple behaviors. In contrast, User 2 and User 3, both engaging with the iPhone, demonstrate distinct interests. User 3’s previous purchase suggests a lower likelihood of a repeat purchase in the near term, thus exemplifying diversity in multiple behaviors.
Different types of user behaviors reveal both commonality and diversity. For instance, Fig. 1 illustrates the behavior sequences of three users, involving various interactions such as click, favorite, add-to-cart, and purchase. In this example, both User 1 and User 2 interact with an Apple watch. User 1 follows a sequence of {click, favorite, add-to-cart}, while User 2 goes directly from click to add-to-cart. Despite their differing behavioral combinations, which could be attributed to personal habits, User 1 and User 2 exhibit similar interest in the Apple watch, highlighting the commonality within multiple behaviors. Conversely, examining User 2 and User 3 provides insight into behavior diversity. After clicking, User 2 adds an iPhone to the cart, while User 3 immediately makes a purchase. Since User 3 has already bought an iPhone, she is less likely to buy another phone in a short period, whereas User 2, who has not completed a purchase, shows sustained interest in the iPhone. Therefore, in this scenario, clicking behavior should be given more emphasis than purchasing, and it would be more effective to recommend an iPhone to User 2, illustrating the diversity in behaviors. This example underscores the importance of recognizing both commonality and diversity in multi-behavior recommendation tasks. By understanding these nuances in user-item interactions, we can uncover the potential value of these interactions, leading to more accurate and personalized recommendations.
Based on the above-mentioned analysis, we propose a Multi-Behavior Recommendation model explicitly designed to encapsulate both commonality and diversity (cd-MBRec). Specifically, this model features a behavior-commonality embedding aggregation module, adept at capturing higher-order content and structural information pertaining to user-item interactions. Each individual behavior is modeled using graph neural networks (GNNs). A crucial element of our model is the integration of the interaction matrix. This matrix not only records interactions between users and items but also effectively synthesizes the commonality among various behaviors while obtaining the embedding of users and items.
To depict the diversity of behaviors, we have developed a unique behavior-diversity weight allocation module. Grounded in the principles of the Weber-Fechner Law, this module leverages psychological insights to determine the initial weight of each behavior based on the different stimulus intensities they present to users. It also generates an adaptive scoring matrix, thus effectively handling the behavioral variations in different scenarios. By incorporating commonality interaction behaviors and adjusting the weights of various behaviors, our model can better adapt to users’ actual needs in different contexts. Our contributions can be summarized as follows:
We present cd-MBRec, an innovative recommendation model that focuses on explicitly capturing both the commonality and diversity of multiple behaviors. This method is adept at minimizing the impact of inherent noise and amplifying the variances in multi-behavioral patterns across various recommendation scenarios, ultimately enhancing overall performance.
We introduce the concept of an interaction matrix, which we integrate with multiple behavior matrices to develop a behavior-commonality embedding aggregation strategy. This approach effectively extracts commonalities in multi-behavior modeling and is applicable to other multi-behavior models, where it also demonstrates promising results.
Extensive experiments on two real-world datasets establish that our proposed cd-MBRec model surpasses state-of-the-art models in multi-behavior recommendation tasks.
The rest of this paper is structured as follows: Section 2 briefly reviews related work, Section 3 outlines our methodology, Section 4 illustrates the model’s effectiveness through experiments, and Section 5 provides our conclusion.
Multi-behavior recommendation
The field of multi-behavior recommender systems has garnered significant interest as a comprehensive alternative to single-behavior models such as DeepFM [12] and SASRec [13]. These models stand out due to their utilization of multiple user behaviors as auxiliary information, which enhances the predictive performance of the system [14, 32]. NMTR [15] adopts shared user and item embedding layers across various behavior types, capturing intricate and diverse interactions. MB-GCN [16] introduces a graph-based approach, unifying multiple user-item interaction matrices into a single graph for analysis. MB-GMN [17] incorporates multi-behavior pattern modeling within the meta-learning paradigm, enabling the discovery of behavior representations across different types. GHCF [18] jointly embeds node (user and item) representations and their relationships into a multi-relational prediction framework based on GCN [19], and it performs non-sampling optimization within a multi-task learning framework. S-MBRec [20] applies GCN to learn user and item embedding for each behavior, and it designs a supervised task to discern the importance of different behaviors. Additionally, it introduces a star-shaped contrastive learning task to capture embedding commonalities between target and auxiliary behaviors. These developments consider a variety of user behaviors, thereby offering more personalized and accurate recommendations. This paper also focuses on utilizing multiple behaviors to improve recommendation performance.
Graph-based recommendation
The burgeoning interest in graph neural networks for recommender systems has led to significant advancements in understanding and leveraging the complex relationships between users and items. Pioneering models such as GCN [19], GraphSage [21], and GAT [22] utilize the aggregation of neighbor node embedding in the spatial domain to refine the embedding of target nodes. The core idea is to enhance node representations by their neighborhoods. SR-GNN [23] and SURGE [24] have applied GNNs to session-based and sequential recommendations, respectively, achieving promising results. Further developments in graph modeling are represented by NGCF [25], which refines node embedding by adopting the bipartite user-item graph structure. LightGCN [26] argues that multi-layer non-linear graph neural networks can make model training more challenging. It removes all redundant parameters and retains only ID embedding, simplifying the model in a manner similar to matrix factorization (MF). GDE [27] addresses the issue of over-smoothing, a common problem in deep GNNs where features become overly homogenized as the number of layers increases. While these models have made strides in capturing both content and structural information at higher orders, they often lack the flexibility to fully grasp the diversity of user behaviors. To address this gap, our paper proposes a novel approach by constructing a user-item interaction matrix and assigning behavior weights based on different application scenarios. This aims to provide a more adaptable and nuanced understanding of user behaviors in recommender systems.
Methodology
We represent the set of users and items with
To initialize the weights of each behavior, we allocate behaviors in the dataset by
Model architecture

The architecture of the proposed cd-MBRec model. It consists of three components: the Behavior-Commonality Embedding Aggregation (EA) module, the Behavior-Diversity Weight Allocation (WA) module, and the Fusion module.
We propose the cd-MBRec model, illustrated in Fig. 2, to model the commonality and diversity present in multi-behavior characteristics. cd-MBRec comprises two separately trainable modules, the Behavior-Commonality Embedding Aggregation (EA) module and the Behavior-Diversity Weight Allocation (WA) module, as well as a Fusion module that integrates their outputs. The EA module utilizes graph convolution networks to capture high-order content and structural information. It also integrates the behavior matrices and the interaction matrix for model training, enabling the model to collect the commonality of various behaviors. The WA module, in turn, assigns flexible weights to each behavior, emphasizing the semantic differences inherent in behavioral feedback across different scenarios. This weight allocation process lets the model learn how behavior diversity affects the recommendation outcomes, leading to more precise recommendations. Together, the EA and WA modules combine the commonality and diversity of multiple behaviors. Finally, the Fusion module integrates the predictions of the EA and WA modules into the final recommendation.
In cd-MBRec, graph convolution networks capture multiple interaction types between users and items, track interaction occurrences, and integrate these results into feature representations. For the graph view, we build a global multi-relation user-item graph
To capture high-order information with neighborhood nodes in the graph structures of each behavior, we adopt the Light Graph Convolution (LGC) approach, as implemented in LightGCN [26], to learn the embedding representations of users and items. For each layer of convolutional operation on the
Here,
Additionally, each behavior graph undergoes
The weight
To associate all behaviors and form the interaction prediction between users and items, EA performs a weighted average operation on the outputs of the
After obtaining the final outputs for users and items, the prediction value in this module is calculated as follows:
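The equations of the EA module are not reproduced in this text. As a hedged illustration only, the sketch below follows the standard LightGCN formulation [26] for a single behavior graph: symmetric-normalized propagation without feature transformation or non-linearity, uniform averaging over layers, and an inner-product prediction. The function names, the toy matrix, and the uniform layer weights are assumptions and may differ from the weighting actually used in cd-MBRec.

```python
import numpy as np

def normalized_adjacency(R):
    """Symmetric-normalized adjacency of the user-item bipartite graph.

    R is a (num_users, num_items) binary interaction matrix.
    """
    num_users, num_items = R.shape
    n = num_users + num_items
    A = np.zeros((n, n))
    A[:num_users, num_users:] = R
    A[num_users:, :num_users] = R.T
    deg = A.sum(axis=1)
    # Isolated nodes keep a zero row instead of dividing by zero.
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def light_graph_convolution(R, E0, num_layers=2):
    """Propagate embeddings layer by layer and average all layer outputs.

    Uniform layer weights are an assumption of this sketch.
    """
    A_hat = normalized_adjacency(R)
    layers = [E0]
    for _ in range(num_layers):
        layers.append(A_hat @ layers[-1])
    return np.mean(layers, axis=0)

rng = np.random.default_rng(0)
R = np.array([[1, 0, 1],
              [0, 1, 1]], dtype=float)   # toy data: 2 users, 3 items
E0 = rng.normal(size=(5, 4))             # initial embeddings, dimension 4
E = light_graph_convolution(R, E0)
user_emb, item_emb = E[:2], E[2:]
scores = user_emb @ item_emb.T           # inner-product predictions, shape (2, 3)
```

In a multi-behavior setting, the same propagation would be run once per behavior graph and once on the interaction matrix before the outputs are combined.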
In the WA module, we optimize the initial embedding
Since different behaviors have different semantic meanings and carry varying degrees of interest information for users, we assign a weight list
To capture the interaction information between users and items in different behavioral contexts, we construct a user-item interaction matrix for each behavior, denoted as
The Weber-Fechner Law [28], a fundamental principle in psychology, states that the just-noticeable difference in a stimulus is proportional to the intensity of that stimulus. Mathematically, the law implies that perceived intensity grows linearly with the logarithm of stimulus intensity. Therefore, for the default weight matrix, we calculate the weight of each behavior based on the proportion of the total occurrences of that behavior in the dataset. The default weight for this behavior is calculated as follows:
Next, we multiply and sum each interaction matrix with its corresponding weight to generate a weighted score matrix
By performing matrix factorization on the weighted score matrix
To avoid the loss of preference diversity due to the large number of items, we amplify the results by
We assign weights to each module and then multiply and sum the weighted predictions of each module. Specifically, we introduce a participation parameter
Training the EA module
When training the model using the Mini-Batch method, the trainable parameters are divided into two parts: the initial user embedding and item embedding
By decomposing the weighted score matrix R, the WA module uses the Mean Squared Error (MSE) loss function and the Stochastic Gradient Descent (SGD) optimization algorithm to find the optimal low-dimensional latent vectors, i.e., the user embedding matrix
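As a minimal sketch of this decomposition step, assuming a small dense score matrix R and plain per-entry SGD on the squared error, the following code learns low-dimensional user and item factors. The function name, the hyperparameter values, and the toy matrix are illustrative assumptions; regularization and sampling details of the actual WA module are not covered.

```python
import numpy as np

def factorize(R, dim=8, lr=0.05, epochs=200, seed=0):
    """Factorize R into P @ Q.T by SGD on the squared error of each entry."""
    rng = np.random.default_rng(seed)
    num_users, num_items = R.shape
    P = rng.normal(scale=0.1, size=(num_users, dim))   # user factors
    Q = rng.normal(scale=0.1, size=(num_items, dim))   # item factors
    entries = [(u, i) for u in range(num_users) for i in range(num_items)]
    for _ in range(epochs):
        # Visit all entries once per epoch in random order.
        for idx in rng.permutation(len(entries)):
            u, i = entries[idx]
            err = R[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()            # keep the old value for the item update
            P[u] += lr * err * Q[i]
            Q[i] += lr * err * p_u
    return P, Q

# Toy weighted score matrix: 3 users, 4 items (hypothetical values).
R = np.array([[1.0, 0.5, 0.0, 0.0],
              [1.0, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 1.5]])
P, Q = factorize(R)
mse = float(np.mean((R - P @ Q.T) ** 2))
```

The reconstruction P @ Q.T then serves as the predicted score matrix of this module.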
Complexity analysis
In the cd-MBRec model, the time complexity analysis is segmented into three principal components: 1) The EA module, responsible for capturing commonalities across user behaviors through graph convolutional networks, incurs a computational complexity of
When integrating these components, the initial total time complexity of the model is expressed as
Experiments
We conducted experiments on two real-world datasets to evaluate the performance of the cd-MBRec model. Our goal was to answer the following questions:
RQ1: How does cd-MBRec perform compared to the state-of-the-art baselines?
RQ2: How do the sub-modules in cd-MBRec affect its recommendation performance?
RQ3: How do different configurations of key hyperparameters affect the performance of cd-MBRec?
RQ4: Is the behavior-commonality embedding aggregation strategy in the EA module generalizable?
RQ5: Is the Weber-Fechner law efficient in initializing weights in the WA module?
Datasets. We use two real-world datasets, UserBehavior and IJCAI, to evaluate the model performance objectively.
UserBehavior is a dataset from Taobao, one of the most popular e-commerce platforms in China. It contains four types of user-item relationships: page-view, favorite, add-to-cart, and purchase, each accompanied by a corresponding interaction timestamp.
IJCAI consists of users’ shopping logs on the Tmall platform for the six months leading up to and including Double Eleven day (November 11th). To keep the input data at a manageable scale, we processed only the data from November 1st to November 11th. This dataset includes four types of user-item relationships: click, add-to-favorite, add-to-cart, and purchase, along with the corresponding event dates.
For consistency and convenience, we unified the four types of behaviors across both datasets. We used pv for page-view and click, fav for favorite and add-to-favorite, cart for add-to-cart, and purchase for buy and purchase. A detailed description of the two datasets is given in Table 1.
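The unification can be expressed as a simple lookup. The sketch below is illustrative: the raw label strings are taken from the descriptions above and are assumptions about how the behaviors are encoded, since the actual datasets may use different codes.

```python
# Map the raw behavior labels of both datasets to the four unified names.
# The raw label strings are hypothetical; real files may encode them differently.
UNIFIED_BEHAVIOR = {
    "page-view": "pv",
    "click": "pv",
    "favorite": "fav",
    "add-to-favorite": "fav",
    "add-to-cart": "cart",
    "buy": "purchase",
    "purchase": "purchase",
}

def unify_behaviors(records):
    """Rewrite (user, item, behavior) records with the unified behavior names."""
    return [(user, item, UNIFIED_BEHAVIOR[behavior])
            for user, item, behavior in records]

records = [(1, 10, "click"), (1, 10, "add-to-favorite"), (2, 11, "buy")]
unified = unify_behaviors(records)
```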
Statistics of the preprocessed datasets
Note that we did not remove dependencies between behaviors in the datasets, meaning purchase behaviors might appear earlier than pv, fav or cart behaviors. This is because real-world e-commerce data is segmented by time periods, and all behaviors within a given period can provide valuable information for recommendations. We considered the users’ purchase behaviors as targets since they are a good indicator of total sales on online retail websites.
Baselines. To examine the performance of our cd-MBRec model, we compared it with several popular models in recommender systems, including:
MF [29]: The matrix factorization model is one of the most classic recommendation models.
NGCF [25]: This model utilizes a multi-layer graph neural network to stack and propagate embeddings on the user-item bipartite graph, enabling higher-order information aggregation.
LightGCN [26]: This model discards weights and non-linear activation functions from NGCF, using a weighted summation approach that demonstrates high efficiency and accuracy.
MB-GMN [17]: The model employs graph meta-networks to address the multi-behavior recommendation problem. It uses meta-networks to learn relationships between users and items and utilizes attention mechanisms to capture behavior correlations.
MB-NGCF: We enhance NGCF [25] to accommodate multiple behaviors present in datasets, leading to an advanced variant named MB-NGCF.
MB-LightGCN: We modify LightGCN [26] to consider multiple behaviors in datasets, resulting in an advanced variant called MB-LightGCN.
Among them, MF, NGCF, and LightGCN are designed for single-behavior recommendations, while MB-NGCF, MB-LightGCN, and MB-GMN are tailored for multi-behavior recommendations.
In multi-behavior recommendation research, numerous publicly available datasets exhibit distinct characteristics, influencing the choice of datasets across various studies employing different methodologies. Recent works such as [30, 31] utilized the IJCAI dataset for their experiments. However, they did not incorporate the UserBehavior dataset, leading to our decision to exclude these studies from our analysis.
Evaluation metrics. We used two metrics widely adopted in top-N recommendation tasks: Hit Rate (HR@N) and Normalized Discounted Cumulative Gain (NDCG@N), to evaluate the performance of our proposed cd-MBRec model and the baseline models.
Implementation details. In the cd-MBRec model, we trained the EA module and the WA module separately, using a fusion strategy for the final prediction. The EA module uses the Bayesian Personalized Ranking (BPR) loss function and the Adam optimizer, while the WA module applies the Mean Squared Error (MSE) loss function and Stochastic Gradient Descent (SGD) for optimization.
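For reference, the BPR objective used for the EA module can be sketched as follows. This is the standard pairwise form, written with NumPy and without a regularization term; the function name and the batch layout (one positive and one sampled negative item per user) are assumptions of this sketch rather than details confirmed by the paper.

```python
import numpy as np

def bpr_loss(user_emb, pos_item_emb, neg_item_emb):
    """Mean BPR loss over a batch of (user, positive item, negative item) triples.

    Each argument has shape (batch_size, dim).
    """
    pos_scores = np.sum(user_emb * pos_item_emb, axis=1)
    neg_scores = np.sum(user_emb * neg_item_emb, axis=1)
    diff = pos_scores - neg_scores
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); logaddexp keeps it stable.
    return float(np.mean(np.logaddexp(0.0, -diff)))

users = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[2.0, 0.0], [0.0, 2.0]])
negatives = np.array([[0.0, 1.0], [1.0, 0.0]])
loss = bpr_loss(users, positives, negatives)
```

The loss decreases as positive items are scored higher than the sampled negatives, and it equals log 2 when both receive the same score.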
We varied the involving weight from 0 to 0.9 and selected the embedding dimension from 4, 16, 32, …, 1024. The parameters are set based on empirical knowledge, and their selection for optimal model performance is guided by experimental results. The cd-MBRec model uses the leave-one-out strategy to split the training and test sets. Specifically, the first item purchased by each user in the subsequent time period was kept as the test item, while all previous items were used for training. We randomly selected 99 items that were not purchased to serve as negative samples.
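Under this leave-one-out protocol, each test user has exactly one held-out item ranked against the sampled negatives, so HR@N and NDCG@N reduce to simple functions of the rank of that item. The following sketch illustrates the computation; the function names are hypothetical, and resolving ties in favor of the held-out item is an assumption.

```python
import math

def rank_of_target(target_score, negative_scores):
    """0-based rank of the held-out item: negatives scored strictly higher."""
    return sum(1 for s in negative_scores if s > target_score)

def hit_rate_at_n(rank, n):
    """1 if the held-out item appears in the top-n list, else 0."""
    return 1.0 if rank < n else 0.0

def ndcg_at_n(rank, n):
    """With one relevant item the ideal DCG is 1, so NDCG is the discounted gain."""
    return 1.0 / math.log2(rank + 2) if rank < n else 0.0

def evaluate(per_user_scores, n=10):
    """Average HR@n and NDCG@n over (target_score, negative_scores) pairs."""
    ranks = [rank_of_target(t, negs) for t, negs in per_user_scores]
    hr = sum(hit_rate_at_n(r, n) for r in ranks) / len(ranks)
    ndcg = sum(ndcg_at_n(r, n) for r in ranks) / len(ranks)
    return hr, ndcg

# Two toy users with three negatives each (the experiments use 99 negatives).
hr, ndcg = evaluate([(0.8, [0.9, 0.5, 0.3]), (0.7, [0.1, 0.2, 0.3])], n=10)
```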
We evaluated model performance in predicting users’ next purchasing items on the UserBehavior and IJCAI datasets. Tables 2 and 3 show the results of top-N item recommendation for different methods. The best and second-best results are highlighted in bold and underlined, respectively. Single-behavior, multi-behavior baselines, and our model are separated by horizontal lines, with improvement rates calculated at the bottom of the tables.
Performance comparison of different models on the UserBehavior dataset
Performance comparison of different models on the IJCAI dataset
We observed that most multi-behavior models (cd-MBRec, MB-GMN, MB-NGCF and MB-LightGCN) surpass models that rely on a single behavior for recommendations (MF, LightGCN and NGCF), which supports the value of utilizing different types of behaviors to obtain more accurate recommendations.
Our cd-MBRec consistently outperforms the baselines (average improvements of 4.96% in HR@10 and 7.75% in NDCG@10), demonstrating its superiority in multi-behavior recommendation tasks. By taking interaction matrices into account and assigning independent weights to each behavior, cd-MBRec effectively captures the commonality and diversity between multiple user behaviors. Additionally, the utilization of graph neural networks to optimize user and item embeddings enables the identification of potential correlations between behaviors, leading to improved prediction of user interests in recommender systems.
We conducted ablation experiments on the sub-modules of the cd-MBRec framework by manually setting the involvement parameter to 1 or 0. In the full cd-MBRec model, the involvement parameters of EA and WA are dynamically adjusted based on specific requirements. Table 4 shows the recommendation performance of each module on the two datasets, with the following configurations.
Ablation study of the cd-MBRec model
w/o EA: The involvement parameter of the EA module is set to 0.
w/o WA: The involvement parameter of the WA module is set to 0.
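These configurations can be illustrated with a small fusion sketch. It assumes that the final score is a weighted sum of the two module predictions with one involvement parameter per module; the function name, the score values, and the exact relation between the two parameters are hypothetical.

```python
import numpy as np

def fuse_predictions(ea_scores, wa_scores, ea_weight, wa_weight):
    """Weighted sum of the EA and WA predictions for the same user-item pairs."""
    return ea_weight * np.asarray(ea_scores) + wa_weight * np.asarray(wa_scores)

ea_scores = np.array([0.2, 0.8])   # hypothetical EA predictions
wa_scores = np.array([0.6, 0.4])   # hypothetical WA predictions

without_ea = fuse_predictions(ea_scores, wa_scores, ea_weight=0.0, wa_weight=1.0)
without_wa = fuse_predictions(ea_scores, wa_scores, ea_weight=1.0, wa_weight=0.0)
mixed = fuse_predictions(ea_scores, wa_scores, ea_weight=0.5, wa_weight=0.5)
```

Setting one weight to 0 removes the corresponding module from the final prediction, which is how the two ablation variants are obtained in this sketch.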
The results validate the effectiveness of each sub-module within cd-MBRec. By integrating the EA and WA modules, the cd-MBRec framework outperforms each module individually across both datasets. This demonstrates the model’s capability to achieve a fine balance in capturing the commonalities and diversities of user interactions, thereby optimizing recommendation performance.
Furthermore, we found that the EA module contributes more than the WA module, since performance degrades more when the EA module is removed. This could be attributed to the ability of the EA module to more efficiently capture the high-order and low-order interaction information between users and items using multi-behavior graph neural networks.

Performance of cd-MBRec with different weights assigned to the WA module.
We explored various configurations of key parameters in the cd-MBRec model, including the involving weight of the WA and EA modules and the embedding dimension in the WA module. Our objective is to comprehensively evaluate the performance of the proposed cd-MBRec model under different hyperparameter settings and uncover the internal relations between these settings and the results.
The impact of involving weight
Our findings underscore the importance of appropriately adjusting the involving weight in different application scenarios to achieve an optimal balance between the EA and WA modules, thus optimizing recommendation performance.
In the UserBehavior dataset for single-item recommendation (
In the IJCAI dataset, we observed a decreasing trend in cd-MBRec’s recommendation performance as the involving weight

Performance of cd-MBRec with different dimensions of the hidden state.
The impact of embedding dimension
On both datasets, we observed improved results as the embedding dimension
However, it is important to note that a larger embedding dimension does not necessarily lead to significant improvements in recommendation performance when
To investigate the generality of the multi-behavior commonality extracting strategy employed in the EA module, we conducted comparative experiments on the UserBehavior and IJCAI datasets. LightGCN and MB-GMN were used to model behaviors between users and items. The experimental data are presented in Table 5.
Impact of the commonality extracting strategy on the performance of LightGCN and MB-GMN
Single-behavior: This approach focuses on a single specific behavior, with purchase as the target behavior. Since MB-GMN is designed for multi-behavior recommendations, its performance metrics are not applicable in the single-behavior context.
Multi-behavior: User’s inputs are derived from the combination of multiple behaviors. LightGCN is adapted into MB-LightGCN, as detailed in Tables 2 and 3.
Multi-behavior

Comparison of different weight allocation strategies.
The results indicate that considering multiple behaviors together yields better recommendation performance compared to focusing on a single behavior alone. Our integration strategy, which includes the interaction matrix, explicitly accounts for behavior commonality and outperforms the multi-behavior fusion approach. This demonstrates its superiority in exploring user interests through various behaviors.
Due to the significant disparity in the proportions of different behaviors within the dataset, we adjusted the weights in the WA module to account for this diversity, thereby mitigating the data sparsity issue. The initial weights can be set either manually or automatically using the Weber-Fechner law. In Fig. 5, we evaluate the model’s performance using three different weight allocation strategies.
Specifically, the label 1:1:1:1 indicates that all user behaviors are equally weighted. The label 2:1:1:1 means that the weight of pv is doubled (since pv is the most frequently occurring behavior in both datasets), while the weights of other behaviors remain unchanged. The Weber columns display the results of weights calculated using the Weber-Fechner law, which is the default in our model.
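The three strategies differ only in the weight list used to build the weighted score matrix. The sketch below illustrates this on toy data. The exact Weber-Fechner formula of the model is not reproduced in this text, so the logarithmic inverse-frequency form used for the Weber entry is a placeholder assumption, including the choice that rarer behaviors receive larger weights; the function names and toy matrices are likewise hypothetical.

```python
import numpy as np

BEHAVIORS = ["pv", "fav", "cart", "purchase"]

def log_inverse_frequency_weights(counts):
    """Placeholder log-based weights: log(1 + total / count) per behavior.

    This form is an assumption, not the formula used in the paper.
    """
    counts = np.asarray(counts, dtype=float)
    return np.log1p(counts.sum() / counts)

def weighted_score_matrix(behavior_matrices, weights):
    """Multiply each behavior matrix by its weight and sum the results."""
    return sum(w * m for w, m in zip(weights, behavior_matrices))

# Toy interaction matrices (2 users, 3 items), one per behavior in BEHAVIORS.
pv = np.array([[1, 1, 0], [1, 0, 1]], dtype=float)
fav = np.array([[0, 1, 0], [0, 0, 0]], dtype=float)
cart = np.array([[0, 1, 0], [0, 0, 1]], dtype=float)
purchase = np.array([[0, 0, 0], [0, 0, 1]], dtype=float)
matrices = [pv, fav, cart, purchase]
counts = [m.sum() for m in matrices]          # occurrences per behavior

strategies = {
    "1:1:1:1": np.ones(4),
    "2:1:1:1": np.array([2.0, 1.0, 1.0, 1.0]),
    "Weber": log_inverse_frequency_weights(counts),
}
score_matrices = {name: weighted_score_matrix(matrices, w)
                  for name, w in strategies.items()}
```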
Increasing the weight proportion of pv behavior reveals distinct trends in the UserBehavior and IJCAI datasets, suggesting that weight distribution should be adaptively adjusted based on different situations. Weights determined by the Weber-Fechner Law achieve the best performance, demonstrating the superiority of our proposed method.
Moreover, the fixed weight allocation provided by the Weber-Fechner Law enhances cd-MBRec’s interpretability, making the model more comprehensible and manageable. This distinct feature gives cd-MBRec a significant advantage over other models.
Conclusion
In this paper, we introduce cd-MBRec, an innovative recommendation model designed to explicitly capture both the commonalities and diversities inherent in multi-behavioral user interactions. Our model employs two key strategies: the behavior-commonality embedding aggregation strategy, which effectively reduces noise from user habits, and the diversity-enhancing strategy, which emphasizes the significance of multiple behaviors in various contextual scenarios. Extensive experiments on two real-world datasets demonstrate that cd-MBRec significantly outperforms state-of-the-art methods. Future work will focus on further exploring the applicability and effectiveness of multi-behavioral modeling and denoising techniques across a broader range of applications.
Acknowledgments
This work is partly supported by the Shanghai Science and Technology Innovation Action Plan Project (No. 22511100700).
Compliance with Ethical Standards
Conflict of interest statement. No conflict of interest exists in the submission of this manuscript, and the manuscript has been approved by all authors for publication. On behalf of my coauthors, I declare that the work described is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part. All listed authors have approved the enclosed manuscript. This article does not contain any studies with human participants performed by any of the authors.
