Fusing semantic and syntactic information for aspect sentiment triplet extraction

Abstract

Aspect Sentiment Triplet Extraction (ASTE) aims to extract, from a sentence, triplets consisting of an aspect term, its sentiment polarity, and the opinion term that explains the sentiment. Many existing studies model the context with graph neural networks to learn relevant information from the generated graphs. However, some sentences contain syntactic errors or lack clear grammatical structure, which can degrade the performance of models that rely on syntax alone. In this paper, we propose the Fusing Semantic and Syntactic Information for Aspect Sentiment Triplet Extraction (FSSI) model, which incorporates both syntactic structure and semantic relevance in the context. Specifically, we construct a syntactic graph convolutional network to obtain comprehensive syntactic structure information and a semantic graph convolutional network that uses the self-attention mechanism to obtain the global semantic relevance of sentences. Furthermore, we concatenate the graph representations generated by the two graph convolutional networks to obtain the final enhanced representation. Finally, we apply an effective inference strategy to extract triplets. Extensive experimental results on benchmark datasets show that our model outperforms state-of-the-art approaches.

Keywords

Aspect sentiment triplet extraction, graph convolutional network, sentiment analysis

1 Introduction

Aspect-based Sentiment Analysis (ABSA) [1, 2] is a fine-grained sentiment analysis task that aims to extract the aspect terms in a sentence and their corresponding sentiment polarities. Building on ABSA, Peng et al. [3] proposed a more fine-grained task, Aspect Sentiment Triplet Extraction (ASTE). ASTE focuses on extracting all triplets from the input sentence, where every triplet contains an aspect term, its sentiment polarity, and an opinion term. As shown in Fig. 1, the sentiment polarities of the two aspect terms “image” and “sound” are “positive” and “negative”, respectively, and the corresponding opinion terms are “great” and “poor”.

Fig. 1

An example of an ASTE task with a dependency tree.

Existing sentiment triplet extraction models fall into two main groups. The pipeline approach [3] decomposes the task into independent subtasks and performs the extraction of aspect terms and opinion terms and the determination of sentiment polarity in two distinct steps. This technique lacks a comprehensive view of the task and suffers from error propagation: errors made in the first stage propagate to the second stage and degrade the overall performance. Recent end-to-end frameworks [4–9] extract triplets in a single pass by designing a unified tagging scheme. However, these methods do not effectively establish connections between words and linguistic features. Because the elements of a triplet interact in various ways, more effort has been devoted to graph convolutional networks (GCNs) and graph neural networks (GNNs) over dependency trees. These networks explicitly exploit the syntactic structure of a sentence to establish syntactic dependencies between its words. Taking Fig. 1 as an example, there is a nominal subject dependency between “image” and “great”, which indicates the presence of an aspect term. The two opinion terms “great” and “poor” are also connected to each other in the dependency tree, which implies that they have similar attributes. Nevertheless, previous works [8, 9] ignored the importance of modeling semantic information separately and did not effectively address the incomplete extraction of multi-word aspects or opinions.

In this paper, we propose the Fusing Semantic and Syntactic Information for Aspect Sentiment Triplet Extraction (FSSI) model to solve the ASTE task; its overall framework is shown in Fig. 2. FSSI exploits the complementary nature of semantic and syntactic information to make full use of the relationships between words. First, we construct a syntactic graph convolutional network that obtains rich syntactic knowledge from the probability matrix of dependency arcs produced by a dependency parser. This network has clear advantages in dealing with complex syntactic structures: by using the probability matrix to represent the dependencies between words, it captures rich syntactic information for better understanding and parsing of sentence structures. Then, semantic information is obtained by the semantic graph convolutional network, which compensates for cases where complete syntactic information cannot be obtained because the syntactic structure is unclear. Moreover, the primitive dependency tree can be regarded as a hard attention mechanism [10]: when the aspect term is distant from its opinion term, the model may suffer performance degradation because it fails to accurately capture syntactic information. To fully utilize the information in the semantic space, we build a semantic enhancement module by using the self-attention mechanism in the graph convolutional network. The attention matrix formed by self-attention represents the semantic correlation between words. Finally, we concatenate the outputs of the syntactic and semantic graph convolutional networks. Our extensive experiments on four benchmark datasets confirm that FSSI achieves strong performance compared with existing state-of-the-art approaches.

Fig. 2

The overall architecture of our end-to-end model FSSI.

Our contributions can be summarized as follows:

1) We propose an FSSI model for the ASTE task. The model uses two graph convolutional networks to obtain semantic and syntactic information, respectively.

2) We construct a semantic graph convolution module and a syntactic graph convolution module. The self-attention mechanism yields an attention matrix that serves as the adjacency matrix and extracts the most relevant information in the semantic space. At the same time, the dependency probability matrix is used to capture syntactic structure information.

3) We conduct extensive experiments on four benchmark datasets. The results show that FSSI outperforms the compared baselines on 15res and 16res and remains competitive on 14res and 14lap.

2 Related work

Traditional sentiment analysis tasks are typically sentence-level [11, 12] or document-level [13, 14]. In contrast, Aspect-based Sentiment Analysis (ABSA) as a fine-grained sentiment analysis task aims to analyze aspect or entity oriented sentiment tendencies from unstructured text. The research process of ABSA can be divided into three phases: separate extraction, pair extraction, and triplet extraction.

Early ABSA work explored three separate subtasks. Aspect Term Extraction (ATE) [15–21] and Opinion Term Extraction (OTE) [22–26] extract the aspect terms and the opinion terms in a given sentence, respectively. Aspect Sentiment Classification (ASC) [27–33] aims to predict the sentiment polarity of a given aspect term in a sentence.

Although each single task can reach satisfactory performance on its own, the dependencies among these subtasks are ignored. Therefore, as new tasks and new benchmark datasets were introduced, work began on coupling two subtasks at a time, namely aspect-based pair extraction. There are two main tasks in this category: Aspect Opinion Pair Extraction (AOPE) [34–39] and Aspect Sentiment Pair Extraction (ASPE) [40–43].

Recently, to better understand the relationships between the various subtasks, Peng et al. [3] first introduced the ASTE task and presented a two-stage pipeline model. In the first stage, they treated the extraction of aspect terms and opinion terms as a sequence labeling problem, where aspect terms and sentiments were co-extracted as unified labels. In the second stage, they matched aspects and opinions to obtain the final triplets. To further explore this task, later works [4–9] employ an end-to-end approach to address the ASTE task. For instance, Xu et al. [5] proposed an end-to-end model with a new position-aware tagging scheme that jointly extracts the triplets. Wu et al. [4] exploited the Grid Tagging Scheme (GTS) to process ASTE tasks. Yan et al. [7] converted the ASTE task into a generative formulation. Chen et al. [8] addressed the problem via a Graph Convolutional Network (GCN). However, these methods usually do not integrate syntactic structure and semantic relevance effectively, and their accuracy drops when aspect terms or opinion terms consist of multiple words.

3 Model framework

We design an effective framework to accomplish triplet extraction in an end-to-end fashion. The architecture of the whole model is shown in Fig. 2. In this section, we first define the ASTE task, then describe the tagging schema and the triplet decoding procedure, and finally present the modules of the model.

3.1 Task definition

Let X = {w₁, w₂, …, w_n} denote a sentence of n tokens. The model aims to output a set of triplets $T = \{(a, o, s)_{m}\}_{m = 1}^{|T|}$, where |T| denotes the number of triplets in X. Each (a, o, s) represents a triplet in the sentence, where a, o, and s indicate the aspect term, the opinion term, and the corresponding sentiment polarity, respectively. Furthermore, s belongs to the sentiment label set S = {POS, NEU, NEG}, which consists of three sentiment polarities: positive, neutral, and negative.

3.2 Relation definition and table filling

We use the 10 labels G = {B-A, I-A, B-O, I-O, A, O, NEG, NEU, POS, N} proposed by Chen et al. [8] to denote the relationship between any two words in the sentence. B-A and I-A denote the beginning and inside of the extracted aspect terms, respectively. B-O and I-O represent the beginning and inside of the extracted opinion terms, respectively. A and O indicate that the two words of a word pair belong to the same aspect term or the same opinion term. In addition, the tags POS, NEU, and NEG give the sentiment polarity of an aspect-opinion pair, and N marks a word pair that has none of the above relations. Thus, a relation table can be constructed for each labeled sentence using table filling [44, 45]. In Fig. 3, we show the word pairs and their relations for an example sentence. Here, each cell corresponds to a word pair with one relation.

Fig. 3

An example of EMC-GCN labeling.
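The table-filling step can be illustrated with a minimal sketch. It assumes inclusive token spans and fills sentiment cells symmetrically; both choices, as well as the function name `build_relation_table`, are illustrative assumptions rather than details taken from the original implementation.

```python
def build_relation_table(tokens, triplets):
    """Fill an n x n relation table from gold triplets.

    triplets: list of (aspect_span, opinion_span, sentiment), where each
    span is an inclusive (start, end) pair of token indices (assumed format).
    """
    n = len(tokens)
    table = [["N"] * n for _ in range(n)]  # "N": no relation

    def mark_term(span, kind):
        start, end = span
        for i in range(start, end + 1):
            # Diagonal cells carry the begin/inside tag of the term.
            table[i][i] = ("B-" if i == start else "I-") + kind
            for j in range(start, end + 1):
                if i != j:
                    # Word pairs inside one term share the tag A or O.
                    table[i][j] = kind

    for aspect, opinion, sentiment in triplets:
        mark_term(aspect, "A")
        mark_term(opinion, "O")
        for i in range(aspect[0], aspect[1] + 1):
            for j in range(opinion[0], opinion[1] + 1):
                # Assumption: sentiment relations are stored symmetrically.
                table[i][j] = sentiment
                table[j][i] = sentiment
    return table


tokens = "The image is great but the sound is poor .".split()
table = build_relation_table(
    tokens,
    [((1, 1), (3, 3), "POS"), ((6, 6), (8, 8), "NEG")],
)
```

For the sentence of Fig. 1, the cell for (“image”, “great”) holds POS and the cell for (“sound”, “poor”) holds NEG, while unrelated pairs keep the label N.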

3.3 Triplet decoding

We adopt the decoding algorithm designed by Chen et al. [8]. First, aspect terms and opinion terms are extracted using the predicted relations of all word pairs (w_i, w_i) on the main diagonal. Second, we judge whether the extracted aspect terms and opinion terms match. In particular, for an extracted aspect term a and an extracted opinion term o, we collect the predicted relations of all word pairs (w_i, w_j), where w_i ∈ a and w_j ∈ o. The aspect term and the opinion term are considered paired if any of these predicted relations is a sentiment relation; otherwise they are not paired. Finally, the most frequent sentiment label s ∈ S among these word pairs is taken as the sentiment polarity of the triplet (a, o, s).
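The three decoding steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the reference implementation of [8]: the table is a list of lists of label strings, spans are inclusive index pairs, and ties between sentiment labels are broken arbitrarily. The names `extract_spans` and `decode_triplets` are hypothetical.

```python
from collections import Counter

SENTIMENTS = {"POS", "NEU", "NEG"}


def extract_spans(table, kind):
    """Read B-x / I-x runs off the main diagonal as inclusive (start, end) spans."""
    spans, start = [], None
    for i in range(len(table)):
        tag = table[i][i]
        if tag == "B-" + kind:
            if start is not None:
                spans.append((start, i - 1))
            start = i
        elif tag == "I-" + kind and start is not None:
            continue  # the current term keeps growing
        else:
            if start is not None:
                spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(table) - 1))
    return spans


def decode_triplets(table):
    """Pair every aspect span with every opinion span and vote on the sentiment."""
    triplets = []
    for a in extract_spans(table, "A"):
        for o in extract_spans(table, "O"):
            votes = Counter(
                table[i][j]
                for i in range(a[0], a[1] + 1)
                for j in range(o[0], o[1] + 1)
                if table[i][j] in SENTIMENTS
            )
            if votes:  # at least one sentiment relation: the pair matches
                triplets.append((a, o, votes.most_common(1)[0][0]))
    return triplets
```

Voting over all word pairs of a span pair is what allows a multi-word term such as “battery life” to receive a single polarity even when individual cells disagree.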

3.4 Semantic and syntactic enhanced module

The aim of the ASTE task is to extract multiple elements from a sentence, so it is necessary to efficiently distinguish the attributes of words and to capture the relationships between them. In order to obtain multifaceted features, we use two GCNs to obtain syntactic dependencies and semantic correlations between words.

3.4.1 Input and encoding layers

For a given sentence X = {w₁, w₂, … w_n} of length n, we utilize BERT [46] to obtain the contextualized word representations. Then, the sentence is represented as H = {h₁, h₂, … h_n}.

3.4.2 Syntactic graph convolution module

We construct a graph convolutional network that takes syntactic information as input to obtain comprehensive syntactic information about the sentence, as shown in Fig. 4. First, we utilize the LAL-Parser [47] to obtain the dependency tree of the input sentence. Then, the dependency probability matrix is obtained from the dependency parse, which can alleviate the effect of dependency parsing errors.

Fig. 4

Architecture of Syntactic Graph Convolution Module.

Given a graph with n nodes, we use the adjacency matrix A^syn ∈ R^n×n to represent the graph. The element $A_{ij}^{syn}$ in A^syn indicates whether nodes w_i and w_j are connected. Specifically, if the i-th node is connected to the j-th node, $A_{ij}^{syn}$ = 1, otherwise $A_{ij}^{syn}$ = 0. Furthermore, A^syn consisting of 0 and 1 can be considered as the final discrete output of the dependency parser. For the i-th node in the l-th layer, the hidden state representation is denoted as $h_{i}^{l}$ , which can be formulated as:

$h_{i}^{l} = \sigma \left( \sum_{j = 1}^{n} A_{ij}^{syn} W^{l} h_{j}^{l - 1} + b^{l} \right)$ (1)

where W^l is a weight matrix, b^l is a bias term, and σ is an activation function (e.g., ReLU).

The syntactic graph convolution module takes the adjacency matrix A^syn as its syntactic encoding and adopts the hidden state vectors H from BERT as the initial node representations in the syntactic graph. Then formula (1) is used to obtain the syntactic graph representation H^syn = $\{h_{1}^{syn}, h_{2}^{syn}, \dots, h_{n}^{syn}\}$.
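To make formula (1) concrete, the following sketch applies one graph convolution layer with plain Python lists. It is a dependency-free illustration rather than the actual model code: the function name `gcn_layer`, the weight layout (one row per output dimension), and the choice of ReLU are assumptions, and the adjacency matrix may hold either 0/1 entries or arc probabilities.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def gcn_layer(adj, h_prev, weight, bias):
    """One layer of Eq. (1): h_i = ReLU(sum_j adj[i][j] * (W h_j) + b).

    adj:    n x n adjacency (0/1 values or dependency-arc probabilities)
    h_prev: n node vectors of size d_in
    weight: d_out rows of size d_in (assumed layout)
    bias:   d_out values
    """
    n, d_out = len(h_prev), len(weight)
    # Transform every node once: wh[j] = W h_j.
    wh = [[sum(w * x for w, x in zip(row, h_j)) for row in weight] for h_j in h_prev]
    out = []
    for i in range(n):
        # Aggregate the transformed neighbours of node i, weighted by adj[i][j].
        agg = [sum(adj[i][j] * wh[j][k] for j in range(n)) + bias[k] for k in range(d_out)]
        out.append([relu(v) for v in agg])
    return out


adj = [[1, 1], [0, 1]]
h0 = [[1.0, 2.0], [3.0, -1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = gcn_layer(adj, h0, identity, [0.0, 0.0])
```

With an identity weight matrix, node 0 simply sums its own vector and that of node 1, while node 1 keeps only its own vector and the negative component is clipped by ReLU.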

3.4.3 Semantic graph convolution module

Unlike the syntactic graph convolution module, the semantic graph convolution module obtains its adjacency matrix, an attention matrix A^sem, through the self-attention mechanism. The self-attention mechanism captures the semantic relevance of each word in a sentence, which helps the model to correctly understand the individual words in the sentence and to learn the relationships between words. Capturing contextual information through self-attention is more flexible than relying on syntactic structure. Moreover, the module can effectively handle sentences that are insensitive to syntactic information.

The attention scores for all pairs of elements are calculated in parallel via self-attention. In FSSI, we compute the attention score matrix A^sem ∈ R^n×n with a self-attention layer and use A^sem as the adjacency matrix of our semantic graph convolution module, which can be formulated as:

$A^{sem} = \mathrm{softmax} \left( \frac{Q W^{Q} \times (K W^{K})^{T}}{\sqrt{d}} \right)$ (2)

where the matrices Q and K are both equal to the graph representation of the previous layer of the module, W^Q and W^K are learnable weight matrices, and d is the dimension of the input node features. Similar to the syntactic graph convolution module, the semantic graph convolution module obtains the graph representation H^sem.
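Formula (2) can be sketched in a few lines of dependency-free Python. The sketch assumes a single attention head and a softmax taken over each row, so that every word distributes a total weight of 1 over the sentence; these choices and the helper names are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention_adjacency(h, w_q, w_k):
    """Eq. (2) with Q = K = h: softmax((h W_Q)(h W_K)^T / sqrt(d))."""
    d = len(h[0])  # dimension of the input node features
    q = matmul(h, w_q)
    k = matmul(h, w_k)
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    return softmax_rows([[s / math.sqrt(d) for s in row] for row in scores])


eye = [[1.0, 0.0], [0.0, 1.0]]
a_sem = attention_adjacency(eye, eye, eye)
```

Unlike the dependency-based matrix, every entry of this matrix is strictly positive, so each word can draw information from every other word regardless of the parse.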

3.4.4 Prediction layer

To efficiently capture the relevant features of the two modules, we concatenate the semantic graph representation with the syntactic graph representation, i.e., $H_{i} = [H_{i}^{syn}; H_{i}^{sem}]$. We then concatenate the augmented representations of w_i and w_j to obtain the representation of the word pair (w_i, w_j), i.e., s_ij = [H_i ; H_j], where [;] denotes the vector concatenation operation. Finally, the output representation s_ij is fed into a linear layer followed by a softmax function to generate a label probability distribution p_ij, i.e.,

$p_{ij} = \mathrm{softmax} (W_{p} s_{ij} + b_{p})$ (3)

where W_p and b_p are the learnable weight and bias.
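The prediction step can be sketched as follows, again with plain Python lists. The function name `pair_label_probs` and the weight layout (one row of W_p per label) are assumptions made for illustration only.

```python
import math


def pair_label_probs(h_syn, h_sem, w_p, b_p):
    """Return probs[i][j], the label distribution of word pair (w_i, w_j).

    h_syn, h_sem: n vectors each (outputs of the two graph modules)
    w_p: one weight row per label, each of size 2 * (d_syn + d_sem)
    b_p: one bias value per label
    """
    # H_i = [H_i^syn ; H_i^sem]  (list concatenation)
    h = [syn + sem for syn, sem in zip(h_syn, h_sem)]
    n = len(h)
    probs = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s_ij = h[i] + h[j]  # s_ij = [H_i ; H_j]
            logits = [sum(w * x for w, x in zip(row, s_ij)) + b for row, b in zip(w_p, b_p)]
            mx = max(logits)
            exps = [math.exp(z - mx) for z in logits]
            total = sum(exps)
            probs[i][j] = [e / total for e in exps]  # Eq. (3)
    return probs


h_syn = [[1.0], [0.0]]
h_sem = [[0.0], [2.0]]
w_p = [[0.0] * 4 for _ in range(3)]  # 3 labels, pair vector of size 4
probs = pair_label_probs(h_syn, h_sem, w_p, [0.0, 0.0, 5.0])
```

Because s_ij is built by concatenation, the pair (w_i, w_j) and the pair (w_j, w_i) generally receive different distributions unless the table is symmetrized afterwards.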

3.5 Loss function

In order to fully learn the information between the two modules, we impose a constraint, i.e.,

$L_{d} = \frac{1}{∥ A^{sem} - A^{syn} ∥_{F}}$ (4)

The total objective function is as follows:

$L = L_{p} + \lambda_{1} L_{d} + \lambda_{2} \| \Theta \|_{2}$ (5)

where λ₁ and λ₂ are regularization coefficients and Θ denotes all trainable model parameters. The ASTE task employs a standard cross entropy loss, i.e.,

$L_{p} = - \sum_{i}^{n} \sum_{j}^{k} \sum_{c \in C} \mathbb{I} (y_{ij} = c) \log (p_{i, j | c})$ (6)

where $\mathbb{I} (\cdot)$ is the indicator function, and y_ij denotes the ground truth label of word pair (w_i, w_j).
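The three loss terms of formulas (4) to (6) can be sketched as below. Several details are assumptions made for the illustration: a small `eps` is added to the denominator of L_d to avoid division by zero, the cross entropy is summed over every cell of the relation table, and the parameters Θ are passed as a flat list of numbers. The default coefficients follow the values reported in Section 4.3.

```python
import math


def frobenius_norm(m):
    return math.sqrt(sum(v * v for row in m for v in row))


def difference_loss(a_sem, a_syn, eps=1e-8):
    """Eq. (4): L_d = 1 / ||A_sem - A_syn||_F (eps is an added safeguard)."""
    diff = [[s - t for s, t in zip(row_s, row_t)] for row_s, row_t in zip(a_sem, a_syn)]
    return 1.0 / (frobenius_norm(diff) + eps)


def cross_entropy(probs, gold):
    """Eq. (6): probs[i][j] maps each label to its probability, gold[i][j] is the true label."""
    return -sum(
        math.log(probs[i][j][gold[i][j]])
        for i in range(len(gold))
        for j in range(len(gold[i]))
    )


def total_loss(l_p, l_d, params, lambda_1=0.2, lambda_2=1e-4):
    """Eq. (5): L = L_p + lambda_1 * L_d + lambda_2 * ||Theta||_2."""
    l2 = math.sqrt(sum(p * p for p in params))
    return l_p + lambda_1 * l_d + lambda_2 * l2


l_d = difference_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]])
l_p = cross_entropy([[{"POS": 0.5, "N": 0.5}]], [["POS"]])
loss = total_loss(l_p, l_d, [3.0, 4.0])
```

Since L_d is the reciprocal of the distance between the two adjacency matrices, minimizing it pushes A^sem and A^syn apart, which encourages the two modules to encode different information.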

4 Experiments

4.1 Datasets

We evaluate our method on four datasets compiled by Wu et al. [4]. Table 1 lists the statistics of these datasets. 14res, 15res and 16res belong to the restaurant domain, while 14lap belongs to the laptop domain. Each sentence has been labeled with a series of aspect tags and opinion tags, as well as the sentiment polarity of the corresponding aspects. These datasets were originally derived from the SemEval Challenges (Pontiki et al., 2014 [2], 2015 [48], 2016 [49]).

Table 1
Statistics of the datasets

Split	14res #Sentence	14res #Triplet	14lap #Sentence	14lap #Triplet	15res #Sentence	15res #Triplet	16res #Sentence	16res #Triplet
train	1259	2356	899	1452	603	1038	863	1421
dev	315	580	225	383	151	239	216	348
test	493	1008	332	547	325	493	328	525

4.2 Baselines

We compare the performance of FSSI to state-of-the-art baselines. These models are simply categorized into two groups.

1) Pipeline methods

Peng-unified-R+PD [3]. The first stage (Peng-unified-R) treats aspect terms and opinion terms extraction as a sequence labeling problem, where aspect terms and sentiment polarities are co-extracted as unified labels. In the second stage, it applies an MLP-based classifier (PD) to determine whether each triplet is valid or not.

Li-unified-R+PD [3]. In the first stage, it jointly identifies aspect terms and their sentiments using Li-unified [41]. Meanwhile, it predicts opinion terms with an original OE component. In the second stage, it also uses the same classifier (PD) to obtain all the triplets.

Peng-unified-R+IOG [4]. It uses the model Peng-unified-R in the first stage to extract aspect-sentiment pairs, and then generates triplets via IOG [24].

IMN+IOG [4]. It uses the first stage of IMN model [43] to extract aspect-sentiment pairs, and then obtains triplets via IOG [24].

2) End-to-end methods

S³E². Chen et al. [9] designed a graph-sequence dual representation and modeling paradigm for the ASTE task.

Grid. Wu et al. [4] proposed the Grid Tagging Scheme (GTS), an end-to-end method for the ASTE task. Their model is evaluated with three different encoders (CNN, BiLSTM and BERT), reported as Grid-CNN, Grid-BiLSTM and Grid-BERT in Table 2.

EMC-GCN. Chen et al. [8] proposed a multi-channel graph convolutional network to solve the ASTE task by exploiting the relationships between words.

4.3 Implementation details

We use the BERT-base-uncased version as our sentence encoder. The learning rate is 2 × 10^-5 for BERT fine-tuning and 10^-5 for the other trainable parameters, and the dropout rate is 0.5. We optimize FSSI with the AdamW optimizer [50]. The hidden state dimensions of BERT and the GCN are set to 768 and 300, respectively. The dropout rate of the GCN is set to 0.1, and the number of GCN layers is 2. The FSSI model is trained for 100 epochs with a batch size of 8. Additionally, we set the hyperparameters λ₁ and λ₂ to 0.2 and 10^-4, respectively. All sentences are parsed by Stanza [51].
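For convenience, the settings listed above can be collected in a single configuration object. The sketch below is only one possible way to organize them: the class name `FSSIConfig` and all field names are hypothetical, and only the values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FSSIConfig:
    """Hyperparameters as reported in the text (field names are illustrative)."""
    encoder_name: str = "bert-base-uncased"  # assumed identifier for BERT-base-uncased
    bert_lr: float = 2e-5        # learning rate for BERT fine-tuning
    other_lr: float = 1e-5       # learning rate for the other trainable parameters
    dropout: float = 0.5
    gcn_dropout: float = 0.1
    bert_hidden_dim: int = 768
    gcn_hidden_dim: int = 300
    gcn_layers: int = 2
    epochs: int = 100
    batch_size: int = 8
    lambda_1: float = 0.2        # weight of L_d in Eq. (5)
    lambda_2: float = 1e-4       # weight of the parameter norm in Eq. (5)


config = FSSIConfig()
```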

Following previous work, we report experimental results based on precision (P), recall (R), and F1 scores. Note that the F1 score measures the performance of the triplets, which means that a triplet is correct only if its aspect span, corresponding sentiment and opinion span are all correct.
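The exact-match criterion can be made explicit with a short sketch. It assumes that counts are accumulated over the whole test set before precision and recall are computed (micro-averaging) and that each triplet is a hashable tuple; the function name `triplet_prf` is hypothetical.

```python
def triplet_prf(pred_by_sentence, gold_by_sentence):
    """Exact-match precision, recall and F1 over a corpus of triplet lists."""
    tp = n_pred = n_gold = 0
    for pred, gold in zip(pred_by_sentence, gold_by_sentence):
        pred_set, gold_set = set(pred), set(gold)
        # A predicted triplet counts only if aspect, opinion and sentiment all match.
        tp += len(pred_set & gold_set)
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = [
    [("image", "great", "POS"), ("sound", "poor", "NEG")],
    [("battery life", "good", "POS"), ("battery life", "no issues", "POS")],
]
pred = [
    [("image", "great", "POS"), ("sound", "poor", "NEG")],
    [("battery life", "good", "POS"), ("battery life", "no issues", "NEU")],
]
p, r, f1 = triplet_prf(pred, gold)
```

In this toy example three of the four predicted triplets are exact matches, so precision, recall and F1 are all 0.75; the triplet with the wrong polarity counts as both a false positive and a false negative.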

4.4 Main results

Table 2 gives the main results of triplet extraction. The end-to-end approaches outperform the pipeline approaches by a clear margin because they better integrate the information shared between the subtasks. Compared to Grid-BERT, FSSI improves the absolute F1 score by 0.76% and 2.98% on 14res and 14lap, respectively, and by 2.41% and 3.14% on 15res and 16res, respectively. Compared to EMC-GCN, FSSI improves the absolute F1 score by 0.70% and 2.54% on 15res and 16res, respectively, while EMC-GCN remains ahead by 1.02% and 0.76% on 14res and 14lap. These results suggest that our model makes effective use of syntactic knowledge and semantic information and can handle reviews that are formal, informal, or structurally complex.

Table 2
Experimental results of triplet extraction. The best results are shown in bold. The mark “-” indicates that the original code of the IMN method does not contain the resources needed to run on the dataset 16res.

Model	14res P	14res R	14res F1	14lap P	14lap R	14lap F1	15res P	15res R	15res F1	16res P	16res R	16res F1
Li-unified-R+PD	41.44	68.79	51.68	42.25	42.78	42.47	43.34	50.73	46.69	38.19	53.47	44.51
Peng-unified-R+PD	44.18	62.99	51.89	40.40	47.24	43.50	40.97	54.68	46.79	46.76	62.97	53.62
Peng-unified-R+IOG	58.89	60.41	59.64	48.62	45.52	47.02	51.70	46.04	48.71	59.25	58.09	58.67
IMN+IOG	59.57	63.88	61.65	49.21	46.23	47.68	55.24	52.33	53.75	–	–	–
Grid-CNN	70.79	61.71	65.94	55.93	47.52	51.38	60.09	53.57	56.64	62.63	66.98	64.73
Grid-BiLSTM	67.28	61.91	64.49	59.42	45.13	51.30	63.26	50.71	56.29	66.07	65.05	65.56
S³E²	69.08	64.55	66.74	59.43	46.23	52.01	61.06	56.44	58.66	71.08	63.13	66.87
Grid-BERT	70.92	69.49	70.20	57.52	51.92	54.58	59.29	58.07	58.67	68.58	66.60	67.58
EMC-GCN	71.85	72.12	71.98	61.46	55.56	58.32	59.89	61.05	60.38	65.08	71.66	68.18
Our FSSI	70.05	71.88	70.96	60.98	54.50	57.56	61.52	60.65	61.08	69.23	72.28	70.72

4.5 Model analysis

4.5.1 Ablation experiments

To further investigate the role of the different modules in the model, we conduct ablation studies on the ASTE task. The results are shown in Table 3. FSSI represents our full model, and we evaluate the role of each module with two model variants: Syn denotes the variant with the syntactic graph convolution module removed, and Sem denotes the variant with the semantic graph convolution module removed. Removing either module lowers the F1 score on all four datasets, by 1.59 to 2.86 points for Syn and by 1.69 to 4.07 points for Sem. This indicates that the two graph convolutional networks capture complementary syntactic and semantic information and that each module contributes to the overall performance of FSSI on the ASTE task.

Table 3
Results of ablation experiments for the ASTE task

Model	14res F1	14lap F1	15res F1	16res F1
FSSI	70.96	57.56	61.08	70.72
Syn	68.76	55.97	58.59	67.86
Sem	69.27	53.49	58.78	68.52

4.5.2 Impact of graph network layer number

We evaluate the effect of the number of graph network layers on model performance on the four datasets, as shown in Fig. 5. The experiments show that the graph convolutional networks usually perform best with two layers, and that deeper networks do not give better results. If the number of layers is too small, the node representations cannot propagate far. When there are too many layers, the model becomes unstable.

Fig. 5

Effect of the number of GCN layers.

4.5.3 Case study

Tables 4 and 5 show some typical comment sentences analyzed using the three models. The first column shows four representative sentences, the second column shows the labeled true triplets, and the third column shows the outputs of the different models.

In the first example, all models extract the triplets accurately because the sentence is relatively simple and has no complex structure. In contrast, in the second and third examples, aspect terms or opinion terms consist of multiple words and stand in one-to-many or many-to-one relationships. Here, only FSSI succeeds in extracting accurate triplets. Although S³E² can extract “battery life” and “no issues”, it fails to predict the sentiment polarity correctly. GTS produces incomplete triplets when aspect terms or opinion terms are composed of multiple words, which indicates that it lacks sufficient contextual semantic and syntactic interactions. In the third example, the aspect term “touchscreen functions” is far away from the opinion term “not enjoy”, which prevents both GTS and S³E² from extracting the triplet accurately. FSSI can handle this complex and informal sentence because it fully considers the complementarity of syntactic knowledge and semantic information. The fourth example involves the complex opinion term “left much to be desired”, which consists of multiple words and expresses an implicit opinion. All three models fail to extract this triplet. Overall, FSSI is the most accurate of the three models on these examples, but it still needs to be improved for implicit opinions. Therefore, future work will focus on refining the extraction of multi-word implicit triplets.

Table 4
Case study of S³E²

Sentence	Ground truth	S³E²
The image is great but the sound is poor.	[image, great, POS], [sound, poor, NEG]	[image, great, POS], [sound, poor, NEG]
The battery life seems to be very good, and have had no issues with it.	[battery life, good, POS], [battery life, no issues, POS]	[battery life, good, POS], [battery life, no issues, NEU] ×
Did not enjoy the new Windows 10 and touchscreen functions.	[Windows 10, not enjoy, NEG], [touchscreen functions, not enjoy, NEG]	[Windows 10, not enjoy, POS] ×, [touchscreen functions, not enjoy, NEU] ×
The battery life, before the battery completely died of course, left much to be desired.	[battery life, left much to be desired, NEG]	[battery life, desired, NEU] ×

Table 5

Case studies of GTS and FSSI

Sentence	GTS	FSSI
The image is great but the sound is poor.	[image, great, POS], [sound, poor, NEG]	[image, great, POS], [sound, poor, NEG]
The battery life seems to be very good, and have had no issues with it.	[battery life, good, POS], [battery life, issues, POS] ×	[battery life, good, POS], [battery life, no issues, POS]
Did not enjoy the new Windows 10 and touchscreen functions.	[Windows 10, enjoy, POS]×, [touchscreen functions, enjoy, POS] ×	[Windows 10, not enjoy, NEG], [touchscreen functions, not enjoy, NEG]
The battery life, before the battery completely died of course, left much to be desired.	[battery life, desired, POS] ×	[battery life, be desired, POS] ×

5 Conclusion

In this paper, we propose an FSSI model for the ASTE task. To mitigate the parsing errors caused by missing or ambiguous syntactic structure, we use two GCN modules to integrate syntactic knowledge and semantic information. In particular, the semantic graph convolution module obtains semantic information and effectively captures the internal connections among the triplets in a sentence. The experimental results show that FSSI achieves strong performance and captures the relationships between word pairs well. Additionally, the ablation studies validate the role of each module in the model. In future work, we will further exploit the semantic relations between contexts to improve the accuracy on one-to-many and many-to-one relations between aspect terms and opinion terms, as well as on multi-word implicit extraction.

References

Vinodhini

and Chandrasekaran

R.M.

, Sentiment analysis and opinionmining: a survey,282–, International Journal (2012), 282–292.

Maria Pontiki , Haris Papageorgiou , Dimitrios Galanis , Ion Androutsopoulos , John Pavlopoulos , Suresh Manandhar , Semeval-2014 task 4: Aspect based sentiment analysis, SemEval 2014(2014), 27.

Haiyun Peng , Lu Xu , Lidong Bing , Fei Huang , Wei Lu , Luo Si , Knowing what, howand why:Anear complete solution for aspect-based sentiment analysis, in: Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020), 8600–8607.

ZhenWu , Chengcan Ying , Fei Zhao , Zhifang Fan , Rui Xia ,Grid tagging scheme for aspect-oriented fine-grained opinion extraction, in: Proceedings of Findings of the Association for Computational Linguistics: EMNLP (2020), 2576–2585.

Lu Xu , Hao Li , Wei Lu , Lidong Bing ,Position-aware tagging for aspect sentiment triplet extraction, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing,EMNLP(2020), 2339–2349.

Chen Zhang , Qiuchi Li , Dawei Song , Benyou Wang ,A multi-task learning framework for opinion triplet extraction, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings (2020), 819–828.

Hang Yan , Junqi Dai , Tuo Ji , Xipeng Qiu , Zheng Zhang , A unified generative framework for aspect-based sentiment analysis, in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2416–2429, 2021.

Hao Chen , Zepeng Zhai , Fangxiang Feng , Ruifan Li , XiaojieWang , Enhanced multi-channel graph convolutional network for aspect sentiment triplet extraction, in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2974–2985, 2022.

Zhexue Chen , Hong Huang , Bang Liu , Xuanhua Shi , Hai Jin , Semantic and syntactic enhanced aspect sentiment triplet extraction, in: Findings of the Association for Computational Linguistics: ACL-IJCNLP (2021), 1474–1483.

10.

Kelvin Xu , Jimmy Ba , Ryan Kiros , Kyunghyun Cho , Aaron Courville , . Ruslan Salakhudinov , Rich Zemel , Yoshua Bengio ,Show, attend and tell: Neural image caption generation with visual attention, in: International conference on machine learning, pages 2048–2057, 2015.

11.

Bishan Yang , Claire Cardie ,Context-aware learning for sentence-level sentiment analysis with posterior regularization, in: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 325–335, 2014.

12.

Aliaksei Severyn , Alessandro Moschitti ,Twitter sentiment analysis with deep convolutional neural networks, in: Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval, pages 959–962, 2015.

13.

Zi-Yi Dou ,Capturing user and product information for document level sentiment analysis with deep memory network, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 521–526, 2017.

14.

Chenyang Lyu , Jennifer Foster , Yvette Graham ,Improving document-level sentiment analysis with user and product context, in: Proceedings of the 28th International Conference on Computational Linguistics, pages 6724–6729, 2020.

15.

Minqing Hu , Bing Liu ,Mining and summarizing customer reviews, in: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 168–177, 2004.

16.

Yichun Yin , FuruWei , Li Dong , Kaimeng Xu , Ming Zhang , Ming Zhou , Unsupervised word and dependency path embeddings for aspect term extraction, in: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 2979–2985, 2016.

17.

Hu Xu , Bing Liu , Lei Shu , Yu Philip

,Double embeddings and cnn-based sequence labeling for aspect extraction,in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 592–598, 2018.

18.

Dehong Ma , Sujian Li , Fangzhao Wu , Xing Xie , Houfeng Wang , Exploring sequence-to-sequence learning in aspect term extraction, in: Proceedings of the 57th annual meeting of the association for computational linguistics, pages 3538–3547, 2019.

19.

Zhuang Chen , Tieyun Qian ,Enhancing aspect term extraction with soft prototypes, in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2107–2117, 2020.

20.

ZhenkaiWei , Yu Hong , Bowei Zou , Meng Cheng , Jianmin Yao , Don’t eclipse your arts due to small discrepancies: Boundary repositioning with a pointer network for aspect extraction, in: Proceedings of the 58th annual meeting of the association for computational linguistics, pages 3678–3684, 2020.

21.

Qianlong Wang, Zhiyuan Wen, Qin Zhao, Min Yang, Ruifeng Xu, Progressive self-training with discriminator for aspect term extraction, in: Proceedings of the 2021 conference on empirical methods in natural language processing, pages 257–268, 2021.

22.

Bishan Yang, Claire Cardie, Extracting opinion expressions with semi-Markov conditional random fields, in: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1335–1345, 2012.

23.

Bishan Yang, Claire Cardie, Joint inference for fine-grained opinion extraction, in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1640–1649, 2013.

24.

Zhifang Fan, Zhen Wu, Xinyu Dai, Shujian Huang, Jiajun Chen, Target-oriented opinion words extraction with target-fused neural sequence labeling, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2509–2518, 2019.

25.

Zhen Wu, Fei Zhao, Xin-Yu Dai, Shujian Huang, Jiajun Chen, Latent opinions transfer network for target-oriented opinion words extraction, in: Proceedings of the AAAI Conference on Artificial Intelligence 34 (2020), 9298–9305.

26.

Samuel Mensah, Kai Sun, Nikolaos Aletras, An empirical study on leveraging position embeddings for target-oriented opinion words extraction, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 9174–9179, 2021.

27.

Duyu Tang, Bing Qin, Ting Liu, Aspect level sentiment classification with deep memory network, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 214–224, 2016.

28.

Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang, Interactive attention networks for aspect-level sentiment classification, in: Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI’17, pages 4068–4074, 2017.

29.

Xin Li, Lidong Bing, Wai Lam, Bei Shi, Transformation networks for target-oriented sentiment classification, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 946–956, 2018.

30.

Chen Zhang, Qiuchi Li, Dawei Song, Aspect-based sentiment classification with aspect-specific graph convolutional networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4568–4578, 2019.

31.

Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, Rui Wang, Relational graph attention network for aspect-based sentiment analysis, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3229–3238, 2020.

32.

Ruifan Li, Hao Chen, Fangxiang Feng, Zhanyu Ma, Xiaojie Wang, Eduard Hovy, Dual graph convolutional networks for aspect-based sentiment analysis, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pages 6319–6329, 2021.

33.

Chenhua Chen, Zhiyang Teng, Zhongqing Wang, Yue Zhang, Discrete opinion tree induction for aspect-based sentiment analysis, in: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2051–2064, 2022.

34.

Wenya Wang, Sinno Jialin Pan, Daniel Dahlmeier, Xiaokui Xiao, Recursive neural conditional random fields for aspect-based sentiment analysis, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 616–626, 2016.

35.

Wenya Wang, Sinno Jialin Pan, Daniel Dahlmeier, Xiaokui Xiao, Coupled multi-layer attentions for co-extraction of aspect and opinion terms, in: Proceedings of the AAAI conference on artificial intelligence, volume 31, 2017.

36.

Peng Chen, Zhongqian Sun, Lidong Bing, Wei Yang, Recurrent attention network on memory for aspect sentiment analysis, in: Proceedings of the 2017 conference on empirical methods in natural language processing, pages 452–461, 2017.

37.

Hongliang Dai, Yangqiu Song, Neural aspect and opinion term extraction with mined rules as weak supervision, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5268–5277, 2019.

38.

Wenya Wang, Sinno Jialin Pan, Transferable interactive memory network for domain adaptation in fine-grained opinion extraction, in: Proceedings of the AAAI Conference on Artificial Intelligence 33 (2019), 7192–7199.

39.

Shaowei Chen, Jie Liu, Yu Wang, Wenzheng Zhang, Ziming Chi, Synchronous double-channel recurrent network for aspect-opinion pair extraction, in: Proceedings of the 58th annual meeting of the association for computational linguistics, pages 6515–6524, 2020.

40.

Dehong Ma, Sujian Li, Houfeng Wang, Joint learning for targeted sentiment analysis, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4737–4742, 2018.

41.

Xin Li, Lidong Bing, Piji Li, Wai Lam, A unified model for opinion target extraction and target sentiment prediction, in: Proceedings of the AAAI conference on artificial intelligence 33 (2019), 6714–6721.

42.

Xin Li, Lidong Bing, Wenxuan Zhang, Wai Lam, Exploiting BERT for end-to-end aspect-based sentiment analysis, in: Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 34–41, 2019.

43.

Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier, An interactive multi-task learning network for end-to-end aspect-based sentiment analysis, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 504–515, 2019.

44.

Makoto Miwa, Yutaka Sasaki, Modeling joint entity and relation extraction with table representation, in: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pages 1858–1869, 2014.

45.

Pankaj Gupta, Hinrich Schütze, Bernt Andrassy, Table filling multi-task recurrent neural network for joint entity and relation extraction, in: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2537–2547, 2016.

46.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of NAACL-HLT 1 (2019), 2.

47.

Khalil Mrini, Franck Dernoncourt, Trung Bui, Walter Chang, Ndapa Nakashole, Rethinking self-attention: An interpretable self-attentive encoder-decoder parser, arXiv preprint arXiv:1911.03875, 2019.

48.

Maria Pontiki, Dimitrios Galanis, Harris Papageorgiou, Suresh Manandhar, Ion Androutsopoulos, SemEval-2015 task 12: Aspect based sentiment analysis, in: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), pages 486–495, 2015.

49.

Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammed ALSmadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, et al., SemEval-2016 task 5: Aspect based sentiment analysis, in: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 19–30. Association for Computational Linguistics, 2016.

50.

Ilya Loshchilov, Frank Hutter, Fixing weight decay regularization in Adam, DOI: 10.48550/arXiv.1711.05101, 2018.

51.

Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton, Christopher D. Manning, Stanza: A Python natural language processing toolkit for many human languages, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 101–108, 2020.

Dataset	14res		14lap		15res		16res
	#Sentence	#Triplet	#Sentence	#Triplet	#Sentence	#Triplet	#Sentence	#Triplet
train	1259	2356	899	1452	603	1038	863	1421
dev	315	580	225	383	151	239	216	348
test	493	1008	332	547	325	493	328	525

Model	14res	14lap	15res	16res
	F1	F1	F1	F1
FSSI	70.96	57.56	61.08	70.72
Syn	68.76	55.97	58.59	67.86
Sem	69.27	53.49	58.78	68.52
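
As a small arithmetic check on the table above, the following Python sketch computes how many F1 points each of the Syn and Sem rows falls below the full FSSI model on every dataset. The F1 values are copied from the table; the names `DATASETS`, `F1` and `f1_drop` are hypothetical, and reading Syn and Sem as reduced variants of FSSI is an assumption made for illustration.

```python
# F1 scores (%) copied from the table above, in dataset order.
DATASETS = ["14res", "14lap", "15res", "16res"]
F1 = {
    "FSSI": [70.96, 57.56, 61.08, 70.72],
    "Syn": [68.76, 55.97, 58.59, 67.86],
    "Sem": [69.27, 53.49, 58.78, 68.52],
}


def f1_drop(variant, full="FSSI"):
    """Return the per-dataset F1 points by which `variant` trails `full`."""
    return {
        name: round(f - v, 2)  # round to 2 decimals to hide float noise
        for name, f, v in zip(DATASETS, F1[full], F1[variant])
    }


if __name__ == "__main__":
    for variant in ("Syn", "Sem"):
        print(variant, f1_drop(variant))
```

On these numbers the full model is ahead of both rows on all four datasets, and the largest gap is 4.07 points for the Sem row on 14lap.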