Abstract
Although many features have been investigated for detecting figurative language, oxymoron features have not been considered in the classification of sarcastic content. The main objective of this work is to present a system that automatically classifies sarcastic phrases in multi-domain data. A multi-domain dataset of 67850 sarcastic and non-sarcastic instances was collected from various websites for this purpose. Several approaches are examined to improve sarcasm identification: (1) a combination of FastText embedding, syntactic, semantic, lexical n-gram, and oxymoron features; (2) the TF-IDF feature weighting scheme; and (3) three machine learning algorithms (SVM, Multinomial Naïve Bayes, and Random Forest), three deep learning algorithms (CNN, LSTM, and MLP), and one ensemble model (CNN + LSTM). The CNN + LSTM model achieves a precision of 91.32%, recall of 92.85%, F-score of 92.08%, accuracy of 92.01%, and Kappa of 0.84 by combining the FastText embedding, bigram, syntactic, semantic, and oxymoron features with the TF-IDF method. The sarcasm classification performance of this model was compared on our dataset and on another sarcasm news dataset. The experimental results show that CNN + LSTM with the combination of all features outperforms the other algorithms in classifying sarcasm on both datasets.
Introduction
Individuals and most organizations around the world use informal messages on weblogs and web forums as a platform for online information exchange and communication. Alongside email, face-to-face interaction, and voice calls, text messaging is the easiest, cheapest, and fastest way for people to take part in social media networks. Social media networks are now a major source of multimodal data: users mix forms of expression by publishing images, audio, and video as well as text, and much of this content uses figurative language. Detecting figurative language is an interesting but difficult task in NLP and computational linguistics. Distinguishing literal from figurative meaning is difficult for a machine and, in some circumstances, for educated people as well. Systems capable of recognizing figurative language are therefore essential. It is impossible to read every review and comment manually and decide which opinions are sarcastic. Moreover, an ordinary reader may struggle to identify sarcasm in product reviews and tweets and may end up misunderstanding them. In spoken dialogue, sarcasm can be recognized from facial expressions or other particular cues, but written communication offers no such cues, which makes sarcasm challenging to detect. Sarcastic messages typically use positive words to convey a negative attitude.
Sarcasm is used as an evasion to avoid giving an answer, in which case people use unusual expressions, complicated sentences, and so on. A robust understanding of sarcasm in a spoken dialogue system requires a reformulation of the dialogue manager's basic assumptions about, for instance, user behavior and grounding processes. When a person is extremely annoyed, he or she uses sarcasm as a cry of frustration, referring to a negative context with positive expressions or vice versa. People also use sarcasm as wit, exaggerating and using special forms of speech and tones unlike those of their ordinary speech. For sarcasm analysis in text, a basic grasp of NLP is vital; NLP seeks to acquire, understand, and generate human languages such as Chinese, Czech, English, and Hindi.
In some cases, automatic sarcasm detection requires world knowledge, and sarcasm often relies on hyperbole. Sarcastic messages express an undesirable opinion about a target using positive words. The online Oxford dictionary defines sarcasm as “the use of irony to make or convey contempt”. Collins dictionary defines sarcasm as “mocking, contemptuous, or ironic language intended to convey scorn or insult”. The Free Dictionary defines sarcasm as “a form of verbal irony that is intended to express contempt or ridicule”. Merriam-Webster defines it as “The use of words that mean the opposite of what you want to say especially to insult someone, to show irritation, or to be funny”. Because of its metaphorical nature, sarcasm is often cited as a challenge for sentiment analysis. Examples of sarcastic sentences containing oxymoron words follow. Quick scan room confirms area man The world united in the desire for The Department of Railway Sale department is under the impression that Keebler elves are a beloved part of They can’t host This hall is
Tokenization, parsing, and part-of-speech tagging are important natural language processing tasks that are used for sarcasm recognition. To the best of our knowledge, oxymoron features have not been considered together with machine learning, deep learning, and ensemble models in automatic sarcasm classification. The main aim of this work is to classify sarcastic and non-sarcastic sentences from a balanced multi-domain dataset. The second objective is to collect various types of oxymoron words from different kinds of resources. The third objective is to analyze the significance of various combinations of features in detecting sarcasm and non-sarcasm in multi-domain data. The final objective is to develop an optimized model for predicting sarcasm on the balanced multi-domain dataset. The remainder of this paper is organized as follows. Section 2 reviews past and recent work on figurative language detection, oxymoron in sentiment, machine learning, deep learning, and ensemble algorithms. Section 3 explains the proposed architecture, the creation of the multi-domain dataset, the extraction and reduction of the various kinds of features, and the classification analysis based on ML, DL, and ensemble techniques. Section 4 presents the results obtained when testing the ML, DL, and ensemble models. Finally, Section 5 gives concluding remarks and the future scope of this research.
Related work
Kumar Ravi et al. [7] proposed an ensemble text feature selection technique, followed by another framework in the paradigm of text and data mining, to automatically detect sarcasm, irony, and satire in news and customer reviews. David Bamman et al. [3] demonstrated that including extra-linguistic information from the context of an utterance on Twitter, for example the audience, the immediate communicative environment, and properties of the author, achieves gains in accuracy over purely linguistic features in detecting this complex phenomenon. Mondher Bouazizi et al. [10] proposed a pattern-based approach to recognizing sarcasm on social media networks. They defined four groups of features and used them to classify tweets as sarcastic or non-sarcastic.
Chun-Che Peng et al. [9] investigated the possibility of classifying sarcasm in text and identified typical textual features from Twitter that are important for sarcasm. Aishwarya N Reganti et al. [1] focused primarily on identifying the key components and features for automatic satire detection. They proposed a set of generalized semantic features that gives the best results across different types of corpora. Swami et al. [17] presented the first English-Hindi code-mixed dataset of tweets marked for sarcasm and irony, in which each token is also annotated with a language tag. Using multiple word-based and character-based features, they provided a baseline classification method for sarcasm detection in English-Hindi code-mixed tweets. They also described the procedures used to gather and annotate the tweets at both the tweet and token levels for language and sarcasm.
From the Weibo microblogging website, Shi et al. [16] selected 5000 Weibos for each class (negative, neutral, and positive) to study sentiment categorization on Weibo data. They compared the efficiency of several feature combinations as well as various ML methods. In their trials, Naive Bayes, SVM, and CNN models were used to assess four categories of events to determine group emotions, and CNN achieved the highest score in terms of precision, recall, and F-measure. While designing a framework for automatically identifying sarcastic tweets from Twitter, Ashwitha A et al. [2] developed a list of target words and objective words that indicate sarcasm based on context, and used it to recognize whether the target words were used literally or sarcastically. Using LIME and SHAP models, Kumar A et al. [8] demonstrated how few words contribute to the accurate detection of sarcasm in each utterance of a real-time dialogue and how capturing inter-sentence context can accurately predict sarcasm. Mukherjee S and Bala PK [11] focused on extracting relevant features from sentences that depend on both what was said online and who wrote it. They found that function word features outperformed skip n-gram and POS tag features among the authorial style-based features. They also found that fuzzy clustering methods are less effective for detecting sarcasm than the Naive Bayes classification method. Yafeng Ren et al. [20] studied two context-enhanced neural models based on CNN for detecting sarcastic tweets on Twitter: one that incorporates contextual information, and one that combines all contextual information. Erik Forslid et al. [6] presented a bootstrapping algorithm that automatically learns phrases describing negative events or situations, combined with positive sentiments, from Twitter data carrying the hashtag #sarcasm.
This algorithm relies on the assumption that negative situations regularly appear after positive sentiments in sarcastic texts. Elisabetta Fersini et al. [5] proposed an ensemble approach based on the Bayesian Model Averaging paradigm. Rossano Schifanella et al. [14] investigated two automated techniques, one of them the Support Vector Machine (SVM), for multimodal sarcasm detection. To discover the features of English and Chinese sarcastic sentences and to handle the imbalanced dataset problem, Liu Peng et al. (2014) proposed a multiple-strategy ensemble approach and introduced a new set of features. Davidov et al. [4] used semi-supervised algorithms to identify sarcasm in Twitter data and in consumer reviews from the Amazon website. On both datasets, they achieved good precision, recall, and F-score even with cross-domain training and without the need for domain adaptation. To capture the sentiment semantics, and the contrast between the sentiment semantics and the situation in each sentence, Ren L et al. [13] proposed first-level and second-level memory networks. They also analyzed local information, contextual information, and the information that captures the contrast between sentiments in a sentence, using a convolutional neural network designed specifically to detect sarcastic expressions. Hu su et al. [19] proposed a new CNN model for aspect extraction with two pre-trained embeddings: a general-purpose embedding (GloVe-CNN) and a domain-specific embedding. Tsao et al. [18] examined the components that make up the oxymoron type and created original oxymorons. Principal component analyses of contradictory-image tests covering overall perceptions of usage, operating method, and product feedback were carried out to study the contradictory image components of the creative oxymoron examples. The properties that correspond to parts of speech were identified to develop design conversion models.
Proposed design of sarcasm detection
Sarcasm classification is the process of training a model to label data with a predefined class, here sarcastic or non-sarcastic. The general steps of this sarcasm classification system are data collection, data preprocessing, feature extraction, data splitting, model selection, model training, model evaluation, and hyper-parameter tuning. The phases of the proposed sarcasm detection architecture are depicted in Fig. 1.

The Proposed design of sarcasm classification based on the CNN + LSTM model with the combination of all features.
Obtaining high-quality data is crucial in text categorization, but it is challenging when dealing with sarcastic sentences. The scope of this research covers only sarcastic and non-sarcastic sentences written in English. Sarcastic tweets, news, feedback, and product reviews were gathered from a variety of platforms, including Twitter, news websites, online shopping websites, and other sources. To build the dataset, both laborious manual retrieval and automatic keyword-based retrieval of tweets were used. There are two alternatives for automatic retrieval: either create a stream and gather tweets in real time, or use a search query to gather all tweets that match that query. The hashtags #sarcasm and #sarcastic are used to stream sarcastic tweets; non-sarcastic tweets are collected in the same way but without these hashtags.
The following are the standard procedures for obtaining tweets from Twitter:
1. Go to the Twitter Developer website, create a Twitter developer account, and log in using Twitter credentials.
2. Develop a new Twitter application that offers the required tokens and keys.
3. To use the Twitter API, install the Tweepy and TwitterAPI Python packages.
4. To authenticate the application, use tokens (access token, access token secret) and API keys (consumer key, consumer secret).
5. Gather tweets based on hashtag using the Twitter API.
6. Keep the gathered tweets in a database and export the tweets in CSV file format.
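The collection steps above can be sketched roughly as follows. This is a minimal illustration and not the authors' script. It assumes Tweepy version 4 and its `Client.search_recent_tweets` call with a bearer token, whereas the steps above describe consumer keys and access tokens; `build_query`, `fetch_tweets`, and `to_csv` are hypothetical helper names, and the `lang:en` and `-is:retweet` query operators are an assumed filter.

```python
import csv
import io


def build_query(hashtags, lang="en"):
    """Build a search query matching any of the hashtags (assumed query operators)."""
    return "(" + " OR ".join(hashtags) + ") lang:" + lang + " -is:retweet"


def fetch_tweets(bearer_token, query, limit=100):
    """Fetch recent tweets; needs the third-party Tweepy package and valid credentials."""
    import tweepy  # imported lazily so the other helpers work without it

    client = tweepy.Client(bearer_token=bearer_token)
    # The recent-search endpoint accepts between 10 and 100 results per request.
    response = client.search_recent_tweets(query=query, max_results=max(10, min(limit, 100)))
    return [tweet.text for tweet in (response.data or [])]


def to_csv(texts, label):
    """Return CSV text with one row per tweet (label 1 = sarcastic, 0 = non-sarcastic)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["text", "label"])
    for text in texts:
        writer.writerow([text, label])
    return buffer.getvalue()


query = build_query(["#sarcasm", "#sarcastic"])
print(query)
```

The returned CSV text can be written to a file for the export step; tweets collected through the hashtag query would be stored with label 1, and tweets collected without the hashtags with label 0.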
Sarcastic news, sentences, and product reviews are gathered from a variety of sources. The sarcastic product reviews, news headlines, and sentences were limited to at most 50 words and at least 20 words. Data from several domains are gathered, integrated into a single complex dataset, and then manually classified as sarcastic or non-sarcastic by three experts. This complex dataset is a balanced English dataset of 67850 instances, of which 33925 are sarcastic and 33925 are non-sarcastic. It is used to evaluate the performance of three machine learning classifiers, three deep learning algorithms, and the proposed CNN + LSTM model. The dataset is divided into training and testing portions at an 80/20 ratio.
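The 80/20 division can be illustrated with a short sketch. The text does not say whether the split was stratified by class, so the stratification below is an assumption, as are the function name `stratified_split` and the fixed random seed.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Balanced label vector with the class sizes reported above.
labels = [1] * 33925 + [0] * 33925
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 54280 13570
```

With the reported class sizes, this gives 54280 training and 13570 testing instances, each half sarcastic and half non-sarcastic.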
Feature extraction and reduction
FastText is a word embedding generator algorithm for learning word representation. It considers every single word to be formed by n-grams of character. This kind of representation of a word is helpful to find the vector representation for unusual words. It will help manage the vector representation for words not existing in the dictionary. These FastText word embedding techniques overcome the problem of out-of-vocabulary. It is particularly suited for dysphemistic comments, abusive comments, as well as sarcastic comments. Toxic comments often use dysphemism words, for example, “Daughter of an A****,” “***k her!!!!” but also misspelled words, which are common in online discussions. But these dysphemisms and abusive words do not exist in the vocabulary.
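To make the character n-gram idea concrete, the sketch below lists the subword units of a word. It is a simplified illustration: the real FastText library also keeps the whole word as a unit and hashes the n-grams into a fixed number of buckets. The function name `char_ngrams` is hypothetical; the default range of 3 to 6 characters follows the FastText defaults.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in the boundary markers < and >."""
    wrapped = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


print(char_ngrams("where", 3, 3))  # ['<wh', 'whe', 'her', 'ere', 're>']

# A misspelled or elongated word still shares many n-grams with the correct form,
# which is why a vector can be composed for words missing from the vocabulary.
shared = set(char_ngrams("amazing")) & set(char_ngrams("amazzzing"))
print(len(shared))
```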
After the FastText word embedding features are created, they are fed as input into the next layer of the model. Sentiment-based features are very helpful for predicting the polarity of sarcastic sentences. In this category, the counts of highly emotional positive and negative words, positive emotions, negative emotions, and the sentiment score of each word were considered as features. SenticNet was used to construct the sentiment-based features. In the punctuation-based features, the presence of dots, question marks, exclamation marks, quotes, capitalized words, and repeated characters in a word was considered for predicting a sarcastic sentence. An excessive presence of these symbols within a text message indicates sarcasm. Consider the following example: “Particularly your furniture products are extraordinarily amazing!!!”. Here, the excessive presence of exclamation marks is a punctuation-based feature; a repeated character, such as an elongated z in the word amazing, would be counted in the same way. In the class of syntactic features, the numbers of common words, interjections, laughing expressions, nouns, verbs, adjectives, adverbs, and words in a document were considered to predict sarcasm in a document.
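The punctuation-based features described above can be sketched as simple counts. This is an illustration under assumptions: the feature names, the rule that a capitalized word is an all-uppercase word of at least two letters, and the rule that a repeated character means three or more identical characters in a row are choices made here, not definitions taken from the paper. The example sentence is also invented.

```python
import re


def punctuation_features(text):
    """Count surface cues of the kind listed above; feature names are illustrative."""
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "dots": text.count("."),
        "question_marks": text.count("?"),
        "exclamation_marks": text.count("!"),
        "quotes": text.count('"'),
        "capital_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
        # A character repeated three or more times in a row, e.g. "amazzzing".
        "elongated_words": sum(1 for w in words if re.search(r"(.)\1{2,}", w)),
    }


features = punctuation_features("Your furniture is SO amazzzing!!!")
print(features)
```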
An oxymoron is a figurative expression that combines two opposing terms. The term oxymoron derives from Greek. It combines the words oxy (meaning “sharp”) and moron (meaning “dull, stupid, or foolish”). Oxymorons may be helpful for achieving rhetorical effects, as in working vacations and uninvited guests. They may also be the result of conceptual sloppiness, as in the same difference, original copy, or extremely average. When the meanings of the conflicting parts are not noticeable, an oxymoron may go unobserved, as in Artificial Intelligence, virtual reality, and spendthrift. Typically, this type of contradiction is resolved by considering one term to be the lower attribute of a superior idea. For example, an impartial opinion is a type of opinion; “no comment” is not considered a comment; and an accurate estimate is a type of estimate. Oxymorons are more than just linguistic oddities. Words are far from being apathetic bystanders in the world. They can shape their users’ insights and direct their actions. In total, 3365 oxymoron words were collected from various sources. Table 2 lists some of the oxymoron feature words. Different classes of features have been considered for detecting sarcasm with machine learning, deep learning, and ensemble algorithms. The overall feature groups, the list of features, and their descriptions are shown in Table 1.
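One simple way to turn an oxymoron word list into features is a lexicon lookup, sketched below. The paper does not specify how the oxymoron features are encoded, so the count and presence flag are assumptions, and the four-entry lexicon is only a stand-in for the 3365 collected entries. Matching is on exact phrases, without lemmatization.

```python
import re

# A tiny illustrative lexicon built from examples in the text above.
OXYMORON_LEXICON = {"working vacation", "original copy", "same difference", "accurate estimate"}


def oxymoron_features(sentence, lexicon=OXYMORON_LEXICON):
    """Return the count, a presence flag, and the lexicon phrases found in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    normalized = " " + " ".join(tokens) + " "
    matches = sorted(p for p in lexicon if " " + p + " " in normalized)
    return {"oxymoron_count": len(matches), "has_oxymoron": int(bool(matches)), "matches": matches}


result = oxymoron_features("Please send me the Original copy, it is the same difference anyway.")
print(result)
```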
Overall features group and its features list
List of some oxymoron feature words
Example Oxymoronic Sentences My wife and her niece had a He looks The painters left the door The children enjoyed being He is the most This was This is an
To classify the problem instances appropriately, the SVM algorithm must select the optimal separating hyperplane. Including the combination of unigram, bigram, sentiment, punctuation, syntactic, and oxymoron features of sarcastic content in model training allows the significance of these features to be attributed to the classification result. In our SVM classification experiments, the Python library Scikit-learn was used. The hyper-parameters kernel, gamma, and regularization C were tuned to obtain optimized results.
The radial basis function (RBF) is a kernel function commonly used in the SVM algorithm. RBF can map the input to an infinite-dimensional space. Although a gamma value of 0.1 is often recommended as a default, after several iterations we set the gamma hyper-parameter to 0.2. A higher value of gamma leads to over-fitting. In Scikit-learn, the C parameter controls regularization: it is the penalty parameter of the misclassification or error term. After several iterations, we set C to 5. After the best model was found, the test feature set was given to this model to predict the class of each instance. The same optimized parameters were used in all experimental settings to measure the performance of sarcasm identification. The optimized parameters and their tuned values are shown in Table 3.
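To make the role of gamma concrete, the sketch below evaluates the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), for a pair of points at several gamma values. The helper name `rbf_kernel` and the sample points are illustrative. In Scikit-learn, the settings reported above would presumably correspond to `SVC(kernel="rbf", gamma=0.2, C=5)`, although the exact call used in the experiments is not shown in the text.

```python
import math


def rbf_kernel(x, y, gamma=0.2):
    """RBF similarity between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


x, y = [1.0, 0.0], [0.0, 1.0]  # squared distance is 2
for gamma in (0.1, 0.2, 1.0):
    print(gamma, round(rbf_kernel(x, y, gamma), 4))
```

A larger gamma makes the similarity fall off faster with distance, so each training point influences a smaller neighbourhood; this is the mechanism behind the over-fitting at high gamma values mentioned above.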
Model parameters of machine learning in sarcasm classification
The steps of the proposed sarcasm detection architecture are as follows. First, the complex sarcasm dataset is collected from various resources, and the sentences are annotated as sarcastic or non-sarcastic by three annotators. Preprocessing is carried out in the Python environment. FastText embedding word vector features and bigram, sentiment, punctuation, syntactic, and oxymoron features are extracted. Only the embedding word vector features are fed into the first convolutional layer, where they are converted into convolved features using the ReLU function and then fed into a max pooling layer. The output of this first (max) pooling layer is fed into the second convolutional layer, which is followed by average pooling. The output of the second pooling layer is fed into two stacked LSTM layers. The output features of the LSTM layers and the lexical, sentiment, punctuation, syntactic, and proposed oxymoron features are fused in the flattening layer. Next, a fully connected layer with 4096 neurons is used. Finally, the softmax function is used in the output layer to decide whether the sentence is sarcastic or non-sarcastic.
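The order of the layers can be followed with a small shape-tracing sketch. Only the 300-dimensional embedding, the 4096-neuron fully connected layer, and the two output classes come from the text; the sequence length of 50, the 128 filters, the kernel size of 3, the pool size of 2, the 256 LSTM units, and the 40 hand-crafted features are hypothetical values chosen for illustration. The actual dimensions are those given in Table 4.

```python
def conv1d_shape(length, filters, kernel_size):
    """Output (length, channels) of a stride-1 convolution without padding."""
    return length - kernel_size + 1, filters


def pool1d_shape(length, channels, pool_size):
    """Output (length, channels) of non-overlapping max or average pooling."""
    return length // pool_size, channels


def trace_cnn_lstm(seq_len=50, embed_dim=300, filters=128, kernel_size=3,
                   pool_size=2, lstm_units=256, handcrafted_dim=40,
                   dense_units=4096, num_classes=2):
    """List (layer name, output shape) pairs in the order described above."""
    shapes = [("embedding", (seq_len, embed_dim))]
    length, channels = conv1d_shape(seq_len, filters, kernel_size)
    shapes.append(("conv 1 + ReLU", (length, channels)))
    length, channels = pool1d_shape(length, channels, pool_size)
    shapes.append(("max pooling", (length, channels)))
    length, channels = conv1d_shape(length, filters, kernel_size)
    shapes.append(("conv 2", (length, channels)))
    length, channels = pool1d_shape(length, channels, pool_size)
    shapes.append(("average pooling", (length, channels)))
    shapes.append(("lstm 1 (full sequence)", (length, lstm_units)))
    shapes.append(("lstm 2 (last state)", (lstm_units,)))
    # The hand-crafted features join the learned representation only at this point.
    shapes.append(("fusion", (lstm_units + handcrafted_dim,)))
    shapes.append(("fully connected", (dense_units,)))
    shapes.append(("softmax output", (num_classes,)))
    return shapes


for name, shape in trace_cnn_lstm():
    print(f"{name:24s} {shape}")
```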
To assess the performance of the Convolutional Neural Network architecture, it is first compared with two other deep learning classifiers, LSTM and MLP. Their configurations are presented in Table 5. The LSTM model was chosen explicitly for its ability to learn long-term dependencies. The word-level FastText embeddings were initialized as 300-dimensional vectors representing the input data, and ReLU was used as the default activation function. The number of hidden layer units is set to 256. In this experiment, the Adam optimizer was used for training, the number of epochs is 50, and the batch size is 256. The LSTM model uses the softmax function to classify a sentence as sarcastic or non-sarcastic. In the Multilayer Perceptron network, 256 neurons and the rectified linear unit activation function were used in the hidden layer. The Adam solver works well on large datasets for weight optimization. The parameters alpha, batch size, learning rate, random state, number of epochs, and epsilon are set to 0.0001, 200, constant, none, 50, and 1e-08, respectively. All important optimal hyper-parameters of the CNN algorithm are given in Table 5. All layers (embedding, convolution, pooling, fully connected) of the proposed CNN + LSTM architecture and the dimensions of each layer are given in Table 4.
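The Multilayer Perceptron settings listed above line up with keyword arguments of Scikit-learn's `MLPClassifier`. The text does not name the library used for the MLP, so the mapping below is an assumption offered as a configuration sketch only.

```python
# MLP settings from the text, expressed with MLPClassifier argument names (assumed mapping).
mlp_params = {
    "hidden_layer_sizes": (256,),  # one hidden layer with 256 neurons
    "activation": "relu",
    "solver": "adam",
    "alpha": 0.0001,
    "batch_size": 200,
    "learning_rate": "constant",
    "random_state": None,
    "max_iter": 50,  # for the adam solver this is the number of epochs
    "epsilon": 1e-08,
}

# With scikit-learn installed, the model could then be created as:
#   from sklearn.neural_network import MLPClassifier
#   model = MLPClassifier(**mlp_params)
print(mlp_params)
```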
Layers and its dimensions of proposed CNN + LSTM architecture
Deep learning and ensemble learning parameters in sarcasm classification
All machine learning, deep learning, and ensemble experiments were carried out to examine sarcasm in the multi-domain dataset and in the Misra & Arora sarcasm news dataset [15]. Preprocessing, feature extraction and reduction, and model creation were implemented in the Python programming language on a 64-bit Windows 10 operating system. The machine is configured with an Intel Core i7-4770 CPU running at 3.40 GHz and 16 GB of RAM. The feature groups and lists are explained in Section 3.2, and different combinations of features were given as input to all machine learning, deep learning, and ensemble algorithms in the sarcasm analysis. The same set of tuned hyper-parameters was used with the different combinations of features to create the best model. To evaluate the outcome of these investigations, four performance measures were used: recall, precision, F-measure, and accuracy. The Kappa statistic is reported as well.
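For reference, these measures can be computed from the four cells of a binary confusion matrix, as in the sketch below. The function name `classification_metrics` and the example counts are illustrative and are not results from the paper; Kappa is computed as Cohen's kappa, which is assumed to be the statistic reported.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-score, accuracy, and Cohen's kappa for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total
    # Chance agreement, from the predicted and actual class marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"precision": precision, "recall": recall, "f_score": f_score,
            "accuracy": accuracy, "kappa": kappa}


metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print({name: round(value, 4) for name, value in metrics.items()})

# Consistency check on the reported CNN + LSTM precision and recall.
reported_f = 2 * 0.9132 * 0.9285 / (0.9132 + 0.9285)
print(round(reported_f * 100, 2))  # 92.08
```

The harmonic mean of the reported precision (91.32%) and recall (92.85%) reproduces the reported F-score of 92.08%. If the test portion is balanced, chance agreement is 0.5, so an accuracy of 92.01% corresponds to a kappa of about 0.84, which matches the reported value.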
Result analysis based on machine learning
In these experiments, the combination of unigram, sentiment, punctuation, and syntactic features is called the baseline 1 feature set. The combination of unigram, bigram, sentiment, punctuation, and syntactic features is called the baseline 2 feature set. In the Random Forest model, combining the baseline 1 feature set with the oxymoron features achieved an F-score of 53.51%, an improvement of only 3.03 percentage points. This means that the baseline 1 feature set alone is not capable of detecting sarcasm. When the baseline 2 feature set was merged with the oxymoron features, an F-score of 56.43% was obtained. Without the oxymoron features, the Random Forest model reaches an F-score of only 54.81%, so the oxymoron features bring an improvement of 1.62 percentage points. Baseline 2 + oxymoron features improve the F-score by 2.92 percentage points over baseline 1 + oxymoron features. Even so, the baseline 2 features with the oxymoron features still achieve a very low F-score. With a Kappa value of 0.12, the Random Forest model is not suitable for detecting sarcasm in this dataset, and it also takes a large amount of time to train.
With the Multinomial Naive Bayes method, the baseline 1, baseline 1 + oxymoron, baseline 2, and baseline 2 + oxymoron feature sets obtain F-scores of 54.92%, 56.53%, 58.77%, and 59.62%, respectively. Comparing the Random Forest and Multinomial Naïve Bayes algorithms, the combination of baseline 2 features and oxymoron features with the Multinomial Naïve Bayes algorithm gives the better performance: it improves the F-score by 3.19 percentage points over the Random Forest algorithm. However, the Multinomial Naïve Bayes model reached a Kappa value of only 0.19, so it also failed to detect sarcasm in this balanced dataset. Our dataset is not linearly separable, so the radial basis function kernel was used in the SVM. In this experiment, the SVC function from the Python library scikit-learn was used.
The best F-score, 70.88%, was obtained when the combined unigram, bigram, sentiment, punctuation, syntactic, and oxymoron features (baseline 2 + oxymoron features) were used in the support vector machine algorithm. The F-score increases from 61.03% to 70.88%, an improvement of 9.85 percentage points, when the oxymoron features are added to the baseline 2 features. Overall, our proposed oxymoron features perform consistently well on this dataset. Another important finding is that unigram features are less effective for detecting sarcasm than bigram lexical features. However, the SVM achieves a Kappa value of only 0.41 with baseline 2 + oxymoron features. Experimental results for the three machine learning algorithms (RF, MNB, and SVM) with the different feature sets (Baseline 1, Baseline 1 + Oxymoron, Baseline 2, and Baseline 2 + Oxymoron) are shown in Table 6. These results show that the machine learning algorithms performed poorly in detecting sarcasm in the dataset. Deep learning and ensemble algorithms were therefore applied to the same dataset for this task.
Performance of our dataset using machine learning algorithms with different features set
The MLP, LSTM, and CNN models are trained and evaluated for various batch sizes and numbers of epochs in this experimental work. These three models are trained with the Adam optimizer, batch sizes of 32, 64, 128, and 256, and a learning rate of 0.0001. An early stopping strategy is used to optimize the number of training epochs automatically. The TensorFlow early stopping algorithm decides when to stop training the model by monitoring the validation error. Limiting the number of epochs in this way helps to avoid over-fitting. Dropout, a regularization strategy, is also used in all models to counter over-fitting. The performance of the deep learning models (MLP, LSTM, and CNN) and the CNN + LSTM model on our dataset, in terms of precision, recall, F-score, and accuracy, is shown in Table 7.
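The early stopping rule can be illustrated in a few lines of plain Python. This sketch mirrors the usual logic of monitoring the validation loss with a patience window; the patience of 3 epochs and the loss values are invented for illustration, since the text does not report the settings used. In TensorFlow the same behaviour is normally obtained with the `tf.keras.callbacks.EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-based, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember it and reset the patience counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Hypothetical validation losses that start rising after the fourth epoch.
losses = [0.69, 0.52, 0.41, 0.38, 0.39, 0.40, 0.42, 0.45]
print(early_stopping_epoch(losses))  # (7, 4)
```

Training stops at epoch 7 because the loss has not improved for three consecutive epochs, and the weights from epoch 4 would be the ones to keep.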
Performance of our dataset using deep learning and ensemble algorithms with different combinations of features
The highest accuracy of MLP, LSTM, CNN, and CNN + LSTM is 73.62%, 86.74%, 90.33%, and 92.01%, respectively. The Kappa value of CNN + LSTM is 0.84. Table 7 outlines the performance improvements obtained by incorporating the oxymoron features into the deep learning models and the ensemble model. As seen in Table 7, incorporating the oxymoron features together with all the above-mentioned features improved the accuracy of the deep models and the ensemble model. CNN + LSTM improves accuracy by 1.68 percentage points over CNN. The highest accuracy was achieved by the ensemble model rather than by the individual machine learning and deep learning algorithms. The performance of CNN + LSTM exceeds that of all the other models: Multilayer Perceptron, Long Short-Term Memory, CNN, and the three conventional machine learning algorithms. This suggests that the proposed CNN + LSTM with FastText embedding features and Baseline 2 + Oxymoron features can potentially be used for detecting sarcasm in this multi-domain dataset. The performance of the proposed CNN + LSTM model with FastText embedding features and Baseline 2 + Oxymoron features was also compared against other approaches for sarcasm detection. With all features, the proposed CNN + LSTM improved accuracy by 1.68, 5.27, 18.39, and 21.32 percentage points over CNN, Long Short-Term Memory, Multilayer Perceptron, and Support Vector Machine, respectively, on our dataset. The performance comparison between the three deep learning classifiers and the ensemble model in the sarcasm prediction task is shown in Fig. 4. Table 8 shows the performance of the deep learning and ensemble algorithms on the Misra & Arora sarcasm news dataset. Figure 2 shows the accuracy of CNN + LSTM, CNN, LSTM, and MLP over different numbers of epochs on the Misra & Arora [15] sarcastic news dataset. All four algorithms reach their highest accuracy when the number of epochs is set to 50.
Figure 3 shows the performance in terms of accuracy while applying the deep and ensemble model with different epochs on our sarcasm dataset. In the sarcastic news dataset also, the proposed oxymoron features with CNN + LSTM get the highest accuracy.

Accuracy of Misra & Arora sarcasm News dataset using deep & ensemble learning models.

Accuracy of our dataset using deep and ensemble learning models.

Comparison of accuracy of deep learning and ensemble model using a combination of all features with proposed oxymoron features.
Performance of Misra & Arora sarcasm dataset using deep learning and ensemble models with different combinations of features
Figure 4 shows the comparison of the accuracy of MLP, LSTM, CNN, and CNN + LSTM while the combination of fasttext embedding, punctuation, sentiment, syntactic, bigram, and oxymoron features was employed on both datasets. Figure 5 shows the performance of all algorithms with all features except oxymoron on both datasets. The CNN + LSTM model with oxymoron features gets a 1.77% improvement.

Comparison of accuracy of deep learning and ensemble model without using proposed oxymoron features.
Sarcastic speech has become much more prevalent on social media in recent years. Because of this significant increase, researchers are turning to automated algorithms to identify and classify sarcastic content. The goal of this study is to predict sarcastic content using a new multi-domain dataset, deep learning methods, and ensemble methods. The oxymoron features and FastText word embedding features, together with the basic features, played a noteworthy role in sarcasm detection. The CNN + LSTM ensemble model with our proposed oxymoron features achieved the highest accuracy in sarcasm classification compared with the other algorithms.
In the future, we plan to keep expanding this sarcasm dataset and to add other annotation procedures to make it appropriate for other NLP tasks. Although the results are promising, we will also include other deep learning-based models such as gated recurrent units (GRU), combinations of deep learning models, and transformers to improve sarcasm identification in terms of statistical regularities in words. To further this research, we will explore ways to identify subtle sarcasm in tweets that contain both English and other languages. Finally, future work may examine the sources and targets of sarcastic remarks, following analyses of the sources and targets of hate speech by age, gender, and place.
