Abstract
Human beings often use figurative language to express their thoughts. Uncovering the meaning of figurative language is harder than interpreting literal language. Humor identification is an important task for sentiment analysis of figurative text, because humor can change the sentiment of a text. During verbal communication, people use facial expressions, gestures and other modalities to convey their feelings, and automatically interpreting figurative sentences from these modalities falls within computer vision and digital image processing. The task is harder for written sentences, where facial expressions, gestures and other modalities are absent, which makes it an interesting research question. Humor is a figurative device and a creative linguistic phenomenon. To understand humor, we need to correctly understand the mood and emotions conveyed in the text, which goes beyond the semantics of literal language. In this work, we address the problem of understanding emotions using affect-based information from text with several well-established machine learning classifiers. We exploit affective content that exhibits the emotions and feelings of a writer, such as emoticons, sentiment words and writing style (punctuation, capitalization and so on). The proposed affect-based humor identification model is evaluated on the SemEval 2017 HashTagWars dataset and the Yelp review dataset under several experimental configurations, which assess the effectiveness of the model with different types of features.
Introduction
Language is a crucial part of our lives and an essential aspect of communication. With the growth of the internet, people use social media as a platform to express and share their feelings with the world. They often use language creatively and apply figurative devices such as humor, sarcasm or irony to express their feelings. Natural language with literal intent is easier to process than figurative language, because of the complex interplay of words and semantics in figurative language. The figurative devices used in a text can invert the meaning and polarity of the expressed thought, which can reduce the accuracy of an automated analysis system. It is therefore interesting to analyze how language is used for communication.
Analyzing textual content is a task that has been studied by several disciplines [1]. However, automated natural language understanding systems find it challenging to read between the lines of creatively used language. Human communication often involves tone, pitch and emphasis to convey affect. In psychology, affect is a concept used to describe the experience of feeling or emotion [2]. It is also described as the attitude or emotion that a speaker brings to communication [3]. Feelings such as joy, respect, anger, sympathy, contempt, gratitude, boredom, wonder, disgust, humility and pity are often conveyed through paralinguistic mechanisms such as facial expressions, vocal tone and pitch, gestures and actions [4]. Affect goes beyond a simple thumbs up or down, so the affect-based information conveyed in communication needs to be considered. For example, the sentences below convey entirely different meanings when communicated with different affects (tone, facial expression, gesture or action):
Vishal is back at home. (A neutral expression)
Vishal is back at home! (Expressing happiness)
Vishal is back at home... (Expressing a sarcastic thought)
Vishal is back at home (Expressing love)
Vishal is back at home!? (Expressing fear)
Vishal is back at home!!! (Expressing happy surprise)
It is difficult to make sense of the affective information in the sentences above, because the context varies with the tone and the way a particular word is uttered. Humans can experience around 34,000 different emotions and can express them during communication. Robert Plutchik, an American psychologist, proposed eight primary emotions that serve as the foundation for all others [5]. Correctly understanding the meaning of such statements requires capturing all affective elements, such as tone, pitch and facial expression, during communication. Affect involves both what language users communicate and how they do it [1]. Extracting emotions from tone, pitch, gesture or facial expression is part of computer vision and digital image processing [4]. Extracting emotion from written text, however, is often challenging. Social media posts and online customer reviews often contain users' emotions towards products or topics. The user's writing style, multilingual text and variability in language make it difficult to extract emotion from text. Words used in different contexts (different senses) can convey different emotions. Moreover, an utterance can convey more than one emotion without implicitly or explicitly expressing the feelings of the speaker. Social media texts are rich in misspellings (e.g. parlament), creatively spelled words (e.g. happeeee), emoticons and hashtags. On social media, people tend to express their opinions in figurative language, which is difficult for automatic natural language systems to interpret because of sarcasm, humor, irony or metaphor. Moreover, most learning algorithms require a significant amount of labeled data for training. Detecting emotions in text can be difficult even for human beings [1]. Modeling affect is trickier than opinion mining because of the fuzzy and complex nature of emotion.
Opinion mining deals only with the positive or negative valence of an opinion, which is a simplified model of affect intensity [6]. According to Mohammad et al. [7], affect is also subject to contextual variation: in affect perception, an individual's interpretation may vary with mood, personality, emotional intelligence, gender and so on. These challenges need to be addressed in order to develop automated natural language understanding systems. Based on these challenges, the following research questions are identified: 1) how to differentiate literal and figurative language and 2) which key elements at the linguistic level distinguish literal language from figurative language.
In this work, affect-based content is identified in text and exploited for the humor identification task to address these research questions. The proposed model is evaluated against state-of-the-art approaches using various experimental configurations. Section 2 explores the psychological and engineering perspectives on affect, affect representation methods, the affective information found in figurative language, lexical resources and evaluation measures. Section 3 describes the implementation of the proposed affect-based humor identification model, its features and the experimental setups. Section 4 analyzes and discusses the results of the proposed work. Finally, Section 5 concludes the work and outlines possible future work.
Related work
Emotion is a complex psycho-physiological experience of an individual’s state of mind. It interacts with biochemical (internal) and environmental (external) effects. In humans, emotion fundamentally involves “physiological arousal, expressive behaviors, and conscious experience”. In this section, formal emotion representation models, different affective contents present in language and lexical resources used for identifying affect contents in text are described. Different evaluation measures used for evaluating humor identification approaches are also described at the end.
Affect representation
For a computer to correctly identify the emotional content of a text, the text must be represented formally in a way that maps onto a psychological model. Many emotion models have been developed in psychology for the formal representation of emotions. They can be broadly categorized into 1) discrete emotion space models [8] and 2) continuous emotion space models [9–11].
Categorical/discrete model of emotions
The categorical model is also known as the discrete model of emotions. A layperson usually describes an emotional experience using simple words like “happy” or “sad”. The categorical model is discrete in nature and describes emotions as distinct labels, which makes it easy to label content with a single emotion. In this area, Ekman’s work [12] is one of the important foundations of recent research on emotions. He introduced six basic emotions, “happiness”, “anger”, “sadness”, “fear”, “disgust” and “surprise”, and proposed that any other emotion can be composed as a combination of these six. The simplest categorical classification decides only whether an emotion is positive or negative. Models of this type are used in the creation of linguistic resources, because simple emotion categories are convenient for annotating samples with emotional information. For example, SentiWordNet [13] uses positive, negative and objective labels for annotation, and WordNet-Affect [14] uses the labels from Ekman’s emotion model with a neutral category added. Many categorical emotion models used for recognizing textual emotions include a neutral category, because some texts convey no emotion. The categorical model has certain drawbacks: simple category labels cannot reflect the valence or arousal of an emotion unless they are modified.
Dimensional/continuous model of emotions
The dimensional model is also known as the continuous model of emotions. It uses a point or region within a two-dimensional or multi-dimensional space to represent each emotion. Emotions in the dimensional model therefore do not belong to a single category. The number of dimensions required is decided according to the application and its goals.
The most basic dimensional emotion model, described by Russell et al. [15], represents emotion using two dimensions: one dimension represents emotion polarity (valence) and the other represents the activation (arousal) of the emotion from low to high. For example, “anger” and “fear” have “negative” valence, while “joy” has “positive” valence. The arousal dimension characterizes an emotion as activated or deactivated and reflects the extent of the reaction to stimuli from low to high.
Ekman’s [12] six basic emotions follow the categorical model. This categorical labeling can be extended by attaching a scale to each emotion, which yields a flexible emotion vector: each emotion category has a value in the vector, and that value corresponds to the intensity of the emotion. Instead of simply using positive or negative labels for emotion categories, a single emotion value in the range –1 to 1 can be used to transform the categorical model into a dimensional one. Values below 0 correspond to negative emotions and values above 0 correspond to positive emotions. The values also denote arousal: values less than 0 denote the arousal (activation) of negative emotions, and values greater than 0 denote the arousal of positive emotions.
The 2-dimensional emotional space has often been used to model the smooth passage from one state to another over an infinite set of values [16]. To represent affective video content, Hanjalic and Xu [9] proposed an improved version of the 2-dimensional valence-arousal emotional space.
Unlike the 2-dimensional emotion model, the 3-dimensional emotion model includes a control dimension. This dimension is used to distinguish between two emotions that have similar valence and arousal. Although the dimensional emotion model can represent rich affective states as tuples of (valence, arousal, control) or (valence, arousal), it is very difficult for people to describe their emotional experiences using values of valence, arousal and control. People are more comfortable using simple words like “happy” or “sad” to describe their emotions.
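The transformation from category labels to signed values can be illustrated with a short sketch. The function name `split_signed_score`, the "neutral" label for a score of exactly 0, and the numbers in the example vector are assumptions made for illustration; they are not taken from the cited models.

```python
from typing import Dict, Tuple


def split_signed_score(score: float) -> Tuple[str, float]:
    """Split a signed emotion value in [-1, 1] into (polarity, intensity).

    The sign gives the polarity and the magnitude is read as the
    intensity (arousal) of that polarity.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if score > 0:
        polarity = "positive"
    elif score < 0:
        polarity = "negative"
    else:
        polarity = "neutral"  # assumption: 0 is treated as no emotion
    return polarity, abs(score)


# Hypothetical intensity vector over Ekman's six categories.
# The numbers are illustrative only.
emotion_vector: Dict[str, float] = {
    "happiness": 0.7,
    "anger": 0.0,
    "sadness": 0.1,
    "fear": 0.0,
    "disgust": 0.0,
    "surprise": 0.4,
}

# The category with the largest intensity recovers a single categorical label.
dominant_emotion = max(emotion_vector, key=emotion_vector.get)
```

Under these assumptions, `split_signed_score(-0.5)` yields a negative polarity with intensity 0.5, and taking the maximum of the vector falls back to a single categorical label when one is needed.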
Affective information in figurative language
The major cues of affect in communication are facial expression, gesture, action, voice tone and pitch. Finding affect cues in textual data, however, is a complex task that requires careful observation. Linguistic cues such as capital letters, emoticons and punctuation are used to express emotion in text. In the following sections, the linguistic sources of affective information are categorized into three groups: 1) emoticon-based affective information, 2) sentiment-based affective information and 3) linguistic structure-based affective information.
Emoticon based affective information
Emoticons are pictorial representations of facial expressions (e.g. smile, laugh) and moods (e.g. angry, curious) built from combinations of keyboard characters, and they convey the writer’s feelings and tone in the text [3]. Emoticons act as a substitute for facial expression in written communication. They are therefore a rich source of affective information in text and can be good markers of the figurative use of language. Many researchers have recently used emoticons as markers of figurative language.
Researchers at MIT developed an emoji-based distant supervision neural network model called DeepMoji [17], which detects sentiment and other affective states in short social media messages. SemEval 2018, an international workshop on semantic evaluation, introduced an emoji prediction task [18] in which systems predict the emoji for a given text; 49 teams submitted results for this task.
Sentiment related affective information
The polarity of the words in a sentence greatly influences the generation of humor. To capture the writer’s attitude, both the sentiment-related information (polarity: positive, negative or neutral) expressed in the text and the writer’s emotion towards the text need to be checked. To detect sentiment-based affective information, any contradiction among the sentiments of the words and other components of the text needs to be checked. The sentiment content of words and other linguistic units is extracted and counted with the help of different sentiment resources. The presence of humor can be identified by a polarity contrast between a positive sentiment verb and a negative situation, or vice versa [19]. Joshi et al. described a rule-based sarcasm detector [20] that uses the sentiment-related information in the given text.
The figurative use of language is a major challenge in sentiment analysis, because the presence of a sarcastic sense can change the sentiment of a sentence and may reduce the accuracy of sentiment analysis. Nevertheless, many systems use sentiment as a surface feature for detecting humor in text. Bharti et al. [21] described a rule-based approach that predicts a sentence as sarcastic if a negative phrase occurs in a positive sentence. For the detection of figurative content in text, Gupta et al. [22] used 24 types of sentiment-related information, such as counts of positive and negative words and counts of words of different strengths according to their rating.
Reyes et al. [23] capture polarity in terms of two emotion dimensions: activation and pleasantness. Joshi et al. [24] exploited sentiment incongruity as an important feature for sarcasm detection, using both explicit and implicit sentiment incongruity. Explicit sentiment incongruity is expressed through the presence of sentiment words of both positive and negative polarity. Implicit sentiment incongruity occurs without the presence of sentiment words of both polarities. For example, the sentence ‘I love this paper so much that I made a doggy bag out of it’ contains no contrasting sentiment words, yet it is sarcastic. Explicit sentiment information is captured by features such as the number of sentiment flips (the number of times a positive word is followed by a negative word, or vice versa), the largest positive or negative word sub-sequence, and the number of positive and negative words [24].
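The explicit incongruity features listed above can be sketched as follows. This is a minimal illustration, not the implementation of [24]: the toy polarity lexicon, the whitespace tokenizer, and the choice to measure sub-sequences over sentiment-bearing words only (skipping neutral words) are all assumptions.

```python
import string
from typing import Dict, List

# Toy polarity lexicon (assumption): +1 marks a positive word, -1 a negative one.
TOY_POLARITY: Dict[str, int] = {
    "love": 1, "great": 1, "good": 1,
    "hate": -1, "awful": -1, "boring": -1,
}


def polarity_sequence(text: str, lexicon: Dict[str, int]) -> List[int]:
    """Return the polarities of the sentiment-bearing words, in order."""
    words = (tok.strip(string.punctuation) for tok in text.lower().split())
    return [lexicon[w] for w in words if w in lexicon]


def explicit_incongruity_features(text: str,
                                  lexicon: Dict[str, int]) -> Dict[str, int]:
    """Count sentiment flips, longest same-polarity runs and word counts."""
    seq = polarity_sequence(text, lexicon)
    # A flip is a change of polarity between consecutive sentiment words.
    flips = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    longest = {1: 0, -1: 0}
    run, prev = 0, None
    for p in seq:
        run = run + 1 if p == prev else 1
        longest[p] = max(longest[p], run)
        prev = p
    return {
        "sentiment_flips": flips,
        "longest_positive_run": longest[1],
        "longest_negative_run": longest[-1],
        "positive_words": seq.count(1),
        "negative_words": seq.count(-1),
    }
```

An implicitly incongruous sentence such as the doggy bag example would produce no flips with this sketch, which is why explicit features alone are not sufficient.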
Linguistic structure based affective information
Structural information available in the text, such as punctuation marks, upper case letters and lower case letters, is a good marker of the presence of figurative content [23]. Certain lexical markers help writers point out the sense and meaning of the text. Patti et al. [25] exploited punctuation, word length, discourse markers and the frequency of different part-of-speech labels to identify sarcastic and ironic content using affective features. The percentage of upper case letters, the percentage of question marks and the percentage of exclamation marks were found to be good indicators of ironic content. Ellipses are also good markers of figurative content, because writers generally use them to skip certain words. The presence of commonly used words alongside rare words is a good marker of unexpectedness and incongruity, both of which humorous text generally contains.
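A few of the structural indicators mentioned above can be computed directly from the raw string, as in the sketch below. The denominators are assumptions (letters for the upper case percentage, all characters for the punctuation percentages); the cited works may normalize differently.

```python
from typing import Dict


def _percentage(count: int, total: int) -> float:
    """Return count as a percentage of total, or 0.0 for an empty total."""
    return 100.0 * count / total if total else 0.0


def structure_features(text: str) -> Dict[str, float]:
    """Compute simple structure-based indicators from raw text."""
    letters = [c for c in text if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())
    n_chars = len(text)
    return {
        # Share of alphabetic characters that are upper case.
        "pct_uppercase": _percentage(upper, len(letters)),
        # Share of all characters that are question or exclamation marks.
        "pct_question": _percentage(text.count("?"), n_chars),
        "pct_exclamation": _percentage(text.count("!"), n_chars),
        # Non-overlapping occurrences of three consecutive dots.
        "ellipsis_count": float(text.count("...")),
    }
```

Applied to the earlier example sentences, "Vishal is back at home..." and "Vishal is back at home!!!" differ only in these features, which is the kind of signal the structure-based group is meant to capture.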
Linguistic information to identify affects in figurative texts
Building a classifier very often relies on having large and accurate linguistic resources. Improving these dictionaries, by preserving their size and increasing their annotation accuracy, is therefore considered mandatory. Many linguistic resources have been produced to date to identify affective information in text. They fall into two categories: 1) automatically created resources of affect and 2) manually created resources of affect.
Automatically created resources are generated using distant supervision or bootstrapping methods. They include SentiWordNet [13], WordNet-Affect [14], the NRC Twitter lexicons and ConceptNet [26]. Manually created lexical resources of affect include the Dictionary of Affect, Affective Norms for English Words (Texts), the Harvard General Inquirer categories, the NRC emotion lexicon and the MaxDiff sentiment lexicon.
Among the existing linguistic resources, WordNet [27], or variations of it, remains the most popular [28]. WordNet, built at Princeton University, is used in most Natural Language Processing applications. The concepts in WordNet are grouped into synonym sets (also called synsets), which are sets of semantically linked words. Each synset description contains its frequency in the dictionary and a gloss, which is a short sentence describing the synset. Besides the basic synonymy relations, WordNet (WN) also contains special relations such as hyponymy, hyperonymy and ISA (“is a”). These links describe generalization, specialization or equivalence relationships between synsets. WordNet-Affect (WNA) [14] is an extension of WordNet that contains synsets annotated with emotional labels (i.e. Ekman’s basic annotation scheme [14]: Anger, Disgust, Fear, Happiness, Sadness and Surprise). WNA contains nouns, adjectives, adverbs and some verbs for the English WN 2.0 version. ConceptNet [26] is another well-known ontology widely used for semantic disambiguation in classification tasks. This database contains assertions of common-sense knowledge encompassing the spatial, physical, social, temporal and psychological aspects of everyday life. ConceptNet was generated automatically from the Open Mind Common Sense project.
Finally, SentiWordNet (SWN) [13] is dedicated to opinion and valence classification. Valence is represented by the degree of positivity, negativity or objectivity of a word or sentence, whereas opinion represents the general valence over a series of sentences. SWN is the result of a semantic propagation algorithm applied over all WN synsets according to their valence. All these linguistic resources are used, among other applications, to design affective classifiers.
SemEval 2007 Task 14 [29] presented a corpus and some methods to evaluate it, using WNA as the dictionary. In particular, methods based on a fusion of algorithms, using keyword spotting, lexical affinity or statistical natural language processing, are very popular [30]. In [31], resources such as EffectWordnet [32], SenticNet [33], EmoSenticnet [34], EmoLex [35] and AFINN are used in the irony detection task of SemEval 2018. In [36], the researchers used several NRC resources for irony detection, such as the NRC Affect Intensity lexicon [37], the NRC emotion lexicon and the NRC hashtag sentiment lexicon.
Evaluation measures
The most commonly used evaluation measures for humor identification are precision, recall, F-score and accuracy. These measures are computed from four counts: 1) True Positive (TP): the sentence belongs to the humor class and is predicted as humor. 2) True Negative (TN): the sentence belongs to the not-humor class and is predicted as not-humor. 3) False Positive (FP): the sentence belongs to the not-humor class and is falsely predicted as humor. 4) False Negative (FN): the sentence belongs to the humor class and is falsely predicted as not-humor.
Precision is given by Equation (1). Higher precision indicates fewer False Positives (FP).
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (1)$$
Recall is given by Equation (2). Higher recall indicates fewer False Negatives (FN). Recall is also known as the True Positive Rate (TPR).
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (2)$$
Precision and recall are not independent measures. Recall can easily be increased by labeling more sentences as humor, at the cost of precision, and vice versa. The F-score therefore combines precision and recall into a single balanced measure, given by Equation (3).
$$\text{F-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (3)$$
Accuracy is the proportion of sentences whose predicted label matches the true label. It is given by Equation (4).
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$
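The four measures can be computed from the TP, TN, FP and FN counts defined earlier, as in the following sketch. It uses the standard definitions with the balanced F-score; the function names, the label strings and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "humor") -> Dict[str, int]:
    """Count TP, TN, FP and FN, treating `positive` as the humor class."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


def _ratio(num: float, den: float) -> float:
    """Divide safely; an empty denominator yields 0.0 (assumed convention)."""
    return num / den if den else 0.0


def evaluate(y_true: Sequence[str], y_pred: Sequence[str],
             positive: str = "humor") -> Dict[str, float]:
    """Return precision, recall, F-score and accuracy for binary labels."""
    c = confusion_counts(y_true, y_pred, positive)
    precision = _ratio(c["TP"], c["TP"] + c["FP"])
    recall = _ratio(c["TP"], c["TP"] + c["FN"])
    f_score = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(c["TP"] + c["TN"], sum(c.values()))
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "accuracy": accuracy}
```

On a skewed dataset, accuracy alone can look high for a classifier that rarely predicts humor, which is one reason to report the F-score alongside it.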
This section describes the proposed humor identification framework and the feature groups it uses. The novelty of this work lies in using affect-based content from written text for humor identification. Humor identification in text is treated as a classification task in this work. The problem can be described as follows:
Given an unlabeled sentence represented by humor features h = (h1, h2, ..., hn), the task is to label the sentence as “humor” or “not-humor”.
Different sets of affect-based features are proposed to model humor identification in this work. The proposed humor identification model is evaluated on benchmark datasets and compared with state-of-the-art approaches. This assesses the robustness of the proposed model across datasets in which the samples of humorous text were collected using different criteria. The proposed work mainly focuses on the feature extraction step of the machine learning process. The flow diagram of the whole humor identification model, including the affect-based features, is given in Fig. 2.

Emotion representation on Russell’s [15] 2-dimensional emotion model.

Flow diagram of proposed humor identification model using affect-based features.
Several well-established classification algorithms, namely Support Vector Machine (SVM), Naïve Bayes (NB) and Multilayer Perceptron (MLP), are used in this work for classifying sentences. These classifiers were chosen because of their good performance in various text classification problems. Pre-processing steps such as stop-word removal and punctuation removal are not performed in this experiment, because certain sequences of such words and punctuation marks contribute to deciding the presence of humor. Citations, URLs and hashtags are removed from tweets as pre-processing when working with the SemEval-2017 dataset.
This work focuses on identifying humor in text, so after the initial pre-processing the feature extractor extracts a total of 53 features to predict the humor label. We observed different aspects of humor in text, such as the writer’s writing style, the use of special emotion words, and play on word synonyms and ambiguity. These 53 features are categorized into four groups: 1) affect-based features, 2) incongruity features, 3) stylistic features and 4) ambiguity features.
Among these feature sets, 40 affect-based features are proposed and implemented for humor identification. These affect-based features capture information related to the writer’s emotional state, mood and intention, as described in Section 2. The affect-based features comprise 24 features that exploit structure-based information from the text and 16 emotion-related features, which are extracted from two emotion lexicons, namely EmoLex [35] and EmoSenticNet [34]. The structure-based features and emotion features used as affect-based features are listed in Table 1.
Summary of proposed affect based features along with all other features used in humor identification task
Eight basic emotion categories are identified from EmoLex: anger, disgust, trust, anticipation, surprise, joy, fear and sadness. In addition, positive or negative emotion can also be identified from the text. This is useful because the occurrence of a positively labeled word in a negative context, or of a negatively labeled word in a positive context, is a sign of incongruity. In total, 10 types of features can therefore be extracted from the data using EmoLex, as shown in Table 1. EmoSenticNet is another emotion lexicon, which identifies six basic emotions: anger, disgust, fear, joy, sadness and surprise. If a word in the given sentence matches any word belonging to one of these emotion categories, that word contributes to the count for that category. The score of each emotion-based feature is calculated by summing the occurrences of words belonging to the respective emotion category (joy, anger, disgust, fear, trust, positive, etc.).
Incongruity features, stylistic features and ambiguity features are adapted from [38]. Incongruity features check for incongruous or incompatible words in the text. Ambiguity features are important for capturing humor, because humor arises in two cases: 1) when the text has different interpretations and 2) when those interpretations are opposed to each other. Disambiguation of words with multiple meanings is a crucial component of many jokes. Stylistic features are used to detect signatures, unexpectedness, the presence of smiley characters and the style of the writer, which are useful for identifying humor in a given text.
Linguistic structure-related information can also be used to identify the writer’s feelings by observing the punctuation marks, the length of the text, the use of particular part-of-speech tags and the use of rare words, as mentioned in Section 2. Table 1 lists all the features used in this work together with the category of each feature. Other lexical resources, such as WordNet and the American National Corpus, are also used in feature extraction. Written-spoken features and ambiguity-related features are extracted with the help of the American National Corpus.
Based on the scores of all features for the given text, the final score of the sentence is calculated. The classifier uses this score to decide whether the label for the sentence is “humor” or “not-humor”. If the label is “humor”, the sentence is ranked as 1 or 2 (from humorous to more humorous) based on the class probabilities.
For the evaluation of humor identification, datasets from different domains are selected, namely the SemEval-2017 Hashtag wars dataset and the Yelp review dataset.
The SemEval-2017 Hashtag wars dataset is a collection of English tweets released for SemEval-2017 Task 6 [39], taken from the #HashTagWars segment of the Comedy Central television show @midnight, for humor identification. It contains tweets on different hashtags given by the host. Participants tweet something that includes the given hashtag, and viewers also post their own tweets related to that hashtag. The funniest of these tweets is announced on the show. The training data consist of 101 files with a total of 11325 tweets, and the test data consist of 6 files with a total of 749 tweets. Each file contains the tweets for a particular hashtag; the test hashtags are referred to as Christmas, Shakespeare, Bad job, Break up, Broadway and Cereal in Table 2.
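The counting scheme for emotion-based features can be sketched as follows. The ten category names follow the EmoLex description above, but the two lexicon entries are hypothetical stand-ins, not real EmoLex annotations, and the whitespace tokenizer is a simplifying assumption.

```python
import string
from typing import Dict, Set

EMOLEX_CATEGORIES = (
    "anger", "anticipation", "disgust", "fear", "joy",
    "sadness", "surprise", "trust", "negative", "positive",
)

# Hypothetical word-to-category entries for illustration only.
TOY_LEXICON: Dict[str, Set[str]] = {
    "happy": {"joy", "positive"},
    "terrible": {"fear", "sadness", "negative"},
}


def emotion_features(text: str,
                     lexicon: Dict[str, Set[str]]) -> Dict[str, int]:
    """Sum, per category, the number of words in the text tagged with it."""
    counts = {category: 0 for category in EMOLEX_CATEGORIES}
    for token in text.lower().split():
        word = token.strip(string.punctuation)
        for category in lexicon.get(word, ()):
            if category in counts:
                counts[category] += 1
    return counts
```

A sentence that scores on both "positive" and "negative" under this scheme is a candidate for the kind of incongruity described above.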
Results (accuracy) of proposed humor identification model on Semeval2017 HashTagWars dataset
The Yelp review dataset was released as a challenge dataset. It consists of 1.6 million reviews by 366,000 users for 61,000 businesses. Each review consists of one or more sentences commenting on the business at hand, along with the votes given to the review by other users in the categories “funny”, “useful” and “cool”. We consider only the “funny” category. For this experiment, we used a sample of this dataset for evaluation and extracted features for 6000 sentences. Because the data are skewed, we used different proportions of training and testing data.
We performed experiments in the following settings to test our hypotheses and to explore the power of the proposed humor identification model for written text.
Hypothesis 1: How does the proposed humor identification model perform in comparison with other models in the literature?
Experiment 1: We train the humor identification model using simple classifier implementations from Weka, namely naïve Bayes, multilayer perceptron and support vector machine, with different combinations of feature sets, and compare its performance with other models in the literature in terms of accuracy and F-score. We compare the proposed model with the other systems that participated in SemEval 2017 HashTagWars [39]. We used a support vector machine with a radial basis function kernel. This experiment was conducted on a balanced Yelp review dataset (50–50% samples of each class). Oversampling and undersampling were also performed to balance the SemEval-2017 Hashtag wars dataset.
Hypothesis 2: does the proposed humor identification model benefit from affect-based features?
Experiment 2: We first set up the humor identification model with naïve Bayes, multilayer perceptron and support vector machine classifiers on each individual feature set, to see how effective each feature type is at identifying humor in written text. The results for the individual feature sets are compared to assess the effectiveness of the affect-based features. The performance of the proposed humor identification model with the different feature groups is shown in Table 4.
Performance (accuracy) comparison of proposed humor identification model with other models on Semeval2017 HashTagWars dataset
Results of different feature sets on yelp review dataset with presence of 50–50% of each humor/not-humor samples (balanced dataset)
All the experiments are conducted on 4GB RAM, 64-bit Intel core i5 2.5 GHz machine with Windows operating system. Moreover, the performance of proposed humor identification model is observed on different training /testing data split: 80% /20%, 50% /50%, 70% /30%. We used support vector machine in both the experimental settings with radial basis function kernel. Multilayer perceptron is built with 0.1 learning rate for back propagation algorithm and 500 training epochs.
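The balancing and splitting steps used in these experiments can be sketched in plain Python. This is an illustration only: the experiments themselves were run in Weka, and the random undersampling and oversampling strategy, the function names and the fixed seed are assumptions rather than the procedure actually used.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str]  # (sentence, label)


def _group_by_label(samples: Sequence[Sample]) -> Dict[str, List[Sample]]:
    groups: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        groups[sample[1]].append(sample)
    return groups


def undersample(samples: Sequence[Sample], rng: random.Random) -> List[Sample]:
    """Randomly shrink every class to the size of the smallest class."""
    groups = _group_by_label(samples)
    target = min(len(g) for g in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


def oversample(samples: Sequence[Sample], rng: random.Random) -> List[Sample]:
    """Grow every class to the size of the largest by sampling with replacement."""
    groups = _group_by_label(samples)
    target = max(len(g) for g in groups.values())
    balanced: List[Sample] = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def train_test_split(samples: Sequence[Sample], train_fraction: float,
                     rng: random.Random) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle a copy of the samples and cut it at the requested fraction."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

For example, `train_test_split(undersample(data, random.Random(0)), 0.8, random.Random(0))` would produce an 80%/20% split of a balanced sample. Oversampling before splitting can place copies of the same sentence in both parts, so in practice it is safer to oversample the training part only.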
We conducted experiments to study effectiveness of individual type of features as well as proposed humor identification model in comparison with other models in literature.
Experiment 1: the detailed result of running experiment-1 is presented in Table 2. It is carried out on separate 6 files of different hashtags of Semeval-2017 Hashtag wars dataset. The dataset was originally much skewed. We conducted experiment on original dataset as well as under sampled and oversampled dataset. The goal of conducting this experiment was to compare proposed humor identification model with other models in literature. The performance comparison of the proposed humor identification model with other models in literature is shown in Table 3. Best average accuracy of the proposed humor identification model, as shown in Table 2, is considered for comparison with other models. Proposed humor identification model with Support vector machine has scored highest accuracy 0.751 compared to other participating systems in SemEval-2017 on the original dataset. Table 3 represents the comparison of various humor identification model based on accuracy on SemEval 2017 hashtag wars dataset. First column in table represents the name of participating humor identification model in SemEval 2017. Second column represents the accuracy of each model on SemEval 2017 hashtag wars dataset for predicting humor/not-humor. Proposed humor identification model scored best among all participating teams [39].
The proposed model performs better (0.741) on tweets related to Shakespeare. Christmas movie, Shakespeare quote, celebrity/Broadway play, and cereal/song are hashtags that require extra knowledge to identify humor. For these hashtags, except for Broadway, the results are not promising, so additional features that capture the extra information related to these hashtags are needed. These results were obtained using all the features.
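Experiment 1 uses undersampled and oversampled versions of the skewed dataset. The text does not say which resampling method was applied, so the following is a minimal sketch that assumes plain random undersampling and random oversampling (with replacement); the function names and the `(text, label)` sample format are hypothetical.

```python
import random
from collections import defaultdict


def _group_by_label(samples):
    """Group (text, label) pairs by their label."""
    groups = defaultdict(list)
    for text, label in samples:
        groups[label].append((text, label))
    return groups


def undersample(samples, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


def oversample(samples, seed=0):
    """Randomly duplicate samples so every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing samples with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced
```

Undersampling discards data from the majority class, while oversampling repeats minority-class samples. In practice, resampling is usually applied to the training portion only, so that duplicated samples do not leak into the test set.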
Experiment 2: The results in Table 4 show the effectiveness of the proposed affect-based features for predicting humor/not-humor. Results in bold are the best among all feature sets for a particular train/test split of the dataset (80%/20%, 70%/30%, 50%/50%). The results obtained with all affect-based features are also shown in bold for every train/test split. Using all 40 affect-based features, the proposed model performed best with the support vector machine [40], with F-scores of 0.7051, 0.7044, and 0.7033 for the 80%/20%, 70%/30%, and 50%/50% training/testing splits, respectively. In this experiment, the ambiguity features gave the best result (0.7224) among all feature sets, which suggests that ambiguity is a good marker of the presence of humor in text.
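The accuracy and F-score values reported in the experiments can be computed from predicted and true labels as sketched below. This assumes the F-score is the standard F1 (harmonic mean of precision and recall) with "humor" as the positive class; the text does not specify the averaging scheme, and the function name and label strings are hypothetical.

```python
def binary_scores(y_true, y_pred, positive="humor"):
    """Return accuracy, precision, recall, and F1 for a binary task.

    Assumes F1 for the `positive` class; the paper does not state
    which F-score variant it reports.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

On a balanced dataset, accuracy and F1 tend to move together, whereas on a skewed dataset a classifier that always predicts the majority class can reach high accuracy with an F1 of zero for the minority class.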
Taken separately, the emotion-based and structure-based features did not perform very well in predicting humor/not-humor, but they proved comparatively effective when used together (all 40 affect features). Other humor identification models in the literature scored F-scores of 93.1% and 98.6% on puns and short jokes datasets [41]. We carried out this experiment on more realistic, user-generated data.
Conclusion
We addressed the problem of modeling humor identification using affect-based information conveyed in the target text. As directions of research toward the proposed affect-based humor identification in written text, we studied affect representation, the different kinds of affective information conveyed in written text, and the lexical resources available for identifying affect in text.
In this work, we focused on understanding figurative language through humor identification. We proposed and implemented a humor identification model based on a set of affect-based features (emotion-based features + structure-based features) and 13 other features. Evaluation of the proposed model on two very different kinds of datasets under various experimental conditions demonstrates its robustness. In the experiments, the proposed humor identification approach achieved competitive performance. Across the different experimental setups, the ambiguity features, which belong to the set of other features, performed best among all feature sets. The support vector machine achieved competitive results in predicting humor/not-humor on the SemEval-2017 dataset.
Based on this work, we identified the following possible directions of research: 1) exploring other figurative devices, like humor, in written natural language text; and 2) exploring emoji/emoticons and other recent modalities used in text to express emotions for this task.
