Cyberbullying detection through deep learning: A case study of Turkish celebrities on Twitter

Abstract

One of the ways celebrities maintain their fame in the modern era is by posting updates and photos to social media platforms such as Twitter, Instagram, and Facebook. Comments left on their posts, however, expose them to cyberbullying. Cyberbullying, a form of harassment carried out through electronic devices, negatively impacts the lives of individuals. Thirty famous people from the fields of acting, art, music, politics, sports, and writing were chosen for this research: the five most-followed Twitter users in Turkey in each of these six groups. Replies to each celebrity's tweets were collected between December 2019 and December 2020. Using a deep learning model, we detected abusive content with an accuracy of 89%. We also report the percentage of replies containing cyberbullying for each celebrity and professional group.

Keywords

Cyberbullying, deep learning, Twitter, long short-term memory, social media analysis

1. Introduction

With the Internet now an indispensable part of our lives, personal social media accounts are used more actively than ever. Although people's desire to share their opinions and interests has made social media an important source of information, the ideas shared through these accounts are not always received positively. In some cases they attract hostile comments, which we refer to as cyberbullying. Cyberbullying is defined as deliberate and continuous aggressive actions taken against vulnerable persons through electronic means such as the internet, e-mail, text messages, blogs, and social media messages [1]. In general, the most common acts of cyberbullying involve issues such as sexuality, gender difference, disability, racism, terrorism, personal character, belief, behavior, appearance, and weight [2].

Cyberbullying is a common occurrence on the internet, and politically oriented views are a frequent target. Various studies estimate that 10% to 40% of internet users are victims of cyberbullying [3], which is a remarkably high rate. Several studies apply artificial intelligence to detect cyberbullying and help mitigate this problem. In this study, cyberbullying is detected in Twitter data through deep learning. First, the five most-followed people in Turkey were selected from each of six professional groups: actors, artists, politicians, singers, sportspeople, and writers. The tweets shared by these people receive many replies. A dataset was created by collecting the replies to each user's tweets between December 2019 and December 2020. In this way, a total of 2,741,848 tweets were obtained for 30 Twitter users.

The study is conducted with Turkish tweets. To the best of our knowledge, few Turkish Twitter datasets are available: although English studies and datasets are plentiful, Turkish sources are very limited. The study is therefore notable for working with nearly 3 million tweets in Turkish. In addition, we observed that the most-followed celebrities on Twitter are exposed to cyberbullying. To our knowledge, no such study on Turkish celebrities has been carried out before. The results are also presented statistically: the share of replies classified as cyberbullying ranges from 0.24% to 3.48% per celebrity, which can be considered a high level of exposure.

Cyberbullying is a common problem that primarily affects people who use social media. Even though it is not physical, being subjected to cyberbullying has a negative impact on people. It causes psychological symptoms such as sadness, loneliness, low self-esteem, suicidal ideation, and depression [4]. Cyberbullying is considered a crime and has legal consequences. In this study, we find that celebrities are exposed to a considerable amount of cyberbullying. Politicians in particular receive hundreds to thousands of cyberbullying tweets. Other groups, such as singers, actors, and writers, are also heavily exposed.

In this study, labeled datasets from previous studies were used. The training data were obtained by combining the data of two different studies [4,5]. Before applying deep learning models, we trained shallow classifiers on the data: Naive Bayes, AdaBoost, Random Forest, and Decision Tree. Among these, Decision Tree achieved the highest accuracy, 86%. A deep learning approach was then applied to the same labeled data, consisting of 14,114 tweets, and an accuracy of 89% was obtained. Cyberbullying was detected by applying the resulting model to the replies collected for all professional groups. As a result, we observed that, on average, 1.66% of the replies to the tweets shared by these groups contain cyberbullying.

The remainder of the paper is organized as follows. Section 2 provides a brief review of the related work. In Section 3, materials and methods for the study are introduced with technical details. The result of the cyberbullying detection for Turkish celebrities is presented in Section 4 with related accuracy and the F1 Score of the model. Finally, Section 5 provides a discussion about the study, and concluding remarks are sketched in Section 6.

Table 1
Summary table: studies of cyberbullying detection with social media data

Paper	DataSet	Type	Algorithm used	Max accuracy
Xujuan Zhou et al. [6]	5,700 tweets	ML	Tweet Sentiment Analysis Model, MAchine Learning for LanguagE Toolkit	–
Diri et al. [7]	6,800 tweets	ML	Dictionary, N-gram	72.70% (N-gram)
Öztürk [8]	15,658 tweets	ML	Naïve Bayes, SVM, Decision Tree, J48, Random Forest	91.30% (Naive Bayes)
Çürük [9]	13,158 Formspring.me | 1,753 Myspace | 3,463 YouTube	ML	ANN	–
Bandeh Ali Talpur et al. [10]	24,189 tweets	ML	Naïve Bayes, SVM, Decision Tree, Random Forest, KNN	91.15% (Random Forest)
Balakrishnan et al. [11]	5,453 tweets	ML	Naïve Bayes, Random Forest, J48	92.88% (J48)
Muneer et al. [12]	37,373 tweets	ML	Logistic Regression, LGBM, SGD, Random Forest, SVM, AdaBoost, Naïve Bayes	90.57% (Logistic Regression)
Agrawal et al. [13]	2,100,000 tweets	DL	RNN	90% (RNN)
Al-Ajlan et al. [14]	20,000 tweets	DL	CNN	–
Chatzakou et al. [15]	12,000 Formspring.me | 16,000 tweets | 100,000 Wikipedia	DL	CNN, LSTM, BLSTM, BLSTM	–
Sadiq et al. [16]	20,000 tweets	DL	Combination of CNN-LSTM and CNN-BiLSTM	92% (Combination)
Gamback et al. [17]	6,655 tweets	DL	CNN	86.68% (CNN)
Aroyehun et al. [18]	17,174 facebook posts	DL	CNN, LSTM, BiLSTM, and combinations thereof	–
Chatzakou et al. [19]	1,600,000 tweets	RS	extracting text, user and network-based attributes	91.25%

ML: Machine Learning, DL: Deep Learning, RS: Robust Statistics

2. Related work

Numerous studies in natural language processing and data mining have been conducted using Twitter data, including sentiment analysis and cyberbullying detection. We have arranged the studies in the literature by their distinguishing characteristics and presented them in Table 1. Xujuan Zhou et al. [6] proposed a method that integrates opinion mining and context-based topic modeling to analyze Twitter data containing public opinions on social media. The study was conducted using tweets from the 2010 Australian Federal Election. Diri et al. [7] performed sentiment analysis on Twitter: tweets captured with a certain keyword were automatically tagged as positive, negative, or neutral with both a dictionary and an n-gram model. The dictionary and character-based n-gram methods were approximately 70% and 69% successful, respectively. Öztürk [8], in a thesis on cyberbullying detection for Turkish, created the largest Turkish dataset to date for detecting cyberbullying texts and showed the effects of preprocessing, feature selection, and classifiers on detection. In that study, preprocessing was applied, and information gain and filter-based methods such as chi-square were used for feature selection. Among the classifiers, Naive Bayes was found to be the most successful method for detecting cyberbullying in texts written in Turkish. In another thesis, Çürük [9] investigated the effects of preprocessing, feature extraction, feature selection, and classification methods on cyberbullying detection with artificial intelligence algorithms. For classification, different classifiers based on Artificial Neural Networks (ANN) were proposed and their results compared. The Chi2, RFE, MRMR, and ReliefF algorithms were proposed for feature selection. Bandeh Ali Talpur et al. [10] used Twitter data to detect cyberbullying with a supervised machine learning approach. They applied embedding, sentiment, and lexicon features along with PMI-semantic orientation. The extracted features were used with Naïve Bayes, KNN, Decision Tree, Random Forest, and Support Vector Machine algorithms. Balakrishnan et al. [11] tried to detect cyberbullying by taking advantage of the psychological characteristics of Twitter users. User personalities were determined using the Big Five and Dark Triad models, while Naïve Bayes, Random Forest, and J48 machine learning classifiers were used to classify tweets into one of four categories. The results show that cyberbullying detection improved when personalities and sentiments were used, but no similar effect was observed for emotion. Muneer et al. [12] experimented with 7 machine learning algorithms on a dataset of 37,373 unique tweets. These algorithms are Naive Bayes, Random Forest, Logistic Regression, Support Vector Machine (SVM), Stochastic Gradient Descent (SGD), AdaBoost, and Light Gradient Boosting Machine (LGBM). F1 score, accuracy, precision, and recall values were reported for all algorithms.

Agrawal et al. [13] conducted the first systematic analysis of cyberbullying detection using data from various social media platforms. Their dataset consists of 12 thousand Formspring, 16 thousand Twitter, and over 100 thousand Wikipedia posts. In another deep learning study, Al-Ajlan et al. [14] used Twitter data. Explicit feature extraction methods were not used; word vectors were used instead, with the aim of preserving the meaning of words. In another cyberbullying study on social media, Chatzakou et al. [15] analyzed 2.1 million tweets from 1.2 million Twitter users, comparing discussions on normal topics with more specifically selected hate-related topics. Sadiq et al. [16] applied a multi-layer perceptron approach alongside a combination of CNN-LSTM and CNN-BiLSTM deep learning models. The reported results showed that the model works with 92% accuracy. Gamback et al. [17] trained two CNN models based on different input vector sets that were fed to the neural networks for training and classification. Word vectors based on semantic information were built with an unsupervised method, word2vec, and compared to a randomly generated vector baseline. Aroyehun et al. [18] used Facebook posts to develop a baseline model and a number of deep neural network models. They experimented with deep learning models of varying complexity: CNN, LSTM, BiLSTM, CNN-LSTM, LSTM-CNN, CNN-BiLSTM, and BiLSTM-CNN. Chatzakou et al. [19] sought to detect bullying and aggressive behavior using Twitter data. To distinguish aggressive users from normal users, they proposed a robust methodology that extracts text, user, and network-based attributes.

Our study is based on the replies sent to the most-followed celebrities on Twitter: 30 people from different professions who are the most followed in Turkey. Labeled data from previous studies were used to train both machine learning and deep learning models, and the trained model was applied to the Twitter data of the 30 celebrities. To the best of our knowledge, it is the first study to analyze nearly 3 million Turkish-language tweets addressed to the 30 most-followed celebrities. The results of the study are also presented statistically.

3. Materials and methods for study

3.1. Data accessibility

The training Twitter dataset was created by combining the labeled datasets of two different studies. It consists of 14,114 tweets, each labeled 1 (bullying) or 0 (not bullying). 80% of the dataset was used for training and 20% for testing. To select the accounts on which cyberbullying would be detected, the most popular celebrities on Twitter were identified using the Socialbakers website, which keeps social media statistics [20]. The data of these Twitter users were collected with an application called “twint”, an advanced tool written in Python that retrieves users’ tweets and replies without the need for an API [21]. Replies to the identified users’ tweets were collected from December 2019 to December 2020. Table 2 shows how many tweets were obtained for each Twitter user. In total, 2,741,848 tweets were collected for the 30 Twitter users.
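The 80/20 split described above can be sketched in a few lines of standard-library Python. This is only an illustration: the function name `train_test_split`, the fixed seed, and the use of a plain shuffle (rather than, say, a stratified split) are assumptions, since the text does not say how the split was drawn.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle the rows and split them into (train, test) lists.

    `seed` and the plain shuffle are illustrative assumptions; the paper
    only states that 80% was used for training and 20% for testing.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


# Stand-in for the 14,114 labeled tweets (indices only, no real data).
labeled = list(range(14114))
train, test = train_test_split(labeled)
print(len(train), len(test))
```

With 14,114 labeled tweets this yields 11,291 training and 2,823 test examples.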

Table 2
Tweet reply counts for 6 professional groups

Type	Person	Tweets
Writers	Metin Uca	37,075
Writers	Ayşe Arman	704
Writers	Ahmet Hakan	61,497
Writers	Cüneyt Özdemir	22,932
Writers	Uğur Dündar	74,300
Politics	Recep Tayyip Erdoğan	831,317
Politics	Fahrettin Koca	894,765
Politics	Kemal Kılıçdaroğlu	215,583
Politics	Ahmet Davutoğlu	173,126
Politics	Abdullah Gül	15,680
Sports	Nuri Şahin	411
Sports	Gökhan Töre	4,411
Sports	Gökhan Gönül	11,233
Sports	Burak Yılmaz	2,465
Sports	Arda Turan	17,897
Actors	Okan Bayülgen	5,505
Actors	Levent Üzümcü	88,455
Actors	Cem Yılmaz	60,762
Actors	Hülya Avşar	2,470
Actors	Ata Demirer	6,744
Singers	Sıla Gencoğlu	2,878
Singers	Gülben Ergen	6,352
Singers	Demet Akalın	184,477
Singers	Tarkan	62
Singers	Mustafa Ceceli	283
Artists	Tolga Çevik	233
Artists	Serkan Altuniğne	10,930
Artists	Erdil Yaşaroğlu	1,967
Artists	Bedri Baykam	2,521
Artists	Ali Sunal	4,813

3.2. Preprocessing

Content shared on social media often does not follow grammar rules, so preprocessing is essential before working with the data. Figure 1 shows a reply to a tweet after preprocessing. Twitter data fetched with the Twint tool arrives in a noisy format. After the date, hashtags, Twitter usernames, and image and video links had been removed from the dataset, the stop words were removed and all Twitter content was converted to lowercase. All preprocessed Twitter data were saved to a file in txt format. The same procedure was applied to the labeled training data.
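The cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the authors' script: the regular expressions, the function name `preprocess`, and the tiny `STOP_WORDS` set are assumptions. The date is a separate field in the scraped records rather than part of the tweet text, so it is not handled here. Lowercasing is done before stop-word removal so that capitalized stop words are also matched, and the Turkish dotted/dotless "i" pair is mapped explicitly because Python's default `str.lower()` does not follow Turkish casing rules.

```python
import re

# Hypothetical, deliberately tiny stop-word sample; a real list would be longer.
STOP_WORDS = {"ve", "bir", "bu", "da", "de"}

URL_RE = re.compile(r"https?://\S+|pic\.twitter\.com/\S+")  # links to pages, images, videos
MENTION_RE = re.compile(r"@\w+")                             # Twitter usernames
HASHTAG_RE = re.compile(r"#\w+")                             # hashtags


def turkish_lower(text):
    """Lowercase with Turkish casing: 'İ' -> 'i' and 'I' -> 'ı'."""
    return text.replace("İ", "i").replace("I", "ı").lower()


def preprocess(tweet):
    """Strip links, mentions and hashtags, lowercase, and drop stop words."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = turkish_lower(text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(preprocess("@kullanici Bu film ÇOK güzel #sinema https://t.co/abc123"))
```

For the sample reply above, the function returns `film çok güzel`.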

3.3. Proposed model

The purpose of the proposed approach is first to build a deep learning model with the labeled Twitter data and then to apply this model to the Twitter data that we gathered. The labeled data are divided into two classes, bullying and not bullying. First, the 14,114 labeled tweets were used to train the machine learning classifiers Naive Bayes, AdaBoost, Random Forest, and Decision Tree. Among these classifiers, Decision Tree achieved the highest accuracy, 86%. Naive Bayes and AdaBoost reached 84% accuracy, and Random Forest 85%. The same labeled data were then used for deep learning.
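To make one of the shallow baselines concrete, the following is a minimal multinomial Naive Bayes sketch with add-one smoothing, written with the standard library only. It is not the authors' implementation (the text does not name the library or the features used), and the toy English tokens and labels are hypothetical placeholders for the labeled Turkish tweets.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class document and word counts from tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:                         # ignore unseen words
                continue
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: 1 = bullying, 0 = not bullying.
docs = [["you", "are", "stupid"], ["great", "song"],
        ["stupid", "idiot"], ["love", "this", "song"]]
labels = [1, 0, 1, 0]
model = train_nb(docs, labels)
print(predict_nb(model, ["idiot"]), predict_nb(model, ["love"]))
```

The same train/predict interface could be reused to compare other classifiers on identical features, which is the kind of comparison reported in this section.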

Fig. 1. Preprocessing steps.

Fig. 2. System architecture.

In the deep learning system architecture shown in Fig. 2, Word2Vec, one of the most popular feature extraction techniques, is used to capture the context of a word in a sentence or document.

The architecture of the proposed LSTM (Long Short-Term Memory) model is shown in Fig. 3. The model consists of an embedding layer, an LSTM layer, and a Dense (fully connected) layer with sigmoid as the activation function. In addition, dropout and batch normalization are added between layers to prevent overfitting. Long Short-Term Memory networks, often called LSTMs, are a type of RNN that can learn long-term dependencies. The LSTM was introduced by Hochreiter & Schmidhuber and is especially popular in the field of text mining.
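To illustrate how an LSTM carries information across time steps, the sketch below implements one step of the standard LSTM cell equations for scalar inputs and states. It is a didactic simplification rather than the model used in the study: real layers operate on vectors and weight matrices, and the function and parameter names here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a scalar LSTM cell.

    `weights` maps each gate name to a tuple (w_x, w_h, b).
    Returns the new hidden state h and cell state c.
    """
    def gate(name, activation):
        w_x, w_h, b = weights[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the candidate to write
    g = gate("candidate", math.tanh)   # candidate cell content
    o = gate("output", sigmoid)        # how much of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# With all-zero weights every sigmoid gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, weights=zero)
print(h, c)
```

The forget gate is what lets the cell state persist over many steps, which is the mechanism behind the long-term dependencies mentioned above.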

Fig. 3. Model structure.

3.4. Innovation

The sequential model was preferred because the output layer has a single output tensor. TensorFlow and Keras were used for modeling. We passed the tweets to the Tokenizer() function; tokenization means splitting a given sentence into an indexed or vectorized list of tokens. To pass inputs of the same size to the model, we used the pad_sequences() function. We used LSTM (Long Short-Term Memory), a modified and more sophisticated RNN architecture, for this model. LSTMs can model long-range dependencies with better accuracy than CNNs (Convolutional Neural Networks). Our model architecture has four main parts. First comes the embedding layer, followed by the LSTM layer with a dropout of 0.5. Third, a 0.2 Dropout, BatchNormalization(), and another 0.2 Dropout were added to avoid overfitting. Finally, a Dense (fully connected) layer with a sigmoid activation function was added to produce the final classification output.
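The tokenization and padding steps can be illustrated without any deep learning library. The sketch below mimics, in plain Python, what a word-index tokenizer and a padding function do; it is not the Keras implementation. The helper names are hypothetical, ties between equally frequent words are broken alphabetically here, and padding and truncation are applied at the front of each sequence, which is also the default behavior of Keras `pad_sequences`.

```python
from collections import Counter


def fit_word_index(texts):
    """Map each word to an integer; the most frequent word gets index 1.

    Index 0 is reserved for padding.
    """
    counts = Counter(tok for text in texts for tok in text.split())
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: idx for idx, word in enumerate(ordered, start=1)}


def texts_to_sequences(texts, word_index):
    """Replace each known word by its index; unknown words are skipped."""
    return [[word_index[tok] for tok in text.split() if tok in word_index]
            for text in texts]


def pad(sequences, maxlen, value=0):
    """Left-pad short sequences and keep the last `maxlen` items of long ones."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


texts = ["good game", "good good bad game today"]
index = fit_word_index(texts)
print(pad(texts_to_sequences(texts, index), maxlen=4))
```

After padding, every tweet is an integer vector of the same length, which is the input shape an embedding layer expects.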

The model was trained for 5 epochs and attained a validation accuracy of 89%. The LSTM layer was configured with 50 units and a dropout value of 0.5. A ReLU activation function was added between the dropout and batch normalization layers.

The resulting model was then applied to the data that we collected with the Twint tool, that is, the replies to the five most-followed people in each professional group. The cyberbullying exposure of these people is presented in the next part, the statistical evaluation.

Table 3
Twitter cyberbullying rate for 6 professional groups

Type	Person	Tweets	Bullying Tweets	Bullying rate
Writers	Metin Uca	37,075	1,291	3.48%
Writers	Ayşe Arman	704	4	0.57%
Writers	Ahmet Hakan	61,497	1,985	3.23%
Writers	Cüneyt Özdemir	22,932	312	1.36%
Writers	Uğur Dündar	74,300	804	1.08%
Sports	Nuri Şahin	411	1	0.24%
Sports	Gökhan Töre	4,411	42	0.95%
Sports	Gökhan Gönül	11,233	177	1.58%
Sports	Burak Yılmaz	2,465	78	3.16%
Sports	Arda Turan	17,897	435	2.43%
Singers	Sıla Gencoğlu	2,878	43	1.49%
Singers	Gülben Ergen	6,352	67	1.05%
Singers	Demet Akalın	184,477	3,230	1.75%
Singers	Tarkan *	62	0	0.00%
Singers	Mustafa Ceceli	283	2	0.71%
Politics	Recep Tayyip Erdoğan	831,317	3,866	0.47%
Politics	Fahrettin Koca	894,765	3,065	0.34%
Politics	Kemal Kılıçdaroğlu	215,583	5,832	2.71%
Politics	Ahmet Davutoğlu	173,126	5,983	3.46%
Politics	Abdullah Gül	15,680	252	1.61%
Actors	Okan Bayülgen	5,505	76	1.38%
Actors	Levent Üzümcü	88,455	1,930	2.18%
Actors	Cem Yılmaz	60,762	872	1.44%
Actors	Hülya Avşar	2,470	84	3.40%
Actors	Ata Demirer	6,744	39	0.58%
Artists	Tolga Çevik *	233	0	0.00%
Artists	Serkan Altuniğne	10,930	158	1.45%
Artists	Erdil Yaşaroğlu	1,967	59	3.00%
Artists	Bedri Baykam	2,521	55	2.18%
Artists	Ali Sunal	4,813	122	2.53%

* Twint was unable to extract enough data for this user.

3.5. Statistical evaluation

The test dataset was created by collecting the replies to each Twitter user’s tweets between December 2019 and December 2020. In this way, a total of 2,741,848 tweets were obtained for 30 Twitter users. Table 3 shows the number of replies to each Twitter user’s tweets and the number of those replies classified as bullying.

The statistical evaluation is expressed as a percentage. As seen in equation (1), the number of bullying tweets is divided by the total number of tweets and multiplied by 100.

$$\text{Bullying Rate} = \frac{\text{Bullying Tweets}}{\text{Tweets}} \times 100 \qquad (1)$$

This gives the percentage of replies to each celebrity that contain cyberbullying. The calculation serves two purposes: first, to identify the most cyberbullied celebrities, and second, to identify which professional group is exposed to more cyberbullying. According to Table 3, the three celebrities most exposed to cyberbullying are Metin Uca (3.48%), Ahmet Davutoğlu (3.46%), and Hülya Avşar (3.40%). By professional group, taking the mean of the members’ rates, the top three are Writers (1.94%), Artists (1.83%), and Actors (1.80%). Among the celebrities with sufficient data, the least exposed are Nuri Şahin (0.24%), Fahrettin Koca (0.34%), and Recep Tayyip Erdoğan (0.47%). The professional groups least exposed are Singers (1.00%), Sports (1.67%), and Politicians (1.72%).
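As a worked example of equation (1), the snippet below recomputes the rates for the Writers group from the counts in Table 3. The function names are illustrative. The group figure is computed as the unweighted mean of the members' rates, which is an inference from the reported numbers rather than something the text states explicitly; a pooled rate (all bullying tweets divided by all tweets in the group) is included for contrast.

```python
# (tweets, bullying tweets) for the Writers group, copied from Table 3.
WRITERS = {
    "Metin Uca": (37075, 1291),
    "Ayşe Arman": (704, 4),
    "Ahmet Hakan": (61497, 1985),
    "Cüneyt Özdemir": (22932, 312),
    "Uğur Dündar": (74300, 804),
}


def bullying_rate(tweets, bullying):
    """Equation (1): share of bullying tweets, in percent."""
    return 100.0 * bullying / tweets


def mean_rate(group):
    """Unweighted mean of the individual rates in a group."""
    rates = [bullying_rate(t, b) for t, b in group.values()]
    return sum(rates) / len(rates)


def pooled_rate(group):
    """All bullying tweets in the group divided by all tweets in the group."""
    total_tweets = sum(t for t, _ in group.values())
    total_bullying = sum(b for _, b in group.values())
    return bullying_rate(total_tweets, total_bullying)


print(round(bullying_rate(*WRITERS["Metin Uca"]), 2))
print(round(mean_rate(WRITERS), 2), round(pooled_rate(WRITERS), 2))
```

The unweighted mean reproduces the 1.94% reported for Writers, whereas the pooled rate comes out at roughly 2.24%, because accounts with many replies weigh more heavily in the pooled figure.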

In particular, writers are the most cyberbullied group, while singers are the least. Among politicians, government officials are less exposed to cyberbullying, whereas opposition politicians are more exposed. Fig. 4 shows the percentage of cyberbullying to which each celebrity is exposed.

Fig. 4. Bullying rate by celebrities.

4. Results

Machine learning algorithms were tried first in the study. Because similar machine learning studies already exist in the literature, deep learning was chosen as the main approach. In deep learning studies, CNNs are generally used for image processing, while LSTMs are used for text processing. For this reason, an LSTM was chosen in this study.

As seen in Table 4, the AUC value was calculated as 0.95. This is higher than the values of 0.943 in [22], 0.817 in [23], and 0.815 in [19]. In addition, the accuracy was 0.89 and the F1 score was 0.87 after 5 epochs. After the model was created, the tweets of 30 celebrities belonging to 6 professional groups were used as inputs to the model. The results for celebrities with fewer records were checked manually, and on this basis the model appears to yield correct outputs. The most cyberbullied group was writers, while singers were the least. The individuals most exposed to cyberbullying are Metin Uca (3.48%), Ahmet Davutoğlu (3.46%), and Hülya Avşar (3.40%).

Table 4
Accuracy, AUC, recall, precision and F1 score of proposed model

Accuracy AUC Recall Precision F1 score

Epoch 1/5 0.53 0.52 0.23 0.48 0.31

Epoch 2/5 0.66 0.72 0.45 0.68 0.54

Epoch 3/5 0.78 0.86 0.75 0.77 0.76

Epoch 4/5 0.85 0.93 0.81 0.85 0.83

Epoch 5/5 0.89 0.95 0.86 0.89 0.87

	Accuracy	AUC	Recall	Precision	F1 score
Epoch 1/5	0.53	0.52	0.23	0.48	0.31
Epoch 2/5	0.66	0.72	0.45	0.68	0.54
Epoch 3/5	0.78	0.86	0.75	0.77	0.76
Epoch 4/5	0.85	0.93	0.81	0.85	0.83
Epoch 5/5	0.89	0.95	0.86	0.89	0.87
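The F1 column of Table 4 can be cross-checked from the precision and recall columns, since F1 is the harmonic mean of the two. The snippet below is a small consistency check with values copied from Table 4; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for epochs 1 to 5, copied from Table 4.
TABLE_4 = [
    (0.48, 0.23, 0.31),
    (0.68, 0.45, 0.54),
    (0.77, 0.75, 0.76),
    (0.85, 0.81, 0.83),
    (0.89, 0.86, 0.87),
]

recomputed = [round(f1_score(p, r), 2) for p, r, _ in TABLE_4]
print(recomputed)
```

Rounded to two decimals, the recomputed values agree with the reported F1 scores for all five epochs.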

5. Discussion

The present study detects cyberbullying against the Turkish celebrities who have the most followers. To do this, we used the Twint tool to fetch 2,741,848 tweets sent in reply to 30 celebrities. The study is significant because, to our knowledge, this kind of cyberbullying detection had not been done before for the most-followed celebrities in Turkey. Another notable aspect of the study is the large amount of data collected from Twitter.

For training and testing, we used an existing labeled dataset from other machine learning studies. The labeled dataset contains 14,114 tweets. In the preprocessing stage, we used the word2vec method to vectorize tweets. Word2vec is based on two model architectures, Skip-gram and Continuous Bag of Words (CBOW).
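The difference between the two word2vec architectures is easiest to see in the training examples each one consumes. The sketch below only generates those examples; it does not train embeddings, and the function names and window size are illustrative assumptions. Skip-gram predicts each context word from the target word, while CBOW predicts the target word from its surrounding context.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs: one pair per neighbor within the window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples: one example per position."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, target))
    return examples


tokens = ["film", "çok", "güzel"]
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

In both cases the learned word vectors are the weights of the model trained on these examples, so words that occur in similar contexts end up with similar vectors.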

LSTM was preferred because it is the most widely used deep learning architecture in text classification studies. An extra dropout layer was added to randomly drop some neurons, followed by batch normalization, which was preferred for its regularizing effect. “ReLU” was preferred as the activation function. We also tried the “sigmoid” and “softmax” activation functions: with softmax the accuracy remained low, while with sigmoid the validation loss did not fall to the desired level. The ReLU activation function gave more accurate results than both. When the model was compiled, “binary cross-entropy” was selected as the loss function and “adam” as the optimizer.
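For reference, the binary cross-entropy loss named above can be written out directly. The sketch below is a plain-Python version of the standard formula averaged over a batch; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero, and the function name is illustrative.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Labels: 1 = bullying, 0 = not bullying; probabilities from a sigmoid output.
print(binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6]))
```

The loss is small when the predicted probability is close to the true label and grows sharply for confident wrong predictions, which is why it pairs naturally with a single sigmoid output unit.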

As we applied the model to the celebrities’ Twitter data, we also inspected and evaluated the outcomes. The flagged replies contained profanity and insulting expressions in Turkish. The greatest number of tweets containing cyberbullying is directed at politicians, whereas the highest proportion is directed at writers. Politicians receive the most interaction on Twitter; they are not, however, the professional group most vulnerable to cyberbullying. Cyberbullying affects opposition politicians more than government officials. If we evaluate cyberbullying independently of professional groups, opposition figures are, on average, more susceptible to it. The cyberbullying rate for individual celebrities ranges from 0.24% to 3.48%. When examining the results, we do not identify any celebrity who is subjected to cyberbullying to a markedly greater degree than the others; in general, the values are relatively close.

6. Conclusion

As the popularity of social media platforms continues to rise, it has become clear that many fake accounts are being used to spread upsetting cyberbullying content. Cyberbullying content classification algorithms that work in the background are needed to stop this from happening. A number of studies in the literature address this problem and propose potential solutions, but not nearly enough of them examine Turkish social media messages. This is due to a lack of properly labeled social media data in Turkish. As can be seen in the section devoted to related work, the majority of cyberbullying research has been conducted in English. Our research began by employing pre-existing labeled datasets to train the deep learning LSTM model. Different machine learning models, including Naive Bayes, AdaBoost, Random Forest, and Decision Tree, were also tested on the labeled dataset. We found that the deep learning classifier had a better accuracy rate than the machine learning alternatives. Applying the resulting deep learning model to Twitter data collected with Twint, we presented a number of statistical findings. We found that writers were more likely to be targeted by cyberbullies than people in any other group. Still, the results suggest that anybody can be a target of cyberbullying to some extent. Research on artificial intelligence for prevention is required alongside cyberbullying monitoring efforts.

Acknowledgement

The Turkish Celebrities Cyberbullying DataSet can be found at GitHub via the following link https://github.com/BulutKaradag/Turkish-Celebrities-Cyberbullying-DataSet.

References

1. Snakenborg, R.V. Acker and R.A. Gable, Cyberbullying: Prevention and intervention to protect our children and youth, Preventing School Failure: Alternative Education for Children and Youth 55(2) (2011), 88–95. doi:10.1080/1045988X.2011.539454.

2. von Marées and Petermann, Cyberbullying: An increasing challenge for schools, School Psychology International 33(5) (2012), 467–476. doi:10.1177/0143034312445241.

3. Whittaker and R.M. Kowalski, Cyberbullying via social media, Journal of School Violence 14(1) (2015), 11–29. doi:10.1080/15388220.2014.949377.

4. S.A. Özel, Saraç, Akdemir and Aksu, Detection of cyberbullying on social media messages in Turkish, in: 2017 International Conference on Computer Science and Engineering (UBMK), 2017, pp. 366–370. doi:10.1109/UBMK.2017.8093411.

5. Minus100DataScience, Minus 100 Ekibi Türkçe Doğal Dil İşleme Yarışması Projesi, Açık Kaynak Platformu (2020), https://cutt.ly/LjYNcmL.

6. Zhou, Tao, M.M. Rahman and Zhang, Coupling topic modelling in opinion mining for social media analysis, in: Proceedings of the International Conference on Web Intelligence, WI’17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 533–540. ISBN 9781450349512. doi:10.1145/3106426.3106459.

7. B.D. Eyüp, Sercan Akgül Caner Ertano, Sentiment analysis with Twitter, Pamukkale University Journal of Engineering Sciences 22(2) (2016), 106–110. doi:10.5505/pajes.2015.37268.

8. Öztürk, Cyberbullying Detection using Text Classification for Turkish Language, Yüksek Öğretim Dergisi (2019).

9. Çürük, Cyberbullying detection and classification with artificial intelligence algorithms in social network, Yüksek Öğretim Dergisi (2018).

10. D.O. Bandeh Ali Talpur, Cyberbullying severity detection: A machine learning approach, PLoS ONE 15(10) (2020).

11. Balakrishnan, Khan and H.R. Arabnia, Improving cyberbullying detection using Twitter users’ psychological features and machine learning, Computers & Security 90 (2020), 101710, https://www-sciencedirect-com.web.bisu.edu.cn/science/article/pii/S0167404819302470 . doi:10.1016/j.cose.2019.101710.

12. S.M. Muneer and Fati, A Comparative Analysis of Machine Learning Techniques for Cyberbullying Detection on Twitter, Future Internet 2020 12(187) (2020).

13. Agrawal and Awekar, Deep Learning for Detecting Cyberbullying Across Multiple Social Media Platforms, 2018, CoRR, http://arxiv.org/abs/1801.06482. arXiv:1801.06482.

14. M.A. Al-Ajlan and Ykhlef, Optimized Twitter cyberbullying detection based on deep learning, in: 2018 21st Saudi Computer Society National Computer Conference (NCC), 2018, pp. 1–5. doi:10.1109/NCG.2018.8593146.

15. Chatzakou, Leontiadis, Blackburn, E.D. Cristofaro, Stringhini, Vakali and Kourtellis, Detecting Cyberbullying and Cyberaggression in Social Media, 2019, CoRR, http://arxiv.org/abs/1907.08873. arXiv:1907.08873.

16. Sadiq, Mehmood, Ullah, Ahmad, G.S. Choi and B.-W. On, Aggression detection through deep neural model on Twitter, Future Generation Computer Systems 114 (2021), 120–129, https://www-sciencedirect-com.web.bisu.edu.cn/science/article/pii/S0167739X19330717 . doi:10.1016/j.future.2020.07.050.

17. Gambäck and U.K. Sikdar, Using convolutional neural networks to classify hate-speech, in: Proceedings of the First Workshop on Abusive Language Online, Vancouver, BC, Canada, Association for Computational Linguistics, 2017, pp. 85–90, https://www.aclweb.org/anthology/W17-3013 . doi:10.18653/v1/W17-3013.

18. S.T. Aroyehun and Gelbukh, Aggression detection in social media: Using deep neural networks, data augmentation, and pseudo labeling, in: Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), Santa Fe, New Mexico, USA, Association for Computational Linguistics, 2018, pp. 90–97, https://www.aclweb.org/anthology/W18-4411 .

19. Chatzakou, Kourtellis, Blackburn, E.D. Cristofaro, Stringhini and Vakali, Mean Birds: Detecting Aggression and Bullying on Twitter, 2017, CoRR, http://arxiv.org/abs/1702.06877. arXiv:1702.06877.

20. Socialbakers, Twitter Statistics, 2021. https://cutt.ly/Mx3sGgK.

21. T. project, OSINT, 2021. https://github.com/twintproject/.

22. M.A. Al-garadi, K.D. Varathan and S.D. Ravana, Cybercrime detection in online communications, Comput. Hum. Behav. 63(C) (2016), 433–443. doi:10.1016/j.chb.2016.05.051.

23. Dani, Li and Liu, Sentiment informed cyberbullying detection in social media, in: Machine Learning and Knowledge Discovery in Databases, Ceci, Hollmén, Todorovski, Vens and Džeroski, eds, Springer International Publishing, Cham, 2017, pp. 52–67. ISBN 978-3-319-71249-9. doi:10.1007/978-3-319-71249-9_4.