Classification Cyber Harassment on Twitter using Multinomial Naïve Bayes

Ria Dhea Layla Nur Karisma

Abstract


Multinomial Naïve Bayes is a Naïve Bayes classification method based on Bayes' theorem and the multinomial distribution. It works well for multiclass classification of text data: it scores each class by multiplying the class prior probability by the likelihood of each word occurring in that class. Cyber Harassment is defined as the use of technology to harm or humiliate people and comprises four types of behavior: Physical Threats, Purposeful Embarrassment, Racist, and Sexual Harassment. The number of Cyber Harassment cases increases every year even though the government has introduced policies to deal with them. This study aims to classify the types of Cyber Harassment on Twitter accurately using the Multinomial Naïve Bayes method. In the classification results, 20 tweets were classified as Physical Threats, 10 as Purposeful Embarrassment, 25 as Racist, and 22 as Sexual Harassment. The classification accuracy is 77%, and the model performance test with K-fold cross-validation yields 76.21%, indicating that the Multinomial Naïve Bayes method classifies the types of Cyber Harassment on Twitter effectively.
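The scoring rule described in the abstract (class prior multiplied by per-word likelihoods) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy documents, the function names `train` and `predict`, and the use of Laplace smoothing with `alpha=1.0` are all assumptions made for illustration, and the products are computed as sums of logarithms to avoid numerical underflow.

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods per class.

    alpha is an assumed Laplace smoothing constant, not taken from the paper.
    """
    vocab = sorted({w for d in docs for w in d.split()})
    class_docs = Counter(labels)              # number of documents per class
    word_counts = defaultdict(Counter)        # word frequencies per class
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_like

def predict(text, log_prior, log_like):
    """Return the class with the highest log prior + summed log likelihoods."""
    scores = {}
    for c in log_prior:
        # Words outside the training vocabulary are ignored.
        scores[c] = log_prior[c] + sum(
            log_like[c][w] for w in text.split() if w in log_like[c]
        )
    return max(scores, key=scores.get)

# Hypothetical toy data; two of the four class names from the abstract.
docs = [
    "hurt you tonight",
    "find you hurt",
    "laugh at fail",
    "everyone laugh at you",
]
labels = [
    "Physical Threats",
    "Physical Threats",
    "Purposeful Embarrassment",
    "Purposeful Embarrassment",
]

log_prior, log_like = train(docs, labels)
print(predict("hurt you", log_prior, log_like))
print(predict("laugh at you", log_prior, log_like))
```

With equal priors, the decision here is driven entirely by which class has seen the query words more often. A real experiment would also need the preprocessing and evaluation steps (accuracy, K-fold cross-validation) that the abstract mentions but does not detail.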






DOI: https://doi.org/10.18860/mat.v17i2.29866





Copyright (c) 2025 Ria Dhea Layla Nur Karisma

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


 

_______________________________________________________________________________________________________________

Editorial Office:
Informatics Engineering Department
Faculty of Science and Technology
Universitas Islam Negeri Maulana Malik Ibrahim Malang
Jalan Gajayana 50 Malang, Jawa Timur, Indonesia 65144
Email: matics@uin-malang.ac.id
_______________________________________________________________________________________________________________

© All rights reserved 2015. MATICS, ISSN: 1978-161X | e-ISSN: 2477-2550