Preliminary Evaluation of Gaussian Naive Bayes for Multi-Label Hate Speech and Abusive Language Detection on Indonesian Twitter
DOI:
https://doi.org/10.62504/jimr532
Keywords:
Gaussian Naïve Bayes, Hate speech, Cyberbullying, TF-IDF, BERT
Abstract
Automatic detection of hate speech and abusive language is crucial for combating online toxicity. This study evaluates Gaussian Naive Bayes for multi-label classification of hate speech on Indonesian Twitter, covering the target, category, and level of hate speech. We combined TF-IDF features with contextual BERT embeddings. The model achieved balanced performance on general hate speech and detected non-abusive language well. However, it was limited by imbalanced data and struggled with specific hate speech types. Across labels, the classifier consistently favored the majority class (non-hateful/non-abusive), and it performed worst on rarer labels such as HS_Gender and HS_Physical. This suggests difficulty detecting less frequent but potentially severe hate speech, likely due to limited training data for those labels. Overall accuracy and F1-scores indicate that Gaussian Naive Bayes, while efficient, lacks robustness for nuanced multi-label classification on imbalanced datasets. Alternative approaches are needed to detect specific and less frequent hate speech effectively.
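The setup described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything below is an assumption made for illustration: one binary Gaussian Naive Bayes model is trained per label (a binary-relevance scheme), the feature vector is a plain concatenation of a TF-IDF part and an embedding part, and the toy numbers merely stand in for real TF-IDF weights and BERT embeddings. The helper names (`fit_gnb`, `predict_gnb`, `fit_multilabel`, `predict_multilabel`) and the `var_floor` value are hypothetical.

```python
import math

def fit_gnb(X, y, var_floor=1e-3):
    """Fit one binary Gaussian NB model: per-class prior, feature means, variances.

    var_floor is an illustrative constant added to every variance so that
    tiny classes do not produce zero variances (an assumption of this sketch).
    """
    n_features = len(X[0])
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + var_floor
            for j in range(n_features)
        ]
        model[c] = (len(rows) / len(X), means, variances)
    return model

def predict_gnb(model, x):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(c):
        prior, means, variances = model[c]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            for xj, m, v in zip(x, means, variances)
        )
        # The prior is the class frequency in the training data.
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

def fit_multilabel(X, Y):
    """Train one independent binary model per label (binary relevance)."""
    return {name: fit_gnb(X, y) for name, y in Y.items()}

def predict_multilabel(models, x):
    """Predict every label for one feature vector."""
    return {name: predict_gnb(m, x) for name, m in models.items()}

# Toy stand-ins: two "TF-IDF" dimensions and two "embedding" dimensions per tweet.
tfidf_part = [[0.90, 0.10], [0.80, 0.20], [0.10, 0.90],
              [0.20, 0.80], [0.85, 0.15], [0.15, 0.90]]
bert_part = [[0.50, -0.40], [0.60, -0.50], [-0.50, 0.40],
             [-0.60, 0.50], [0.55, -0.45], [-0.40, 0.60]]
X = [t + b for t, b in zip(tfidf_part, bert_part)]  # concatenate both views

# Toy label matrix; HS_Gender is deliberately rarer than HS.
Y = {
    "HS":        [1, 1, 0, 0, 1, 0],
    "HS_Gender": [1, 0, 0, 0, 1, 0],
}

models = fit_multilabel(X, Y)
print(predict_multilabel(models, [0.88, 0.12, 0.52, -0.42]))
print(predict_multilabel(models, [0.12, 0.88, -0.50, 0.50]))
```

In this sketch the class frequency enters each score through the `math.log(prior)` term, and the means and variances of a rare label are estimated from only a handful of positive examples. Both effects are plausible contributors to the majority-class preference reported above, although the abstract itself attributes it only to limited training data.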
License
Copyright (c) 2023 Tri Pratiwi Handayani, Wahyudin Hasyim, Nursetia (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.