Performance Comparison of IndoBERT and IndoRoBERTa with SMOTE for Indonesian-Language Hate Speech Detection

Muhammad Mutawakkil Alallah; Indra Rosyidah

doi:10.62048/qjms.v3i2.167

Authors

Muhammad Mutawakkil Alallah, Universitas Islam Negeri Maulana Malik Ibrahim, Malang, Indonesia
Indra Rosyidah, Universitas Nurul Jadid, Probolinggo, Indonesia

Keywords:

hate speech, NLP, Transformer, IndoBERT, IndoRoBERTa

Abstract

The rapid growth of social media in Indonesia has increased digital interaction, but it has also given rise to hate speech that harms communication quality and social stability. This study compares two Transformer-based models, IndoBERT and IndoRoBERTa, on Indonesian-language hate speech classification and evaluates the effect of the SMOTE data balancing technique. The dataset consists of Indonesian-language Twitter data that was preprocessed and divided with an 80:20 stratified train-test split. Both models were trained by fine-tuning and evaluated with accuracy, precision, recall, and F1-score. IndoRoBERTa outperformed IndoBERT on all four metrics and made fewer classification errors. Applying SMOTE also improved the models' ability to detect minority classes, particularly in terms of recall. These findings indicate that combining Transformer-based models with data balancing improves both classification accuracy and class balance in hate speech detection, and they suggest that IndoRoBERTa with SMOTE has strong potential to support more accurate and adaptive automated content moderation for Indonesian-language social media platforms.
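As a rough illustration of the data-preparation steps named in the abstract, the Python sketch below performs an 80:20 stratified split and then applies SMOTE-style interpolation to the minority class. It is not the authors' implementation. The toy 2-D features standing in for text representations, the choice to oversample only the training portion, the neighbour count `k`, and the helper names `stratified_split` and `smote` are all assumptions made here for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_ratio)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy data (assumption): 40 majority samples (label 0), 10 minority (label 1).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 40 + [1] * 10

train_idx, test_idx = stratified_split(y, test_ratio=0.2)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
y_test = [y[i] for i in test_idx]

# Oversample the minority class in the training portion only (assumption).
minority = [x for x, label in zip(X_train, y_train) if label == 1]
n_new = y_train.count(0) - y_train.count(1)
synthetic = smote(minority, n_new)
X_balanced = X_train + synthetic
y_balanced = y_train + [1] * n_new
```

Oversampling only the training portion keeps synthetic points out of the test set, which is a common precaution. The source does not state at which stage, or on which feature representation, the authors applied SMOTE.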
