Hate speech detection in Chinese language using deep learning

Bibliographic Details
Main Author: Lim, Hazel Benin
Format: Final Year Project / Dissertation / Thesis
Published: 2024
Subjects:
Online Access: http://eprints.utar.edu.my/6955/
http://eprints.utar.edu.my/6955/1/fyp_CS_2024_LHB.pdf
Description
Summary: In recent years, the rise of cyberbullying and online sexism has had devastating consequences. Chinese social media platforms such as Sina Weibo and Zhihu have seen increased incidents of online harassment, in some cases leading to severe outcomes such as suicide. To combat this, the project aims to develop deep learning models that effectively classify sexist content on Chinese social media. Despite extensive research on English-language cyberbullying detection, little work has focused on Chinese contexts, particularly regarding sexism. This study uses the Sina Weibo Sexism Review (SWSR) dataset to evaluate several recurrent neural network (RNN) architectures: RNN, LSTM, GRU, Bi-LSTM, Bi-GRU, RNN-LSTM, and RNN-GRU. The models were tested on balanced and imbalanced datasets, yielding accuracy between 74.2% and 76.8%. Precision, recall, and F1 scores ranged from 0.6818 to 0.7447, indicating strong classification performance. Moreover, incorporating emoji embeddings and English-Chinese translation further improved model accuracy and sensitivity in identifying sexist content. This research contributes toward addressing online harassment in Chinese text and offers actionable insights for future cyberbullying detection systems.
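For readers unfamiliar with the metrics reported in the summary, the following is a minimal sketch of how accuracy, precision, recall, and F1 are typically computed for a binary sexist / not-sexist classifier. It is not taken from the thesis: the function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration, and the thesis may compute or average its scores differently.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration only (1 = sexist, 0 = not sexist);
# they are NOT drawn from the SWSR dataset or from the thesis results.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 0, 1, 0, 1, 0, 0, 1]
toy_scores = binary_metrics(toy_true, toy_pred)
```

Reporting precision and recall alongside accuracy matters most on the imbalanced setting mentioned in the summary, where a model can reach high accuracy simply by favouring the majority class.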