Profanity and hate speech detection

Profanity, often found in today’s online social media, has been used to detect online hate speech. The aims of this study were to investigate the profanity usage on Twitter by different groups of users, and to quantify the effectiveness of using profanity in detecting hate speech. Tweets from three...

Bibliographic Details
Main Authors:	Teh, Phoey Lee *, Cheng, Chi-Bin
Format:	Article
Language:	English
Published:	Tamkang University 2020
Subjects:	HM Sociology QA75 Electronic computers. Computer science
Online Access:	http://eprints.sunway.edu.my/1534/ http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx

_version_	1848802080243843072
author	Teh, Phoey Lee * Cheng, Chi-Bin
author_facet	Teh, Phoey Lee * Cheng, Chi-Bin
author_sort	Teh, Phoey Lee *
building	SU Institutional Repository
collection	Online Access
description	Profanity, which is common on today’s online social media, has been used to detect online hate speech. This study aimed to investigate profanity usage on Twitter by different groups of users and to quantify the effectiveness of using profanity to detect hate speech. Tweets from three English-speaking countries, Australia, Malaysia, and the United States, were collected for data analysis. Statistical hypothesis tests were performed to establish the difference in profanity usage among the three countries, and a probability estimation procedure based on Bayes' theorem was formulated to quantify the effectiveness of profanity-based methods in hate speech detection. Three deep learning methods, long short-term memory (LSTM), bidirectional LSTM (BLSTM), and bidirectional encoder representations from transformers (BERT), were further used to evaluate the effect of profanity screening on building a classification model. The experimental results show that the effectiveness of using profanity to detect hate speech is questionable. Nevertheless, they also show that for Australian tweets, where profanity is more strongly associated with hatred, profanity-based methods could be effective and profanity screening can address the class imbalance issue in hate speech detection. This is evidenced by the performance of the deep learning methods on the profanity-screened Australian data, which achieved a classification F1-score greater than 0.84.
first_indexed	2025-11-14T21:17:39Z
format	Article
id	sunway-1534
institution	Sunway University
institution_category	Local University
language	English
last_indexed	2025-11-14T21:17:39Z
publishDate	2020
publisher	Tamkang University
recordtype	eprints
repository_type	Digital Repository
spelling	sunway-15342021-07-30T08:18:19Z http://eprints.sunway.edu.my/1534/ Profanity and hate speech detection Teh, Phoey Lee * Cheng, Chi-Bin HM Sociology QA75 Electronic computers. Computer science Profanity, often found in today’s online social media, has been used to detect online hate speech. The aims of this study were to investigate the profanity usage on Twitter by different groups of users, and to quantify the effectiveness of using profanity in detecting hate speech. Tweets from three English-speaking countries, Australia, Malaysia, and the United States, were collected for data analysis. Statistical hypothesis tests were performed to justify the difference of profanity usage among the three countries, and a probability estimation procedure was formulated based on Bayes theorem to quantify the effectiveness of profanity-based methods in hate speech detection. Three deep learning methods, long short-term memory (LSTM), bidirectional LSTM (BLSTM), and bidirectional encoder representations from transformers (BERT) are further used to evaluate the effect of profanity screening on building classification model. Our experimental results show that the effectiveness of using profanity in detecting hate speech is questionable. Nevertheless, the results also show that for Australia tweets, where profanity is more associated with hatred, profanity-based methods in hate speech detection could be effective and profanity screening can address the class imbalance issue in hate speech detection. This is evidenced by the performances of using deep learning methods on the profanity screened data of Australia data, which achieved a classification f1-score greater than 0.84. Tamkang University 2020 Article PeerReviewed text en cc_by_nc_4 http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx Teh, Phoey Lee * and Cheng, Chi-Bin (2020) Profanity and hate speech detection. 
International Journal of Information and Management Sciences, 31 (3). pp. 227-246. ISSN 1017-1819 https://www.airitilibrary.com/Publication/alPublicationJournal?PublicationID=10171819&IssueID=202010270001
spellingShingle	HM Sociology QA75 Electronic computers. Computer science Teh, Phoey Lee * Cheng, Chi-Bin Profanity and hate speech detection
title	Profanity and hate speech detection
title_full	Profanity and hate speech detection
title_fullStr	Profanity and hate speech detection
title_full_unstemmed	Profanity and hate speech detection
title_short	Profanity and hate speech detection
title_sort	profanity and hate speech detection
topic	HM Sociology QA75 Electronic computers. Computer science
url	http://eprints.sunway.edu.my/1534/ http://eprints.sunway.edu.my/1534/ http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx
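
Illustrative note: the description field mentions a probability estimation procedure based on Bayes' theorem, but this record does not give its details. The sketch below is only a generic application of Bayes' theorem, P(hate | profanity) = P(profanity | hate) P(hate) / P(profanity), estimated from tweet counts; it is not the authors' procedure. The function name, parameter names, and the counts in the usage example are hypothetical and made up for illustration.

```python
def hate_given_profanity(n_hate_profane, n_hate, n_profane, n_total):
    """Estimate P(hate | profanity) from counts using Bayes' theorem.

    All arguments are hypothetical tweet counts:
      n_hate_profane: tweets that are both hateful and profane
      n_hate:         tweets labelled as hate speech
      n_profane:      tweets containing profanity
      n_total:        all tweets in the sample
    """
    if n_total <= 0 or not (0 < n_hate <= n_total) or not (0 < n_profane <= n_total):
        raise ValueError("counts must be positive and no larger than n_total")
    if not (0 <= n_hate_profane <= min(n_hate, n_profane)):
        raise ValueError("joint count must fit inside both marginal counts")

    p_hate = n_hate / n_total                       # prior P(hate)
    p_profane = n_profane / n_total                 # evidence P(profanity)
    p_profane_given_hate = n_hate_profane / n_hate  # likelihood P(profanity | hate)

    # Bayes' theorem: posterior = likelihood * prior / evidence
    return p_profane_given_hate * p_hate / p_profane


# Made-up counts, not taken from the study.
posterior = hate_given_profanity(n_hate_profane=50, n_hate=100,
                                 n_profane=200, n_total=1000)
print(f"P(hate | profanity) is about {posterior:.2f}")
```

A low value for this posterior would mean that most profane tweets are not hateful, which is one way to read the description's remark that the effectiveness of profanity-based detection is questionable.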

Similar Items