Profanity and hate speech detection
Profanity, often found in today’s online social media, has been used to detect online hate speech. The aims of this study were to investigate profanity usage on Twitter by different groups of users, and to quantify the effectiveness of using profanity in detecting hate speech. Tweets from three...
| Main Authors: | Teh, Phoey Lee; Cheng, Chi-Bin |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Tamkang University 2020 |
| Subjects: | HM Sociology; QA75 Electronic computers. Computer science |
| Online Access: | http://eprints.sunway.edu.my/1534/ http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx |
| author | Teh, Phoey Lee * Cheng, Chi-Bin |
| building | SU Institutional Repository |
| collection | Online Access |
| description | Profanity, often found in today’s online social media, has been used to detect online hate speech. The aims of this study were to investigate profanity usage on Twitter by different groups of users, and to quantify the effectiveness of using profanity in detecting hate speech. Tweets from three English-speaking countries, Australia, Malaysia, and the United States, were collected for data analysis. Statistical hypothesis tests were performed to test the difference in profanity usage among the three countries, and a probability estimation procedure was formulated based on Bayes' theorem to quantify the effectiveness of profanity-based methods in hate speech detection. Three deep learning methods, long short-term memory (LSTM), bidirectional LSTM (BLSTM), and bidirectional encoder representations from transformers (BERT), were further used to evaluate the effect of profanity screening on building a classification model. Our experimental results show that the effectiveness of using profanity in detecting hate speech is questionable. Nevertheless, the results also show that for Australian tweets, where profanity is more associated with hatred, profanity-based methods in hate speech detection could be effective, and profanity screening can address the class imbalance issue in hate speech detection. This is evidenced by the performance of deep learning methods on the profanity-screened Australian data, which achieved a classification F1-score greater than 0.84. |
| first_indexed | 2025-11-14T21:17:39Z |
| format | Article |
| id | sunway-1534 |
| institution | Sunway University |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T21:17:39Z |
| publishDate | 2020 |
| publisher | Tamkang University |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | sunway-1534 2021-07-30T08:18:19Z http://eprints.sunway.edu.my/1534/ Profanity and hate speech detection Teh, Phoey Lee * Cheng, Chi-Bin HM Sociology QA75 Electronic computers. Computer science Tamkang University 2020 Article PeerReviewed text en cc_by_nc_4 http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx Teh, Phoey Lee * and Cheng, Chi-Bin (2020) Profanity and hate speech detection. International Journal of Information and Management Sciences, 31 (3). pp. 227-246. ISSN 1017-1819 https://www.airitilibrary.com/Publication/alPublicationJournal?PublicationID=10171819&IssueID=202010270001 |
| title | Profanity and hate speech detection |
| topic | HM Sociology; QA75 Electronic computers. Computer science |
| url | http://eprints.sunway.edu.my/1534/ http://eprints.sunway.edu.my/1534/1/Teh%20Phoey%20Lee%20Preprint%20-%20Profanity%20and%20Hate%20Speech%20Detection.pdfx |