Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/29263
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jafri, FA | - |
dc.contributor.author | Rauniyar, K | - |
dc.contributor.author | Thapa, S | - |
dc.contributor.author | Siddiqui, MA | - |
dc.contributor.author | Khushi, M | - |
dc.contributor.author | Naseem, U | - |
dc.date.accessioned | 2024-06-23T21:54:32Z | - |
dc.date.available | 2024-06-23T21:54:32Z | - |
dc.date.issued | 2024-05-16 | - |
dc.identifier | ORCiD: Farhan Ahmad Jafri https://orcid.org/0000-0003-2494-2548 | - |
dc.identifier | ORCiD: Kritesh Rauniyar https://orcid.org/0000-0001-6806-6688 | - |
dc.identifier | ORCiD: Surendrabikram Thapa https://orcid.org/0000-0003-4119-8239 | - |
dc.identifier | ORCiD: Mohammad Aman Siddiqui https://orcid.org/0000-0003-2191-9721 | - |
dc.identifier | ORCiD: Matloob Khushi https://orcid.org/0000-0001-7792-2327 | - |
dc.identifier | ORCiD: Usman Naseem https://orcid.org/0000-0003-0191-7171 | - |
dc.identifier.citation | Jafri, F.A. et al. (2024) 'CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse', ACM Transactions on Asian and Low-Resource Language Information Processing, 0 (ahead of print), pp. 1 - 32. doi: 10.1145/3665245. | en_US |
dc.identifier.issn | 2375-4699 | - |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/29263 | - |
dc.description.abstract | In the ever-evolving landscape of online discourse and political dialogue, the rise of hate speech poses a significant challenge to maintaining a respectful and inclusive digital environment. The context becomes particularly complex when considering the Hindi language, a low-resource language with limited available data. To address this pressing concern, we introduce the CHUNAV dataset, a collection of 11,457 Hindi tweets gathered during assembly elections in various states. CHUNAV is purpose-built for hate speech categorization and the identification of target groups. The dataset is a valuable resource for exploring hate speech within the distinctive socio-political context of Indian elections. The tweets within CHUNAV have been meticulously categorized into “Hate” and “Non-Hate” labels, and further subdivided to pinpoint the specific targets of hate speech, including “Individual”, “Organization”, and “Community” labels (as shown in Figure 1). Furthermore, this paper presents multiple benchmark models for hate speech detection, along with an innovative ensemble and oversampling-based method. The paper also delves into the results of topic modeling, all aimed at effectively addressing hate speech and target identification in the Hindi language. This contribution seeks to advance the field of hate speech analysis and foster a safer and more inclusive online space within the distinctive realm of Indian Assembly Elections. | en_US |
dc.description.sponsorship | MK is supported by UKRI NERC grant NE/X000192/12. | en_US |
dc.format.extent | 1 - 32 | - |
dc.format.medium | Print-Electronic | - |
dc.language | English | - |
dc.language.iso | en_US | en_US |
dc.publisher | Association for Computing Machinery (ACM) | en_US |
dc.rights | © 2024 Copyright held by the owner/author(s). Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). The definitive Version of Record was published in ACM Transactions on Asian and Low-Resource Language Information Processing, https://doi.org/10.1145/3665245 (see: https://www.acm.org/publications/policies/copyright-policy). | - |
dc.rights.uri | https://www.acm.org/publications/policies/copyright-policy | - |
dc.subject | hate speech | en_US |
dc.subject | natural language processing | en_US |
dc.subject | Indian election | en_US |
dc.subject | topic modeling | en_US |
dc.subject | ensemble methods | en_US |
dc.title | CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse | en_US |
dc.type | Article | en_US |
dc.date.dateAccepted | 2024-05-09 | - |
dc.identifier.doi | https://doi.org/10.1145/3665245 | - |
dc.relation.isPartOf | ACM Transactions on Asian and Low-Resource Language Information Processing | - |
pubs.issue | ahead of print | - |
pubs.publication-status | Published online | - |
pubs.volume | 0 | - |
dc.identifier.eissn | 2375-4702 | - |
dc.rights.holder | The owner/author(s) | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | © 2024 Copyright held by the owner/author(s). Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). The definitive Version of Record was published in ACM Transactions on Asian and Low-Resource Language Information Processing, https://doi.org/10.1145/3665245 (see: https://www.acm.org/publications/policies/copyright-policy). | 2.38 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.