Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/29262
Full metadata record
DC Field | Value | Language
dc.contributor.author | Naseem, U | -
dc.contributor.author | Kim, J | -
dc.contributor.author | Khushi, M | -
dc.contributor.author | Dunn, AG | -
dc.date.accessioned | 2024-06-23T21:32:07Z | -
dc.date.available | 2024-06-23T21:32:07Z | -
dc.date.issued | 2024-03-04 | -
dc.identifier | ORCiD: Usman Naseem https://orcid.org/0000-0003-0191-7171 | -
dc.identifier | ORCiD: Jinmaan Kim https://orcid.org/0000-0001-5960-1060 | -
dc.identifier | ORCiD: Matloob Khushi https://orcid.org/0000-0001-7792-2327 | -
dc.identifier | ORCiD: Adam G. Dunn https://orcid.org/0000-0002-1720-8209 | -
dc.identifier.citation | Naseem, U. et al. (2024) 'A Linguistic Grounding-Infused Contrastive Learning Approach for Health Mention Classification on Social Media', WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Merida, Mexico, 4-8 March, pp. 529-537. doi: 10.1145/3616855.3635763. | en_US
dc.identifier.isbn | 979-8-4007-0371-3 | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/29262 | -
dc.description.abstract | Social media users use disease and symptom words in different ways, including describing their personal health experiences figuratively or in other general discussions. The health mention classification (HMC) task aims to separate how people use terms, which is important in public health applications. Existing HMC studies address this problem using pretrained language models (PLMs). However, the remaining gaps in the area include the need for linguistic grounding, the requirement for large volumes of labelled data, and that solutions are often only tested on Twitter or Reddit, which provides limited evidence of the transportability of models. To address these gaps, we propose a novel method that uses a transformer-based PLM to obtain a contextual representation of target (disease or symptom) terms coupled with a contrastive loss to establish a larger gap between target terms' literal and figurative uses using linguistic theories. We introduce the use of a simple and effective approach for harvesting candidate instances from the broad corpus and generalising the proposed method using self-training to address the label scarcity challenge. Our experiments on publicly available health-mention datasets from Twitter (HMC2019) and Reddit (RHMD) demonstrate that our method outperforms the state-of-the-art HMC methods on both datasets for the HMC task. We further analyse the transferability and generalisability of our method and conclude with a discussion on the empirical and ethical considerations of our study. | en_US
dc.format.extent | 529 - 537 | -
dc.format.medium | Electronic | -
dc.language.iso | en_US | en_US
dc.rights | © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee (see: https://www.acm.org/publications/policies/copyright-policy). Request permissions from permissions@acm.org. The definitive Version of Record was published in WSDM’24, March 4–8, 2024, Merida, Mexico, https://doi.org/10.1145/3616855.3635763. | -
dc.rights.uri | https://www.acm.org/publications/policies/copyright-policy | -
dc.subject | health mention classification | en_US
dc.subject | public health surveillance | en_US
dc.subject | contrastive learning | en_US
dc.subject | social media | en_US
dc.title | A Linguistic Grounding-Infused Contrastive Learning Approach for Health Mention Classification on Social Media | en_US
dc.type | Conference Paper | en_US
dc.date.dateAccepted | 2023-10-19 | -
dc.identifier.doi | https://doi.org/10.1145/3616855.3635763 | -
dc.relation.isPartOf | WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining | -
pubs.publication-status | Published | -
dc.rights.holder | The owner/author(s) | -
Appears in Collections: Dept of Computer Science Research Papers

Files in This Item:
File | Description | Size | Format
FullText.pdf | © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. Full statement as given under dc.rights in the metadata record. | 1.38 MB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.