Full metadata record
DC Field | Value | Language
dc.contributor.author | Omran, Thuraya Mohamed Maki | -
dc.description | This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London | en_US
dc.description.abstract | Sentiment analysis is a crucial natural language processing (NLP) task that analyzes users' emotions and opinions towards entities such as events, services, or products. Arabic NLP faces numerous challenges, including: (1) the scarcity of resources, especially in modern standard Arabic and Arabic dialects, particularly the Bahraini one; (2) the lack of multilingual deep learning models; and (3) insufficient transfer learning studies on Arabic dialects in general and Bahraini dialects specifically. This research aims to create a balanced dataset of Bahraini dialects that covers product reviews by translating English Amazon product reviews to modern standard Arabic and then converting them to Bahraini dialects. Another aim of this research is to provide a multilingual deep learning long short-term memory (LSTM) model to analyze the parallel dataset of English, modern standard Arabic, and Bahraini dialects, which differ in linguistic properties. Many experiments were conducted using a train-validate-test split and k-fold cross-validation to evaluate the model's performance using accuracy, F1 score, and AUC metrics. When an augmentation technique was used, the model's average accuracy across all datasets ranged from 96.72% to 97.04%, its F1 score from 97.91% to 97.93%, and its AUC from 98.46% to 98.7%. The LSTM model was incorporated into a stacking ensemble learning process that includes other LSTM architectures as base learners and a decision tree (DT) as a meta-learner. The ensemble obtained promising results, with mean accuracies of 99.52%, 99.25%, and 98.52% on the English, modern standard Arabic (MSA), and Bahraini dialects (BDs) datasets, respectively. Moreover, the LSTM model was utilized as a pre-trained model in the transfer learning process to exploit the knowledge gained from analyzing the product reviews in Bahraini dialects to perform another sentiment analysis task on a small dataset of movie comments in the same dialects. The pre-trained model achieved 96.97% accuracy, a 96.65% F1 score, and a 97.94% AUC. | en_US
dc.publisher | Brunel University London | en_US
dc.subject | Natural language processing | en_US
dc.subject | Resource scarcity | en_US
dc.subject | Parallel dataset | en_US
dc.subject | Transfer learning | en_US
dc.subject | LSTM deep learning model | en_US
dc.title | Multilingual sentiment analysis of Arabic, Bahraini dialects and English | en_US
Appears in Collections: Computer Science
Dept of Computer Science Theses

Files in This Item:
File | Description | Size | Format
FulltextThesis.pdf | | 5.26 MB | Adobe PDF

Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.