Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/28740
Full metadata record
dc.contributor.author: Shakhovska, N
dc.contributor.author: Zherebetskyi, O
dc.contributor.author: Lupenko, S
dc.date.accessioned: 2024-04-10T14:14:07Z
dc.date.available: 2024-04-10T14:14:07Z
dc.date.issued: 2024-02-26
dc.identifier: ORCiD: Nataliya Shakhovska https://orcid.org/0000-0002-6875-8534
dc.identifier: ORCiD: Serhii Lupenko https://orcid.org/0000-0002-6559-0721
dc.identifier: 1920
dc.identifier.citation: Shakhovska, N., Zherebetskyi, O. and Lupenko, S. (2024) 'Model for Determining the Psycho-Emotional State of a Person Based on Multimodal Data Analysis', Applied Sciences, 14 (5), 1920, pp. 1-24. doi: 10.3390/app14051920.
dc.identifier.uri: https://bura.brunel.ac.uk/handle/2438/28740
dc.description: Data Availability Statement: The data supporting this study's findings are openly available at https://doi.org/10.6084/m9.figshare.23596362.v1 (accessed on 1 September 2022). The following datasets were used to evaluate the performance of the different models (all accessed on 1 September 2022):
- FER: fer2013 (https://www.kaggle.com/deadskull7/fer2013) and CK+48 five emotions (https://www.kaggle.com/gauravsharma99/ck48-5-emotions);
- SER: RAVDESS Emotional speech audio (https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio);
- TER: Text-Emotion-detection (https://www.kaggle.com/dataset/f10c38f8f356a43b344ca82476b6b32b5d31b99af19276ba1f7846004c0851f2);
- Datasets from the Internet inside the project: https://drive.google.com/drive/folders/1ZV3ceCjNND7xcUxbsJb57aitTpUbcYa9?usp=sharing;
- Videos for tests from YouTube: (1) Biden Delivers Remarks On Inflation_NBC News (https://www.youtube.com/watch?v=ckCOF719atE); (2) Boris Johnson_Ukraine will win war and 'be free' (https://www.youtube.com/watch?v=WPM8Pvgkz7Y); (3) Father's final words to his dying son! (https://www.youtube.com/watch?v=C3hABRHmQoo); (4) Minecraft Warden Update is a NIGHTMARE! (https://www.youtube.com/watch?v=2osdz9Z5JKY);
- Video for Live Test: https://drive.google.com/drive/folders/1wAR2CdlGIEtOSjKv7T9e-gQhBHIAiLLM?usp=sharing.
dc.description.abstract: The paper aims to develop an information system for human emotion recognition in streaming data obtained from a PC or smartphone camera, using different methods of modality merging (image, sound and text). The objects of research are facial expressions, the emotional color of the tone of a conversation and the text produced by a person. The paper proposes different neural network structures for emotion recognition based on unimodal flows, as well as models for merging the multimodal data. The analysis determined that the best classification accuracy is obtained by systems that fuse the data after processing each channel separately and extracting individual characteristics. The final evaluation of the model on "live" data from a camera and microphone, or from a screen recording or broadcast, made clear that the quality of the results depends strongly on the quality of data preparation and labeling, since the network can only be as good as the data it is trained on. The neural network that combines the modalities on the penultimate layer achieves a psycho-emotional state recognition accuracy of 0.90. The spatial distribution of emotions was also analyzed for each data modality. The model with late fusion of the multimodal data demonstrated the best recognition accuracy; a minimal illustrative sketch of this fusion scheme is given after the metadata record below.
dc.description.sponsorship: This research was funded by the National Research Foundation of Ukraine under project number 2021.01/0103 and by British Academy fellowship number RaR\100727.
dc.format.extent: 1 - 24
dc.format.medium: Electronic
dc.language.iso: en_US
dc.publisher: MDPI
dc.rights: Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: multimodal data
dc.subject: late fusion
dc.subject: convolution neural network
dc.subject: emotional state
dc.subject: multi-modal emotion recognition
dc.title: Model for Determining the Psycho-Emotional State of a Person Based on Multimodal Data Analysis
dc.type: Article
pubs.issue: 5
pubs.volume: 14
dc.identifier.eissn: 2076-3417
dc.rights.license: https://creativecommons.org/licenses/by/4.0/legalcode.en
dc.rights.license: The authors
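
As context for the penultimate-layer (late) fusion scheme described in the abstract, here is a minimal sketch in PyTorch: each modality (face image, speech audio, text) is encoded by its own network, and the resulting feature vectors are concatenated on the penultimate layer before a shared classification head. All layer sizes, the linear stand-ins for the unimodal encoders, and the five-emotion output are illustrative assumptions, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Illustrative late-fusion model: per-modality heads, concatenation
    on the penultimate layer, then a shared classifier. All dimensions
    are assumptions for this sketch, not values from the paper."""

    def __init__(self, img_dim=128, audio_dim=64, text_dim=64, n_emotions=5):
        super().__init__()
        # Stand-ins for the unimodal encoders (CNNs etc. in the paper).
        self.img_head = nn.Linear(img_dim, 32)
        self.audio_head = nn.Linear(audio_dim, 32)
        self.text_head = nn.Linear(text_dim, 32)
        # Late fusion: concatenated penultimate features feed one classifier.
        self.classifier = nn.Linear(32 * 3, n_emotions)

    def forward(self, img_feat, audio_feat, text_feat):
        fused = torch.cat([
            torch.relu(self.img_head(img_feat)),
            torch.relu(self.audio_head(audio_feat)),
            torch.relu(self.text_head(text_feat)),
        ], dim=-1)
        return self.classifier(fused)

# Usage with random tensors standing in for real unimodal encoder outputs.
model = LateFusionClassifier()
logits = model(torch.randn(1, 128), torch.randn(1, 64), torch.randn(1, 64))
print(logits.shape)  # torch.Size([1, 5])
```

A practical property of this arrangement is that each unimodal encoder can be trained, evaluated or replaced independently before fusion, which matches the paper's comparison of unimodal flows against the combined model.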
Appears in Collections: Dept of Civil and Environmental Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: Copyright © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Size: 9.17 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.