Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/26495
Title: A Study on the Impact of Integrating Reinforcement Learning for Channel Prediction and Power Allocation Scheme in MISO-NOMA System
Authors: Gaballa, M
Abbod, M
Aldallal, A
Keywords: RL;Q-learning;MISO-NOMA;KKT conditions
Issue Date: 26-Jan-2023
Publisher: MDPI
Citation: Gaballa, M., Abbod, M. and Aldallal, A. (2023) 'A Study on the Impact of Integrating Reinforcement Learning for Channel Prediction and Power Allocation Scheme in MISO-NOMA System', Sensors, 2023, 23 (3), 1383, pp. 1 - 28. doi: 10.3390/s23031383.
Abstract: In this study, the influence of adopting Reinforcement Learning (RL) to predict the channel parameters for user devices in a Power-Domain Multiple-Input Single-Output Non-Orthogonal Multiple Access (MISO-NOMA) system is inspected. In the channel-prediction-based RL approach, a Q-learning algorithm is developed and incorporated into the NOMA system so that the resulting Q-model can be employed to predict the channel coefficients for every user device. The purpose of adopting the developed Q-learning procedure is to maximize the received downlink sum rate and decrease the estimation loss. To this end, the Q-algorithm is initialized using different channel statistics and then updated through interaction with the environment in order to approximate the channel coefficients for each device. The predicted parameters are utilized at the receiver side to recover the desired data. Furthermore, by maximizing the sum rate of the examined user devices, the power factors for each user can be deduced analytically, so that the optimal power factor is allocated to every user device in the system. In addition, this work inspects how the channel prediction based on the developed Q-learning model and the power allocation policy can both be incorporated for the purpose of multiuser recognition in the examined MISO-NOMA system. Simulation results, based on several performance metrics, demonstrate that the developed Q-learning algorithm is a competitive algorithm for channel estimation when compared against different benchmark schemes such as deep-learning-based long short-term memory (LSTM), the RL-based actor-critic algorithm, the RL-based state-action-reward-state-action (SARSA) algorithm, and a standard channel estimation scheme based on the minimum mean square error (MMSE) procedure.
Description: Data Availability Statement: Not applicable.
URI: https://bura.brunel.ac.uk/handle/2438/26495
DOI: https://doi.org/10.3390/s23031383
Other Identifiers: ORCID iDs: Mohamed Gaballa https://orcid.org/0000-0001-9500-7333; Maysam Abbod https://orcid.org/0000-0002-8515-7933; Ammar Aldallal https://orcid.org/0000-0001-7811-8111.
Appears in Collections:Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: Copyright © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Size: 3.58 MB
Format: Adobe PDF
View/Open


This item is licensed under a Creative Commons License.