Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/25934
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhang, C | - |
dc.contributor.author | Liang, S | - |
dc.contributor.author | He, C | - |
dc.contributor.author | Wang, K | - |
dc.date.accessioned | 2023-02-08T09:34:56Z | - |
dc.date.available | 2023-02-08T09:34:56Z | - |
dc.date.issued | 2021-02-16 | - |
dc.identifier | ORCID iD: Kezhi Wang https://orcid.org/0000-0001-8602-0800 | - |
dc.identifier.citation | Zhang, C. et al. (2022) 'Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning', Journal of Communications and Information Networks, 7 (2), pp. 192 - 201. doi: 10.23919/JCIN.2022.9815202 | en_US |
dc.identifier.issn | 2096-1081 | - |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/25934 | - |
dc.description.abstract | In this paper, a multi-unmanned aerial vehicle (multi-UAV) and multi-user system is studied, where UAVs serve as aerial base stations (BSs) for ground users in the same frequency band without knowing the users' locations and channel parameters. We aim to maximize the total throughput for all users and meet the fairness requirement by optimizing the UAVs' trajectories and transmission power in a centralized way. This problem is non-convex and very difficult to solve, as the locations of the users are unknown to the UAVs. We propose a deep reinforcement learning (DRL)-based solution, i.e., soft actor-critic (SAC), to address it by modeling the problem as a Markov decision process (MDP). We carefully design a reward function that combines sparse and non-sparse rewards to achieve a balance between exploitation and exploration. The simulation results show that the proposed SAC achieves very good performance in both training and testing. | en_US |
dc.description.sponsorship | National Natural Science Foundation of China under Grant 62101161; Shenzhen Basic Research Program under Grant 20200811192821001 and Grant JCYJ20190808122409660; Guangdong Basic Research Program under Grant 2019A1515110358, Grant 2021A1515012097, Grant 2020ZDZX1037, Grant 2020ZDZX1021; open research fund of National Mobile Communications Research Laboratory, Southeast University under Grant 2021D16 and Grant 2022D02. | en_US |
dc.format.extent | 192 - 201 | - |
dc.language | English | - |
dc.language.iso | en_US | en_US |
dc.publisher | China InfoCom Media Group | en_US |
dc.subject | deep reinforcement learning | en_US |
dc.subject | mobile edge computing | en_US |
dc.subject | unmanned aerial vehicle (UAV) | en_US |
dc.subject | trajectory control | en_US |
dc.subject | user association | en_US |
dc.title | Multi-UAV Trajectory Design and Power Control Based on Deep Reinforcement Learning | en_US |
dc.type | Article | en_US |
dc.identifier.doi | https://doi.org/10.23919/JCIN.2022.9815202 | - |
dc.relation.isPartOf | Journal of Communications and Information Networks | - |
pubs.issue | 2 | - |
pubs.publication-status | Published | - |
pubs.volume | 7 | - |
dc.identifier.eissn | 2509-3312 | - |
dc.rights.holder | China InfoCom Media Group | - |
Appears in Collections: | Dept of Computer Science Research Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FullText.pdf | | 1.15 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.
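The abstract describes a reward function that combines a sparse reward (granted only when the task objective is met) with a non-sparse shaping term (a dense signal available at every step). A minimal sketch of that general idea is below; the function name, weighting, and threshold are illustrative assumptions, not the authors' actual implementation.

```python
def combined_reward(throughput, target_throughput, prev_dist, dist,
                    shaping_weight=0.1, bonus=1.0):
    """Illustrative sparse + dense reward for a UAV serving ground users.

    dense term : rewards reducing the UAV-user distance each step,
                 giving the agent a learning signal even before the
                 throughput goal is reached (helps exploration).
    sparse term: a fixed bonus only when the throughput target is met
                 (anchors the agent to the true objective).
    """
    dense = shaping_weight * (prev_dist - dist)     # positive if UAV moved closer
    sparse = bonus if throughput >= target_throughput else 0.0
    return dense + sparse
```

A step that closes 2 m of distance and meets the target would score `0.1 * 2 + 1.0 = 1.2`, while a step that only closes distance still earns the small dense term, so the agent is never left without feedback.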