Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/29713
Title: Meta-Transfer Learning-Based Handover Optimization for V2N Communication
Authors: Sohaib, RM
Onireti, O
Tan, K
Sambo, Y
Swash, R
Imran, M
Keywords: V2N;DRL;HO;generalization;meta-learning
Issue Date: 22-Jul-2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: Sohaib, R.M. et al. (2024) 'Meta-Transfer Learning-Based Handover Optimization for V2N Communication', IEEE Transactions on Vehicular Technology, 0 (early access), pp. 1 - 15. doi: 10.1109/TVT.2024.3431875.
Abstract: The rapid growth of vehicle-to-network (V2N) communication demands efficient handover decision-making strategies to ensure seamless connectivity and maximum throughput. However, the dynamic nature of V2N scenarios poses challenges for traditional handover algorithms. To address this, we propose a deep reinforcement learning (DRL)-based approach for optimizing handover decisions in dynamic V2N communication. We leverage the advantages of transfer learning and meta-learning to generalize across time-evolving source and target tasks. In this paper, we derive generalization bounds for our DRL-based approach, specifically focusing on optimizing the handover process in V2N communication. The derived bounds provide theoretical guarantees on the expected generalization error of the learned handover time function for the target task. To implement our approach, we propose a meta-learning framework, Adapt-to-evolve (A2E), based on double deep Q-networks (DDQN) with Thompson sampling. The A2E framework enables quick adaptation to new tasks by minimizing the error upper bounds with divergence measures. Through transfer learning, the meta-learner dynamically evolves its handover decision-making strategy to maximize average throughput while reducing the number of handovers. We use Thompson sampling with the DDQN to balance exploration and exploitation. The DDQN with Thompson sampling approach, which ensures efficient and effective learning, forms the foundation for optimizing the meta-training process, resulting in an improvement in cumulated packet loss of 48.02% in highway settings and 46.32% in rural settings.
URI: https://bura.brunel.ac.uk/handle/2438/29713
DOI: https://doi.org/10.1109/TVT.2024.3431875
ISSN: 0018-9545
Other Identifiers: ORCiD: Rafiq Swash https://orcid.org/0000-0003-4242-7478
Appears in Collections: Brunel Design School Research Papers

Files in This Item:
File: FullText.pdf
Size: 5.83 MB
Format: Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.