Multi-scale pedestrian intent prediction using 3D joint information as spatio-temporal representation

Ahmed, S; Bazi, AA; Saha, C; Rajbhandari, S; Huda, MN

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/27446

Title:	Multi-scale pedestrian intent prediction using 3D joint information as spatio-temporal representation
Authors:	Ahmed, S; Bazi, AA; Saha, C; Rajbhandari, S; Huda, MN
Keywords:	LSTM;intent prediction;pose estimation;tracking;pedestrian detection
Issue Date:	13-Apr-2023
Publisher:	Elsevier
Citation:	Ahmed, S. et al. (2023) 'Multi-scale pedestrian intent prediction using 3D joint information as spatio-temporal representation', Expert Systems with Applications, 225 (1 September 2023), 120077, pp. 1 - 11. doi: 10.1016/j.eswa.2023.120077.
Abstract:	Copyright © 2023 The Author(s). The use of Autonomous Vehicles on public roads is rising. With road traffic accidents predicted to increase over the coming years, these vehicles must be capable of operating safely in the public domain. The field of pedestrian detection has advanced significantly in the last decade, achieving high accuracy, with some techniques reaching near-human-level accuracy. However, further work is required for pedestrian intent prediction to reach human-level performance. One of the challenges facing current pedestrian intent predictors is the varying scale of pedestrians, particularly smaller pedestrians. Smaller pedestrians can blend into the background, making them difficult to detect, track, or apply pose estimation techniques to. Therefore, in this work, we present a novel intent prediction approach for multi-scale pedestrians using 2D pose estimation and a Long Short-Term Memory (LSTM) architecture. The pose estimator predicts keypoints for the pedestrian across the video frames. Accumulating these keypoints over the frames generates spatio-temporal data, which is fed to the LSTM to classify the crossing behaviour of the pedestrians. We evaluate the performance of the proposed technique on the popular Joint Attention in Autonomous Driving (JAAD) dataset and the newer, larger-scale Pedestrian Intention Estimation (PIE) dataset. Using data generalisation techniques, we show that the proposed technique outperforms state-of-the-art techniques by up to 7%, reaching up to 94% accuracy while maintaining a comparable run-time of 6.1 ms.
Description:	Data availability: Only publicly available data were used.
URI:	https://bura.brunel.ac.uk/handle/2438/27446
DOI:	https://doi.org/10.1016/j.eswa.2023.120077
ISSN:	0957-4174
Other Identifiers:	ORCID iD: Sarfraz Ahmed https://orcid.org/0000-0002-9583-6710; ORCID iD: Ammar Al Bazi https://orcid.org/0000-0002-5057-4171; ORCID iD: Chitta Saha https://orcid.org/0000-0001-6831-846X; ORCID iD: Sujan Rajbhandari https://orcid.org/0000-0001-8742-118X; ORCID iD: M. Nazmul Huda https://orcid.org/0000-0002-5376-881X; 120077
Appears in Collections:	Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:

File	Description	Size	Format
FullText.pdf	Copyright © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/).	1.64 MB	Adobe PDF

This item is licensed under a Creative Commons License