Real-Time Object Detection and Distance Measurement Enhanced with Semantic 3D Depth Sensing Using Camera–LiDAR Fusion

Yildiz, AS; Meng, H; Swash, MR

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/31314

Title:	Real-Time Object Detection and Distance Measurement Enhanced with Semantic 3D Depth Sensing Using Camera–LiDAR Fusion
Authors:	Yildiz, AS; Meng, H; Swash, MR
Keywords:	computer vision;LiDAR;light detection and ranging;object detection;multi-sensor fusion;distance measurement;real-time depth extraction;semantic depth sensing;autonomous vehicles
Issue Date:	15-May-2025
Publisher:	MDPI
Citation:	Yildiz, A.S. et al. (2025) 'Real-Time Object Detection and Distance Measurement Enhanced with Semantic 3D Depth Sensing Using Camera–LiDAR Fusion', Applied Sciences, 15 (10), 5543, pp. 1–36. doi: 10.3390/app15105543.
Abstract:	Camera and LiDAR data fusion has been a popular research area, especially in the field of autonomous vehicles. This study evaluates the efficiency and accuracy of different depth point extraction methods, including Point-by-Point (PbyP), Complete Region Depth Extraction (CoRDE), Central Region Depth Extraction (CeRDE), and Grid Central Region Depth Extraction (GCRDE), across object categories such as person, bicycle, car, bus, and truck, and occlusion levels ranging from 0 to 3. The approaches are assessed based on extraction time, accuracy, and root mean squared error (RMSE). Bounding box-based methods, such as PbyP and CoRDE, consistently show slower extraction times compared to segmentation mask methods, with CeRDE being the most efficient in terms of computational speed. However, segmentation mask methods, particularly CeRDE and GCRDE, offer superior accuracy, especially for complex objects like trucks and cars, where bounding box methods struggle, particularly at higher occlusion levels. In terms of RMSE, segmentation mask methods consistently outperform bounding box methods, providing more precise depth estimations, particularly for larger and more occluded objects. Overall, segmentation mask methods are preferred for applications where accuracy is critical, despite their slower processing speed, while bounding box methods are suitable for real-time applications requiring faster depth extraction. GeRDE offers a balance between speed and accuracy, making it ideal for tasks needing both efficiency and precision.
Description:	Data Availability Statement: The data used in this study are from the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) Vision Benchmark 2D Object Detection Evaluation 2012 dataset. The dataset can be accessed publicly at https://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d (accessed on 10 January 2022).
URI:	https://bura.brunel.ac.uk/handle/2438/31314
DOI:	https://doi.org/10.3390/app15105543
Other Identifiers:	ORCiD: Ahmet Serhat Yildiz https://orcid.org/0000-0002-2957-7394; ORCiD: Hongying Meng https://orcid.org/0000-0002-8836-1382; ORCiD: Mohammad Rafiq Swash https://orcid.org/0000-0003-4242-7478; Article number: 5543
Appears in Collections:	Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:

File	Description	Size	Format
FullText.pdf	Copyright © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).	25.02 MB	Adobe PDF

This item is licensed under a Creative Commons License