Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/32958
| Title: | Language-guided zero-shot segmentation with multi-angle reprojection for point cloud analysis |
| Authors: | Ayodeji, A; Teyeb, A; Abbas, MAA; Bass, P; Bass, E; Bandara, PD; Jayasinghe, UK; Griffiths, J; El Masri, E |
| Keywords: | virtual reality;zero-shot 3D object detection;point cloud labelling;foundation models;language-guided segmentation;Grounding DINO;segment anything model (SAM);semantic labelling;confidence-weighted fusion;augmented reality |
| Issue Date: | 10-Sep-2025 |
| Publisher: | Elsevier |
| Citation: | Ayodeji, A. et al. (2025) 'Language-guided zero-shot segmentation with multi-angle reprojection for point cloud analysis', Measurement: Digitalization, 4, 100016, pp. 1–14. doi: 10.1016/j.meadig.2025.100016. |
| Abstract: | Virtual Reality applications increasingly demand accurate 3D representations of real-world environments. While LiDAR point clouds capture physical spaces with high fidelity, they typically lack semantic labels, limiting their direct use for tasks such as object recognition, interaction modeling, and automation in immersive environments or digital twin systems. We present a LAnguage-guided zero-shot 3D SEgmentation and Reprojection tool (LASER), an engineered zero-shot segmentation tool that extends the state of the art by introducing language-guided 3D object detection for enhanced usability and accuracy. Unlike its predecessors, LASER combines an ensemble of GroundingDINO and the Segment Anything Model as its backbone for processing natural language queries and user-specified object categories, automated multi-view orthophoto generation with dynamic angles for optimal view selection, a confidence-weighted fusion algorithm for efficient 2D-3D reprojection, and semantically labelled mesh output. The LASER pipeline begins by collecting point cloud data with LiDAR sensors and filtering the cloud into ground and non-ground components, which improves segmentation efficiency. It then generates multi-angle 2D orthophotos and perspective views, incorporating a user-guided angle selection module to optimise scene coverage. GroundingDINO then detects objects from textual descriptions, and the Segment Anything Model refines these detections into segmentation masks. The core innovation of LASER lies in its confidence-weighted reprojection algorithm, which fuses multiple 2D segmentation results back into 3D space, ensuring higher segmentation accuracy and spatial consistency. The resulting semantically labelled assets can be exported in standard formats or iteratively refined through viewpoint adjustments or text prompt modifications.
Our application of LASER to real-world 3D scans of construction sites demonstrates its effectiveness in delivering high segmentation precision, enhanced user interactivity, and seamless integration into virtual reality workflows. To comprehensively evaluate the proposed tool on diverse point cloud scans, we also present performance on four test cases drawn from two datasets (3DSES and Toronto3D), covering both indoor and outdoor scenes. The results show consistent performance across scans. Finally, a feature-based comparison with state-of-the-art approaches shows that LASER is an optimised tool for enriching static, open-world 3D scans with semantic labels, offering an alternative to existing state-of-the-art methods for niche applications. |
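The confidence-weighted fusion step described in the abstract can be sketched as follows. This is a minimal illustration only: the function name, array shapes, and additive-voting scheme are assumptions for exposition, not the authors' implementation. Each 3D point accumulates, across views, the detection confidence of the 2D mask label it projects into, and the fused label is the class with the highest total confidence.

```python
import numpy as np

def fuse_labels(view_labels, view_confs, n_classes):
    """Hypothetical confidence-weighted 2D-to-3D label fusion.

    view_labels: (n_views, n_points) int array of per-view class labels,
                 with -1 where a point is not visible in that view.
    view_confs:  (n_views, n_points) float detection confidences in [0, 1].
    Returns a (n_points,) array of fused labels; -1 where no view saw the point.
    """
    n_views, n_points = view_labels.shape
    scores = np.zeros((n_points, n_classes))
    for v in range(n_views):
        # Accumulate each view's confidence as a weighted vote per point.
        idx = np.nonzero(view_labels[v] >= 0)[0]
        scores[idx, view_labels[v, idx]] += view_confs[v, idx]
    fused = scores.argmax(axis=1)
    fused[scores.sum(axis=1) == 0] = -1  # point never observed in any view
    return fused
```

Under this scheme, a single high-confidence detection can outvote several low-confidence ones, which is one plausible way to obtain the spatial consistency the abstract attributes to the fusion stage.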
| Description: | Highlights:
• Language-guided zero-shot 3D segmentation and object extraction.
• User-guided multi-angle orthophoto generation for improved scene coverage.
• Text-prompted GroundingDINO queries for intuitive object detection.
• Confidence-weighted fusion ensuring accurate and consistent 3D reprojection.
• Flexible viewpoint selection and iterative refinement for enhanced usability.
Data availability: Data will be made available on request. |
| URI: | https://bura.brunel.ac.uk/handle/2438/32958 |
| DOI: | https://doi.org/10.1016/j.meadig.2025.100016 |
| Other Identifiers: | ORCiD: Abiodun Ayodeji https://orcid.org/0000-0003-3257-7616; ORCiD: Ahmed Teyeb https://orcid.org/0000-0003-0300-1845; ORCiD: Udari K. Jayasinghe https://orcid.org/0000-0002-8702-2442; ORCiD: Evelyne El Masri https://orcid.org/0000-0003-3241-5844 |
| Appears in Collections: | Brunel Innovation Centre |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2025 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). | 9.63 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License