Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/29168
Title: Ensemble learning for optimal cluster estimation
Authors: Odebode, Afees Adegoke
Advisors: Swift, S
Tucker, A
Keywords: Voting Mechanisms;Optimal Partitioning;Aggregation strategies;Approximate Optimization;Combined Clustering
Issue Date: 2024
Publisher: Brunel University London
Abstract: This thesis addresses the importance of understanding the underlying structure of high-dimensional datasets through clustering, given the vast amount of unlabelled content available on the internet and from other electronic sources. While clustering ensembles have been proposed in the past, the potential of heuristic search-based ensembles remains relatively unexplored. The thesis presents a novel computational method that combines heuristic search and clustering ensembles, focusing on two crucial issues. Firstly, it establishes a representative solution by effectively subsetting ensembles. The thesis introduces a Gray code implementation that maximises the spread across subsets while minimising differences between them. Secondly, the exhaustive search for the best solution from the representative pool becomes computationally expensive as the dimension and volume increase. An alternative approach based on heuristic search is suggested. This approach evaluates subsets incrementally, similar to the Gray code implementation, resulting in a significant speed gain. However, random mutation hill climbing (RMHC) struggles to find a suitable solution without guidance, particularly in larger search spaces. The thesis presents an innovative seeding technique that leverages Fiedler vector decomposition and a minimum spanning tree (MST) to address this challenge. This technique significantly improves both the quality of solutions and computational efficiency. The proposed methodology is extensively evaluated using simulated and benchmark clustering datasets, employing theoretical and empirical examples. The results demonstrate the high effectiveness of the proposed approach. The key contributions of this thesis include the introduction of Gray code subsetting of ensembles, the incorporation of heuristic-search-based techniques into clustering ensembles, and the novel improvement in search space convergence through effective seeding.
Description: This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London
URI: http://bura.brunel.ac.uk/handle/2438/29168
Appears in Collections: Computer Science
Dept of Computer Science Theses

Files in This Item:
File: FulltextThesis.pdf
Size: 2.64 MB
Format: Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.