Ensemble learning for optimal cluster estimation

Odebode, Afees Adegoke

Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/29168

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Swift, S	-
dc.contributor.advisor	Tucker, A	-
dc.contributor.author	Odebode, Afees Adegoke	-
dc.date.accessioned	2024-06-13T12:53:52Z	-
dc.date.available	2024-06-13T12:53:52Z	-
dc.date.issued	2024	-
dc.identifier.uri	http://bura.brunel.ac.uk/handle/2438/29168	-
dc.description	This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London	en_US
dc.description.abstract	This thesis addresses the importance of understanding the underlying structure of high-dimensional datasets through clustering, considering the vast amount of unlabelled available content on the internet and electronic sources. While clustering ensembles have been proposed in the past, the potential of heuristic search-based ensembles has been relatively unexplored. The thesis presents a novel computational method that combines heuristic search and clustering ensembles, focusing on two crucial issues. Firstly, it establishes a representative solution by effectively subsetting ensembles. The thesis introduces a Gray code implementation that maximises the spread across subsets while minimising differences between them. Secondly, the exhaustive search for the best solution from the representative pool becomes computationally expensive as the dimension and volume increase. An alternative approach based on heuristic search is suggested. This approach evaluates subsets incrementally, similar to the implementation of Gray code, resulting in significant speed gain. However, random mutation hill climbing (RMHC) in heuristic search suffers from finding a suitable solution without guidance, particularly in larger search spaces. The thesis presents an innovative seeding technique that leverages Fiedler vector decomposition and minimum spanning tree (MST) to address this challenge. This technique significantly improves both the quality of solutions and computational efficiency. The proposed methodology is extensively evaluated using simulated and benchmark clustering datasets, employing theoretical and empirical examples. The results demonstrate the high effectiveness of the proposed approach. The key contributions of this thesis include the introduction of Gray code subsetting of ensembles, the incorporation of heuristic-search-based techniques into clustering ensembles, and the novel improvement in search space convergence through effective seeding.	en_US
dc.publisher	Brunel University London	en_US
dc.relation.uri	http://bura.brunel.ac.uk/handle/2438/29168/1/FulltextThesis.pdf	-
dc.subject	Voting Mechanisms	en_US
dc.subject	Optimal Partitioning	en_US
dc.subject	Aggregation strategies	en_US
dc.subject	Approximate Optimization	en_US
dc.subject	Combined Clustering	en_US
dc.title	Ensemble learning for optimal cluster estimation	en_US
dc.type	Thesis	en_US
Appears in Collections:	Computer Science; Dept of Computer Science Theses
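
Illustrative note on the abstract: the record does not include the thesis's Gray code implementation, so the following is only a minimal sketch of the general idea it mentions, namely that visiting subsets in Gray code order changes one member at a time and therefore allows incremental evaluation. It uses the standard reflected binary Gray code; the function names and the running-sum objective are hypothetical placeholders, not the author's method, and the sketch does not attempt the spread-maximising property described in the abstract.

```python
def gray_code_subsets(n):
    """Yield (mask, flipped_index) for all 2**n subsets of n items.

    Subsets are visited in reflected binary Gray code order, so each
    subset differs from the previous one by exactly one item.
    flipped_index is None for the first (empty) subset.
    """
    prev = 0
    yield prev, None
    for i in range(1, 2 ** n):
        mask = i ^ (i >> 1)                       # standard Gray code of i
        flipped = (mask ^ prev).bit_length() - 1  # the single bit that changed
        yield mask, flipped
        prev = mask


def subset_sums(values):
    """Placeholder objective: the sum of each subset, updated incrementally.

    Each step performs one addition or subtraction instead of
    recomputing the whole subset from scratch.
    """
    total = 0
    results = {}
    for mask, flipped in gray_code_subsets(len(values)):
        if flipped is not None:
            if (mask >> flipped) & 1:
                total += values[flipped]   # item entered the subset
            else:
                total -= values[flipped]   # item left the subset
        results[mask] = total
    return results
```

In an ensemble setting, `values` would be replaced by whatever per-member contribution the chosen quality measure allows to be added or removed in one step; whether a given measure decomposes this way is an assumption that has to be checked for that measure.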

Files in This Item:

File	Description	Size	Format
FulltextThesis.pdf		2.64 MB	Adobe PDF
