Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/32399
Title: A MapReduce based distributed LSI for scalable information retrieval
Authors: Liu, Y; Li, M; Khan, M; Qi, M
Keywords: information retrieval;latent semantic indexing;MapReduce;load balancing;genetic algorithms
Issue Date: 24-Jun-2014
Publisher: Slovak Academy of Sciences
Citation: Liu, Y. et al. (2014) 'A MapReduce based distributed LSI for scalable information retrieval', Computing and Informatics, 33 (2), pp. 259 - 280. Available at: https://www.cai.sk/ojs/index.php/cai/article/view/995
Abstract: Latent Semantic Indexing (LSI) has been widely used in information retrieval due to its efficiency in solving the problems of polysemy and synonymy. However, LSI is notably a computationally intensive process because of the computing complexities of singular value decomposition and filtering operations involved in the process. This paper presents MR-LSI, a MapReduce based distributed LSI algorithm for scalable information retrieval. The performance of MR-LSI is first evaluated in a small scale experimental cluster environment, and subsequently evaluated in large scale simulation environments. By partitioning the dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computing nodes, the overhead of the MR-LSI algorithm is reduced significantly while maintaining a high level of accuracy in retrieving documents of user interest. A genetic algorithm based load balancing scheme is designed to optimize the performance of MR-LSI in heterogeneous computing environments in which the computing nodes have varied resources.
URI: https://bura.brunel.ac.uk/handle/2438/32399
ISSN: 1335-9150
Other Identifiers: ORCiD: Maozhen Li https://orcid.org/0000-0002-0820-5487
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File: FullText.pdf
Description: Copyright © 2014 Slovak Academy of Sciences. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Size: 1.12 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.