Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/32399
Full metadata record
DC Field | Value | Language
dc.contributor.author | Liu, Y | -
dc.contributor.author | Li, M | -
dc.contributor.author | Khan, M | -
dc.contributor.author | Qi, M | -
dc.date.accessioned | 2025-11-24T15:36:00Z | -
dc.date.available | 2025-11-24T15:36:00Z | -
dc.date.issued | 2014-06-24 | -
dc.identifier | ORCiD: Maozhen Li https://orcid.org/0000-0002-0820-5487 | -
dc.identifier.citation | Liu, Y. et al. (2014) 'A MapReduce based distributed LSI for scalable information retrieval', Computing and Informatics, 33 (2), pp. 259 - 280. Available at: https://www.cai.sk/ojs/index.php/cai/article/view/995 | en_US
dc.identifier.issn | 1335-9150 | -
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/32399 | -
dc.description.abstract | Latent Semantic Indexing (LSI) has been widely used in information retrieval due to its efficiency in solving the problems of polysemy and synonymy. However, LSI is notably a computationally intensive process because of the computing complexities of singular value decomposition and filtering operations involved in the process. This paper presents MR-LSI, a MapReduce based distributed LSI algorithm for scalable information retrieval. The performance of MR-LSI is first evaluated in a small scale experimental cluster environment, and subsequently evaluated in large scale simulation environments. By partitioning the dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computing nodes, the overhead of the MR-LSI algorithm is reduced significantly while maintaining a high level of accuracy in retrieving documents of user interest. A genetic algorithm based load balancing scheme is designed to optimize the performance of MR-LSI in heterogeneous computing environments in which the computing nodes have varied resources. | en_US
dc.description.sponsorship | This research is partially supported by the 973 project on Network Big Data Analytics No. 2014CB340404. | en_US
dc.format.extent | 259 - 280 | -
dc.format.medium | Print-Electronic | -
dc.language.iso | en_US | en_US
dc.publisher | Slovak Academy of Sciences | en_US
dc.rights | Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International | -
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | -
dc.source.uri | https://www.cai.sk/ojs/index.php/cai/article/view/995 | -
dc.subject | information retrieval | en_US
dc.subject | latent semantic indexing | en_US
dc.subject | MapReduce | en_US
dc.subject | load balancing | en_US
dc.subject | genetic algorithms | en_US
dc.title | A MapReduce based distributed LSI for scalable information retrieval | en_US
dc.type | Article | en_US
dc.relation.isPartOf | Computing and Informatics | -
pubs.issue | 2 | -
pubs.publication-status | Published | -
pubs.volume | 33 | -
dc.rights.license | https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode.en | -
dc.rights.holder | Slovak Academy of Sciences | -
Appears in Collections: Dept of Electronic and Electrical Engineering Research Papers

Files in This Item:
File | Description | Size | Format
FullText.pdf | Copyright © 2014 Slovak Academy of Sciences. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/). | 1.12 MB | Adobe PDF


This item is licensed under a Creative Commons License.