Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/33125

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | He, Y | - |
| dc.contributor.author | Wu, H | - |
| dc.contributor.author | Liu, W | - |
| dc.contributor.author | Luo, X | - |
| dc.date.accessioned | 2026-04-09T16:50:28Z | - |
| dc.date.available | 2026-04-09T16:50:28Z | - |
| dc.date.issued | 2026-03-25 | - |
| dc.identifier | ORCiD: Yaping He https://orcid.org/0009-0000-4882-1631 | - |
| dc.identifier | ORCiD: Hao Wu https://orcid.org/0000-0002-4138-1239 | - |
| dc.identifier | ORCiD: Weibo Liu https://orcid.org/0000-0002-8169-3261 | - |
| dc.identifier | ORCiD: Xin Luo https://orcid.org/0000-0002-1348-5305 | - |
| dc.identifier.citation | He, Y. et al. (2026) 'A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions', Neurocomputing, 682, 133455, pp. 1–20. doi: 10.1016/j.neucom.2026.133455. | en-US |
| dc.identifier.issn | 0925-2312 | - |
| dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/33125 | - |
| dc.description | Data availability: No data was used for the research described in the article. | en-US |
| dc.description.abstract | Modern neural networks (NNs), while effective at learning representations from given samples and handling downstream pattern recognition tasks, typically contain tens to hundreds of millions of parameters. The growth in NN size motivates ongoing research on effective network compression with the purpose of reducing the computational burden without significantly sacrificing model performance. It is especially critical when deploying NNs on resource-constrained devices where computation and storage efficiency are of high concern. A promising and currently popular solution to model compression is to replace the NN weight matrix with its low-rank tensor approximation, i.e., implementing an efficient latent factorization of tensors (LFT) process on the NN parameters. Based on thorough investigations into the state-of-the-art LFT-based model compression methods, this survey 1) provides a comprehensive review of the latest research progress on LFT-based model compression methods for various NNs (e.g., Convolutional NNs, Recurrent NNs, and Transformers); 2) summarizes a number of widely-used LFT toolboxes; 3) evaluates LFT methods for model compression on a variety of mainstream NN backbones; and 4) discusses the development trends of LFT-based model compression techniques. This survey aims to provide a systematic and comprehensive overview of LFT-based model compression methods to artificial intelligence researchers and engineers, thereby promoting further research development in this crucial field. | en-US |
| dc.description.sponsorship | This work was supported in part by the Science and Technology Innovation Key R&D Program of Chongqing under Grant CSTB2025TIAD-STX0032, the National Key Research and Development Program of China under Grant 2024YFF0908200, the Chongqing Technology Innovation and Application Development Special Key Project under Grant CSTB2024TIAD-KPX0018, the Royal Society of the UK under Grant IES\R3\243021, and the Southwest University Graduate Student Research Innovation Grant SWUB24051. | en-US |
| dc.format.extent | 1–20 | - |
| dc.format.medium | Print-Electronic | - |
| dc.language | en-US | en-US |
| dc.language.iso | en | en-US |
| dc.publisher | Elsevier | en-US |
| dc.rights | Creative Commons Attribution 4.0 International | - |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
| dc.subject | latent factorization of tensor | en-US |
| dc.subject | model compression | en-US |
| dc.subject | resource-constrained devices | en-US |
| dc.subject | convolutional neural network | en-US |
| dc.subject | recurrent neural network | en-US |
| dc.subject | transformer | en-US |
| dc.title | A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions | en-US |
| dc.type | Article | en-US |
| dc.date.dateAccepted | 2026-03-24 | - |
| dc.identifier.doi | https://doi.org/10.1016/j.neucom.2026.133455 | - |
| dc.relation.isPartOf | Neurocomputing | - |
| pubs.publication-status | Published online | - |
| pubs.volume | 682 | - |
| dc.identifier.eissn | 1872-8286 | - |
| dc.rights.license | https://creativecommons.org/licenses/by/4.0/legalcode.en | - |
| dcterms.dateAccepted | 2026-03-24 | - |
| dc.rights.holder | The Authors | - |
| dc.contributor.orcid | He, Yaping [0009-0000-4882-1631] | - |
| dc.contributor.orcid | Wu, Hao [0000-0002-4138-1239] | - |
| dc.contributor.orcid | Liu, Weibo [0000-0002-8169-3261] | - |
| dc.contributor.orcid | Luo, Xin [0000-0002-1348-5305] | - |
| dc.identifier.number | 133455 | - |
Appears in Collections: Department of Computer Science Research Papers
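The abstract above describes replacing an NN weight matrix with a low-rank factorization to cut parameter count. As a minimal illustrative sketch of that idea (not code from the surveyed paper), the following uses a truncated SVD, the simplest matrix-level low-rank approximation; the matrix sizes and rank are hypothetical:

```python
import numpy as np

# Hypothetical dense-layer weight matrix (256 x 128).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))

# Keep only the top-`rank` singular components: W ≈ A @ B.
rank = 16
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]   # 256 x 16 factor
B = Vt[:rank, :]             # 16 x 128 factor

# Parameter counts before vs. after factorization.
params_before = W.size               # 256*128 = 32768
params_after = A.size + B.size       # 256*16 + 16*128 = 6144
print(params_before, params_after)
```

Higher-order LFT methods surveyed in the article (e.g., Tucker or tensor-train decompositions) generalize this two-factor scheme to multi-way tensors, but the compression principle is the same: trade a small approximation error for a large reduction in stored parameters.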
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2026 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). | 4.03 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License