Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/33125

| Title: | A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions |
| Authors: | He, Y; Wu, H; Liu, W; Luo, X |
| Keywords: | latent factorization of tensor; model compression; resource-constrained devices; convolutional neural network; recurrent neural network; transformer |
| Issue Date: | 25-Mar-2026 |
| Publisher: | Elsevier |
| Citation: | He, Y. et al. (2026) 'A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions', Neurocomputing, 682, 133455, pp. 1–20. doi: 10.1016/j.neucom.2026.133455. |
| Abstract: | Modern neural networks (NNs), while effective at learning representations from given samples and handling downstream pattern recognition tasks, typically contain tens to hundreds of millions of parameters. The growth in NN size motivates ongoing research on effective network compression to reduce the computational burden without significantly sacrificing model performance. This is especially critical when deploying NNs on resource-constrained devices, where computation and storage efficiency are of high concern. A promising and currently popular solution to model compression is to replace the NN weight matrix with its low-rank tensor approximation, i.e., implementing an efficient latent factorization of tensors (LFT) process on the NNs' parameters. Based on thorough investigations into the state-of-the-art LFT-based model compression methods, this survey 1) provides a comprehensive review of the latest research progress on LFT-based model compression methods for various NNs (e.g., Convolutional NNs, Recurrent NNs, and Transformers); 2) summarizes a number of widely used LFT toolboxes; 3) evaluates LFT methods for model compression on a variety of mainstream NN backbones; and 4) discusses the development trends of LFT-based model compression techniques. This survey aims to provide a systematic and comprehensive overview of LFT-based model compression methods to artificial intelligence researchers and engineers, thereby promoting further research development in this crucial field. |
| Description: | Data availability: No data was used for the research described in the article. |
| URI: | https://bura.brunel.ac.uk/handle/2438/33125 |
| DOI: | https://doi.org/10.1016/j.neucom.2026.133455 |
| ISSN: | 0925-2312 |
| Other Identifiers: | ORCiD: Yaping He https://orcid.org/0009-0000-4882-1631; ORCiD: Hao Wu https://orcid.org/0000-0002-4138-1239; ORCiD: Weibo Liu https://orcid.org/0000-0002-8169-3261; ORCiD: Xin Luo https://orcid.org/0000-0002-1348-5305 |
| Appears in Collections: | Department of Computer Science Research Papers |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2026 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). | 4.03 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License
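To make the compression idea described in the abstract concrete, here is a minimal sketch (not taken from the survey; the shapes, rank, and NumPy-based approach are illustrative assumptions) of the simplest low-rank factorization underlying LFT-style compression: a dense layer's weight matrix is replaced by two smaller factors obtained from a rank-truncated SVD, so the layer stores far fewer parameters.

```python
import numpy as np

# Illustrative sketch (assumed shapes and rank, not from the survey):
# compress a dense layer's m x n weight matrix W with a rank-r
# truncated SVD, keeping two small factors A (m x r) and B (r x n).
rng = np.random.default_rng(0)
m, n, r = 256, 512, 16
W = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]   # m x r factor (singular values folded in)
B = Vt[:r, :]          # r x n factor
W_approx = A @ B       # low-rank reconstruction of W

# Parameter counts: the layer now stores A and B instead of W.
original_params = m * n                # 131072
compressed_params = m * r + r * n      # 12288
print(f"compression ratio: {original_params / compressed_params:.1f}x")
```

At inference time the layer computes `x @ A @ B` instead of `x @ W`, trading a small approximation error (bounded by the discarded singular values) for roughly a 10x reduction in stored parameters at this rank; the tensor-network formats surveyed in the paper (e.g., Tucker, tensor-train) generalize this matrix factorization to higher-order weight tensors.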