Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/33125

| Title: | A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions |
| Authors: | He, Y; Wu, H; Liu, W; Luo, X |
| Keywords: | latent factorization of tensor; model compression; resource-constrained devices; convolutional neural network; recurrent neural network; transformer |
| Issue Date: | 25-Mar-2026 |
| Publisher: | Elsevier |
| Citation: | He, Y. et al. (2026) 'A survey of latent factorization of tensor-based model compression: Algorithms, toolboxes and future directions', Neurocomputing, 682, 133455, pp. 1–20. doi: 10.1016/j.neucom.2026.133455. |
| Abstract: | Modern neural networks (NNs), while effective at learning representations from given samples and handling downstream pattern recognition tasks, typically contain tens to hundreds of millions of parameters. The growth in NN size motivates ongoing research on effective network compression to reduce the computational burden without significantly sacrificing model performance. This is especially critical when deploying NNs on resource-constrained devices, where computation and storage efficiency are of high concern. A promising and currently popular solution to model compression is to replace the NN weight matrix with its low-rank tensor approximation, i.e., implementing an efficient latent factorization of tensors (LFT) process on the NNs' parameters. Based on thorough investigations into the state-of-the-art LFT-based model compression methods, this survey 1) provides a comprehensive review of the latest research progress on LFT-based model compression methods for various NNs (e.g., Convolutional NNs, Recurrent NNs, and Transformers); 2) summarizes a number of widely used LFT toolboxes; 3) evaluates LFT methods for model compression on a variety of mainstream NN backbones; and 4) discusses the development trends of LFT-based model compression techniques. This survey aims to provide a systematic and comprehensive overview of LFT-based model compression methods to artificial intelligence researchers and engineers, thereby promoting further research development in this crucial field. |
| Description: | Data availability: No data was used for the research described in the article. |
| URI: | https://bura.brunel.ac.uk/handle/2438/33125 |
| DOI: | https://doi.org/10.1016/j.neucom.2026.133455 |
| ISSN: | 0925-2312 |
| Other Identifiers: | ORCiD: Yaping He https://orcid.org/0009-0000-4882-1631; ORCiD: Hao Wu https://orcid.org/0000-0002-4138-1239; ORCiD: Weibo Liu https://orcid.org/0000-0002-8169-3261; ORCiD: Xin Luo https://orcid.org/0000-0002-1348-5305 |
| Appears in Collections: | Department of Computer Science Research Papers |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| FullText.pdf | Copyright © 2026 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). | 4.03 MB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License
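To make the compression idea described in the abstract concrete, here is a minimal sketch (not taken from the survey; the shapes, rank, and NumPy-based approach are illustrative assumptions) of the simplest low-rank factorization underlying LFT-style compression: a dense layer's weight matrix is replaced by two smaller factors obtained from a rank-truncated SVD, so the layer stores far fewer parameters.

```python
import numpy as np

# Illustrative sketch (assumed shapes and rank, not from the survey):
# compress a dense layer's m x n weight matrix W with a rank-r
# truncated SVD, keeping two small factors A (m x r) and B (r x n).
rng = np.random.default_rng(0)
m, n, r = 256, 512, 16
W = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]   # m x r factor (singular values folded in)
B = Vt[:r, :]          # r x n factor
W_approx = A @ B       # low-rank reconstruction of W

# Parameter counts: the layer now stores A and B instead of W.
original_params = m * n                # 131072
compressed_params = m * r + r * n      # 12288
print(f"compression ratio: {original_params / compressed_params:.1f}x")
```

At inference time the layer computes `x @ A @ B` instead of `x @ W`, trading a small approximation error (bounded by the discarded singular values) for roughly a 10x reduction in stored parameters at this rank; the tensor-network formats surveyed in the paper (e.g., Tucker, tensor-train) generalize this matrix factorization to higher-order weight tensors.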