Aussie AI
Low-Rank Matrices
-
Last Updated 17 November, 2025
-
by David Spuler, Ph.D.
Low-rank matrices are matrices that can be represented with smaller dimensions (i.e., fewer effective rows or columns), because their rank is small relative to their size. One form of model compression is to use matrix techniques to replace the large weight matrices with smaller "low-rank" matrices. This makes the model smaller and faster, but sometimes at the cost of some accuracy.
There are various approaches to finding smaller matrices to replace a full-sized matrix. One approach is simply to look for matrices that are similar to the large matrix, but smaller. Another approach is to use "sparsification" to add many zeros to the matrices, so that smaller matrices can more easily replace them. Yet another approach is to use matrix algebra to "factorize" (also called "decompose") the large matrix into two or more smaller matrices (see also AI matrix algebra).
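To make the size and speed trade-off concrete, here is a minimal NumPy sketch (the dimensions and rank are illustrative assumptions, not taken from any real model): a dense d-by-d weight matrix is replaced by two thin factors, shrinking both the parameter count and the per-vector multiply count from d^2 down to 2*d*r.

```python
import numpy as np

# Illustrative sizes only: replace a dense d x d weight matrix W
# with two rank-r factors A and B.
d, r = 4096, 64
W = np.random.randn(d, d).astype(np.float32)

# W (d x d) is approximated by A (d x r) @ B (r x d).
# Parameters drop from d*d ~= 16.8M to 2*d*r ~= 0.52M (about 32x smaller),
# and a matrix-vector product drops from d*d to 2*d*r multiply-adds.
A = np.random.randn(d, r).astype(np.float32)
B = np.random.randn(r, d).astype(np.float32)

x = np.random.randn(d).astype(np.float32)
y_dense   = W @ x          # original: d*d multiply-adds
y_lowrank = A @ (B @ x)    # factored: 2*d*r multiply-adds
```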
One low-rank matrix technique has become especially popular, possibly because it has a friendly name: LoRA, short for "Low-Rank Adaptation" of matrices. If the model has been quantized first, the combination is called QLoRA, for "Quantized LoRA".
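In a minimal sketch (the shapes, the alpha/r scaling, and the zero-initialized B factor follow the original LoRA paper, but this NumPy version is purely illustrative; real implementations wrap framework layers rather than raw arrays): the pretrained weight stays frozen while only the two small adapter factors are trained.

```python
import numpy as np

d, r, alpha = 1024, 8, 16                      # illustrative sizes
W = np.random.randn(d, d).astype(np.float32)   # frozen pretrained weight

# Trainable adapter factors. B starts at zero so the adapter contributes
# nothing at initialization and fine-tuning starts from the base model.
A = (0.01 * np.random.randn(r, d)).astype(np.float32)
B = np.zeros((d, r), dtype=np.float32)

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + (alpha / r) * B @ A, but it is never
    # materialized: the low-rank path costs only 2*d*r multiply-adds.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = np.random.randn(d).astype(np.float32)
y = lora_forward(x)
```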
Singular Value Decomposition (SVD)
SVD is one of the main methods of factorizing a matrix into smaller sub-matrices: any matrix can be decomposed as UΣVᵀ, and truncating to the largest singular values yields the best low-rank approximation (see the sketch after the reference list below). Research on SVD includes:
- Zeyu Zhang, Haiying Shen, 7 Aug 2024, Zero-Delay QKV Compression for Mitigating KV Cache and Network Bottlenecks in LLM Inference, https://arxiv.org/abs/2408.04107
- Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin, Chong-Yan Chen, Yu-Fang Hu, Pei-Shuo Wang, Ning-Chi Huang, Luis Ceze, Kai-Chiang Wu, 30 Jul 2024, Palu: Compressing KV-Cache with Low-Rank Projection, https://arxiv.org/abs/2407.21118 https://github.com/shadowpa0327/Palu
- Hongyaoxing Gu, 27 May 2024, LRAMM -- Low precision approximates GEMM via RSVD, https://arxiv.org/abs/2405.16917
- Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao, 12 Aug 2024 (v3), A Survey on LoRA of Large Language Models, https://arxiv.org/abs/2407.11046 https://github.com/ZJU-LLMs/Awesome-LoRAs.git
- Shi, J., Shi, C. (2025). Improve LLM Inference Performance with Matrix Decomposition Strategies. In: Shi, Z., Witbrock, M., Tian, Q. (eds) Intelligence Science V. ICIS 2024. IFIP Advances in Information and Communication Technology, vol 720. Springer, Cham. https://doi.org/10.1007/978-3-031-71253-1_12 https://link.springer.com/chapter/10.1007/978-3-031-71253-1_12 (Speed up matrix operations with SVD and NMF via adaptive block sizing based on batching.)
- Xinghao Wang, Pengyu Wang, Bo Wang, Dong Zhang, Yunhua Zhou, Xipeng Qiu, 31 Oct 2024, BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments, https://arxiv.org/abs/2410.23918 https://github.com/xinghaow99/BitStack
- Shengwen Ding, Chenhui Hu, 24 Nov 2024, eFedLLM: Efficient LLM Inference Based on Federated Learning, https://arxiv.org/abs/2411.16003
- Haoyang Li, Yiming Li, Anxin Tian, Tianhao Tang, Zhanchao Xu, Xuejia Chen, Nicole Hu, Wei Dong, Qing Li, Lei Chen, 27 Dec 2024, A Survey on Large Language Model Acceleration based on KV Cache Management, https://arxiv.org/abs/2412.19442 (Huge survey of all KV cache optimization methods.)
- Hong Yankun, Li Xing, Zhen Hui-Ling, Yu Xianzhi, Liu Wulong, Yuan Mingxuan, 21 Feb 2025, SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention, https://arxiv.org/abs/2502.15304
- Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang, 16 Mar 2025, SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression, https://arxiv.org/abs/2503.12340 https://github.com/AIoT-MLSys-Lab/SVD-LLM
- Jiujun He, Huazhen Lin, 10 Jun 2025, Olica: Efficient Structured Pruning of Large Language Models without Retraining, https://arxiv.org/abs/2506.08436
- Tavor Z. Baharav, Phillip B. Nicol, Rafael A. Irizarry, Rong Ma, 29 Jul 2025, Stacked SVD or SVD stacked? A Random Matrix Theory perspective on data integration, https://arxiv.org/abs/2507.22170
- Jiayu Fang, Zhiqi Shao, S T Boris Choy, Junbin Gao, 19 Aug 2025, SVDformer: Direction-Aware Spectral Graph Embedding Learning via SVD and Transformer, https://arxiv.org/abs/2508.13435
- Mete Erdogan, Sebnem Demirtas, 25 Aug 2025, SVD Based Least Squares for X-Ray Pneumonia Classification Using Deep Features, https://arxiv.org/abs/2504.20970
- Johannes J. Brust and Michael A. Saunders, 2 Sep 2025, Fast and Accurate SVD-Type Updating in Streaming Data, https://arxiv.org/abs/2509.02840
- Daniel D. Li, May 2025, Efficient ML Inference via Matrix-Vector Approximations, Master's Thesis, Department of Electrical Engineering and Computer Science, MIT, https://dspace.mit.edu/bitstream/handle/1721.1/162737/li-ddl-meng-eecs-2025-thesis.pdf?sequence=1&isAllowed=y
- Abdulla Jasem Almansoori, Maria Ivanova, Andrey Veprikov, Aleksandr Beznosikov, Samuel Horváth, Martin Takáč, 24 Sep 2025, Faster Than SVD, Smarter Than SGD: The OPLoRA Alternating Update, https://arxiv.org/abs/2509.19977
- Shen Yuan, Yin Zheng, Taifeng Wang, Binbin Liu, and Hongteng Xu, 23 Oct 2025, MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation, https://arxiv.org/abs/2506.14436
- Boya Xiong, Shuo Wang, Weifeng Ge, Guanhua Chen, Yun Chen, 27 Sep 2025, Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization, https://arxiv.org/abs/2506.11087
- Minchan Jeong, J. Jon Ryu, Se-Young Yun, Gregory W. Wornell, 24 Oct 2025, Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems, https://arxiv.org/abs/2507.07222
- Yassir Jedra, Devavrat Shah, 11 Oct 2025, $k$-SVD with Gradient Descent, https://arxiv.org/abs/2502.00320
- Lin Xv, Jingsheng Gao, Xian Gao, Ting Liu, Yuzhuo Fu, 22 Oct 2025, ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression, https://arxiv.org/abs/2510.19389
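The basic recipe, as a minimal NumPy sketch (the papers above refine it with quantization, adaptive rank allocation, and activation-aware weighting): truncating the SVD to the k largest singular values gives the best rank-k approximation in the Frobenius norm, by the Eckart-Young theorem.

```python
import numpy as np

def truncated_svd(W: np.ndarray, k: int):
    """Split W into two thin factors via its best rank-k approximation."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * S[:k]    # (m, k): singular values folded into left factor
    B = Vt[:k, :]           # (k, n)
    return A, B

# Synthetic weight matrix that is approximately rank 16 plus a little noise,
# so a rank-32 truncation recovers it almost exactly.
m = n = 512
W = (np.random.randn(m, 16) @ np.random.randn(16, n)
     + 0.01 * np.random.randn(m, n))

A, B = truncated_svd(W, k=32)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative Frobenius error at rank 32: {rel_err:.4f}")
```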
Research on Low-Rank Matrices
- Li, Y.; Yu, Y.; Zhang, Q.; Liang, C.; He, P.; Chen, W.; and Zhao, T. 2023. LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation. In Krause, A.; Brunskill, E.; Cho, K.; Engelhardt, B.; Sabato, S.; and Scarlett, J., eds., Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, 20336–20350. PMLR. https://arxiv.org/abs/2306.11222
- Ma, X.; Fang, G.; and Wang, X. 2023. LLM-Pruner: On the Structural Pruning of Large Language Models. arXiv:2305.11627. https://arxiv.org/abs/2305.11627 Code: https://github.com/horseee/LLM-Pruner (Pruning during training and LoRA.)
- M. Jaderberg, A. Vedaldi, and A. Zisserman. Speeding up convolutional neural networks with low rank expansions. BMVC, 2014, https://arxiv.org/abs/1405.3866, PDF: https://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14b/jaderberg14b.pdf
- Y.-D. Kim, E. Park, S. Yoo, T. Choi, L. Yang and D. Shin, "Compression of deep convolutional neural networks for fast and low power mobile applications", arXiv:1511.06530, 2015. https://arxiv.org/abs/1511.06530 (Low-rank via Bayesian matrix factorization and Tucker decomposition.)
- V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets and V. Lempitsky, "Speeding-up convolutional neural networks using fine-tuned CP-decomposition", arXiv:1412.6553, 2014. https://arxiv.org/abs/1412.6553
- Ali Edalati, Marzieh Tahaei, Ivan Kobyzev, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh, Dec 2022, KronA: Parameter Efficient Tuning with Kronecker Adapter, arXiv preprint arXiv:2212.10650, https://arxiv.org/abs/2212.10650 (Kronecker product for matrix decomposition.)
- Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, and Ali Ghodsi. 2022. DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation, arXiv preprint arXiv:2210.07558. https://arxiv.org/abs/2210.07558
- Rabeeh Karimi Mahabadi, James Henderson, and Sebastian Ruder. 2021. Compacter: Efficient low-rank hypercomplex adapter layers. Advances in Neural Information Processing Systems, 34:1022–1035. https://arxiv.org/abs/2106.04647
- Yukang Chen, Shengju Qian, Haotian Tang, Xin Lai, Zhijian Liu, Song Han, Jiaya Jia, Sep 2023, LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models, https://arxiv.org/abs/2309.12307 (Low-rank matrix attention allows up to 100k context windows.)
- R Saha, V Srivastava, M Pilanci, 2023, Matrix Compression via Randomized Low Rank and Low Precision Factorization, 37th Conference on Neural Information Processing Systems (NeurIPS 2023), https://web.stanford.edu/~pilanci/papers/lplr.pdf
- F Babiloni, T Tanay, J Deng, M Maggioni, S Zafeiriou, 2023, Factorized Dynamic Fully-Connected Layers for Neural Networks, ICCV workshop, https://openaccess.thecvf.com/content/ICCV2023W/RCV/papers/Babiloni_Factorized_Dynamic_Fully-Connected_Layers_for_Neural_Networks_ICCVW_2023_paper.pdf (Tensor decomposition into low-rank factors.)
- Samuel Carreira, Tomás Marques, José Ribeiro, Carlos Grilo, Sep 2023, Revolutionizing Mobile Interaction: Enabling a 3 Billion Parameter GPT LLM on Mobile, arXiv preprint arXiv:2310.01434, https://browse.arxiv.org/abs/2310.01434 (LoRA on a mobile platform.)
- Tamara G Kolda and Brett W Bader, 2009, Tensor Decompositions and Applications, SIAM Rev. 51, 3 (2009), 455–500, https://epubs.siam.org/doi/abs/10.1137/07070111X (Analysis of various algorithms for tensor decomposition.)
- Stephan Rabanser, Oleksandr Shchur, Stephan Günnemann, Nov 2017, Introduction to Tensor Decompositions and their Applications in Machine Learning, https://browse.arxiv.org/pdf/1711.10781.pdf
- Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao, Oct 2023, LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models, https://arxiv.org/abs/2310.08659 (QLoRA for LLMs.)
- Chakshu Moar, Michael Pellauer, Hyoukjun Kwon, 10 May 2024, Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models, https://arxiv.org/abs/2405.06626
- You Zhou, Xiujing Lin, Xiang Zhang, Maolin Wang, Gangwei Jiang, Huakang Lu, Yupeng Wu, Kai Zhang, Zhe Yang, Kehang Wang, Yongduo Sui, Fengwei Jia, Zuoli Tang, Yao Zhao, Hongxuan Zhang, Tiannuo Yang, Weibo Chen, Yunong Mao, Yi Li, De Bao, Yu Li, Hongrui Liao, Ting Liu, Jingwen Liu, Jinchi Guo, Xiangyu Zhao, Ying Wei, Hong Qian, Qi Liu, Xiang Wang, Wai Kin (Victor) Chan, Chenliang Li, Yusen Li, Shiyu Yang, Jining Yan, Chao Mou, Shuai Han, Wuxia Jin, Guannan Zhang, Xiaodong Zeng, Nov 2023, On the Opportunities of Green Computing: A Survey, https://arxiv.org/abs/2311.00447 (Extensive survey of environmental and green AI issues, along with a survey of various optimization methods to reduce AI resource requirements in training and inference.)
- Davis, Andrew and Arel, Itamar. 2013. Low-rank approximations for conditional feedforward computation in deep neural networks. arXiv preprint arXiv:1312.4461, https://arxiv.org/abs/1312.4461
- Y Hu, J Zhang, C Zhao, C Li, H Chen, 2023, Transformer Compression via Subspace Projection, arXiv preprint arXiv:2308.16475, https://arxiv.org/abs/2308.16475
- Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. A survey of model compression and acceleration for deep neural networks. CoRR, abs/1710.09282, 2017. https://arxiv.org/abs/1710.09282
- Shikai Qiu, Andres Potapczynski, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson, 10 Jun 2024, Compute Better Spent: Replacing Dense Layers with Structured Matrices, https://arxiv.org/abs/2406.06248
- Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer, 15 Mar 2024 (v5), LLM Inference Unveiled: Survey and Roofline Model Insights, https://arxiv.org/abs/2402.16363 Code: https://github.com/hahnyuan/LLM-Viewer (A large survey of a variety of LLM optimizations.)
- Arnav Chavan, Nahush Lele, Deepak Gupta, Dec 2023, Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models https://arxiv.org/abs/2312.07046 Code: https://github.com/transmuteAI/trailmet/tree/main/trailmet/algorithms/llm-rom
- S. Wang, B. Z. Li, M. Khabsa, H. Fang, and H. Ma, “Linformer: Self-attention with linear complexity,” CoRR, vol. abs/2006.04768, 2020. https://arxiv.org/abs/2006.04768 (Low-rank approximation of attention.)
- Idelbayev, Y. and Carreira-Perpinan, M. A. (2020). Low-rank compression of neural nets: Learning the rank of each layer. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8046–8056. URL: https://openaccess.thecvf.com/content_CVPR_2020/html/Idelbayev_Low_Rank_Compression_of_Neural_Nets_Learning_the_Rank_of_Each_CVPR_2020_paper.html
- Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs/1704.04861. URL: http://arxiv.org/abs/1704.04861.
- Zhang, J., Lei, Q., and Dhillon, I. (2018). Stabilizing gradients for deep neural networks via efficient SVD parameterization. In Dy, J. and Krause, A., editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 5806–5814. PMLR. URL: http://proceedings.mlr.press/v80/zhang18g.html
- K Nan, S Liu, J Du, H Liu, 2019, Deep model compression for mobile platforms: A survey, Tsinghua Science and Technology (Volume 24, Issue 6, December 2019), https://ieeexplore.ieee.org/abstract/document/8727762 PDF: https://ieeexplore.ieee.org/iel7/5971803/8727756/08727762.pdf
- Zheng Qu, Liu Liu, Fengbin Tu, Zhaodong Chen, Yufei Ding, Yuan Xie, 2022, DOTA: Detect and Omit Weak Attentions for Scalable Transformer Acceleration, ASPLOS '22, February 28 – March 4, 2022, Lausanne, Switzerland, PDF: https://dl.acm.org/doi/pdf/10.1145/3503222.3507738
- Ivan Markovsky, 3 Aug 2018, Low-Rank Approximation: Algorithms, Implementation, Applications (Communications and Control Engineering), https://www.amazon.com/Low-Rank-Approximation-Implementation-Applications-Communications/dp/3319896199/
- Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman, 9 Feb 2024 (v2), SliceGPT: Compress Large Language Models by Deleting Rows and Columns, Microsoft Research, https://arxiv.org/abs/2401.15024 Code: https://github.com/microsoft/TransformerCompression (Pruning of matrices effectively prunes along the width dimension and the "fourth" internal dimension of embeddings using techniques such as low-rank matrix factorization.)
- Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He, 15 Feb 2024, Model Compression and Efficient Inference for Large Language Models: A Survey, https://arxiv.org/abs/2402.09748
- Yunsheng Li, Yinpeng Chen, Xiyang Dai, Dongdong Chen, Mengchen Liu, Lu Yuan, Zicheng Liu, Lei Zhang, Nuno Vasconcelos, MicroNet: Improving Image Recognition with Extremely Low FLOPs, 2021, https://ieeexplore.ieee.org/abstract/document/9857393 PDF: https://openaccess.thecvf.com/content/ICCV2021/papers/Li_MicroNet_Improving_Image_Recognition_With_Extremely_Low_FLOPs_ICCV_2021_paper.pdf
- Yubin Qin, Yang Wang, Zhiren Zhao, Xiaolong Yang, Yang Zhou, Shaojun Wei, Yang Hu, Shouyi Yin, 2024, MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition, 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), Year: 2024, Pages: 1032-1047, DOI Bookmark: 10.1109/ISCA59077.2024.00079, https://www.computer.org/csdl/proceedings-article/isca/2024/265800b032/1Z3pCEBnapO
- Jiuxiang Gu, Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Junze Yin, 8 May 2024, Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers, https://arxiv.org/abs/2405.05219 (Attention optimization using multiple low-rank matrices.)
- Canwen Xu, Julian McAuley, Nov 2022, A Survey on Model Compression and Acceleration for Pretrained Language Models, https://arxiv.org/abs/2202.07105
- Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre, 24 Jun 2024, Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers, https://arxiv.org/abs/2406.16450 Code: https://github.com/CLAIRE-Labo/StructuredFFN/tree/main
- Utkarsh Saxena, Gobinda Saha, Sakshi Choudhary, Kaushik Roy, 10 Aug 2024, Eigen Attention: Attention in Low-Rank Space for KV Cache Compression, https://arxiv.org/abs/2408.05646
- Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, Yufa Zhou, 23 Aug 2024, Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time, https://arxiv.org/abs/2408.13233 (Training using low-rank matrices to approximate attention.)
- Josh Alman, Zhao Song, 9 May 2023 (v2), Fast Attention Requires Bounded Entries, https://arxiv.org/abs/2302.13214 (Low-rank matrices in attention for fast inference.)
- Josh Alman, Zhao Song, 6 Oct 2023, How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation, https://arxiv.org/abs/2310.04064
- Sneha Mehta, Huzefa Rangwala, Naren Ramakrishnan, 10 Aug 2020 (v2), Low Rank Factorization for Compact Multi-Head Self-Attention, https://arxiv.org/abs/1912.00835
- Ignacio Hounie, Charilaos Kanatsoulis, Arnuv Tandon, Alejandro Ribeiro, 5 Oct 2024, LoRTA: Low Rank Tensor Adaptation of Large Language Models, https://arxiv.org/abs/2410.04060
- Yue Zheng, Yuhao Chen, Bin Qian, Xiufang Shi, Yuanchao Shu, Jiming Chen, 29 Sep 2024, A Review on Edge Large Language Models: Design, Execution, and Applications, https://arxiv.org/abs/2410.11845
- Zebin Yang, Renze Chen, Taiqiang Wu, Ngai Wong, Yun Liang, Runsheng Wang, Ru Huang, Meng Li, 23 Oct 2024, MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers https://arxiv.org/abs/2410.17957
- Elias Jääsaari, Ville Hyvönen, Teemu Roos, 24 Oct 2024, LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search, https://arxiv.org/abs/2410.18926
- Liang Mi, Weijun Wang, Wenming Tu, Qingfeng He, Rui Kong, Xinyu Fang, Yazhu Dong, Yikang Zhang, Yunchun Li, Meng Li, Haipeng Dai, Guihai Chen, Yunxin Liu, 1 Nov 2024, V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM, https://arxiv.org/abs/2411.00915
- Fali Wang, Zhiwei Zhang, Xianren Zhang, Zongyu Wu, Tzuhao Mo, Qiuhao Lu, Wanjing Wang, Rui Li, Junjie Xu, Xianfeng Tang, Qi He, Yao Ma, Ming Huang, Suhang Wang, 4 Nov 2024, A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness, https://arxiv.org/abs/2411.03350
- M Xu, D Cai, W Yin, S Wang, X Jin, X Liu, 2024, Resource-efficient Algorithms and Systems of Foundation Models: A Survey, ACM Computing Surveys, https://dl.acm.org/doi/pdf/10.1145/3706418
- Meyer Scetbon, James Hensman, 10 Dec 2024, Low-Rank Correction for Quantized LLMs, https://arxiv.org/abs/2412.07902
- Kwangryeol Park, Seulki Lee, 12 Dec 2024, SMMF: Square-Matricized Momentum Factorization for Memory-Efficient Optimization, https://arxiv.org/abs/2412.08894 (Gradient optimizer Adam optimized using low-rank matrix factorization.)
- Jingcheng Hu, Houyi Li, Yinmin Zhang, Zili Wang, Shuigeng Zhou, Xiangyu Zhang, Heung-Yeung Shum, 26 Dec 2024, Multi-matrix Factorization Attention, https://arxiv.org/abs/2412.19255
- Menglin Yang, Jialin Chen, Yifei Zhang, Jiahong Liu, Jiasheng Zhang, Qiyao Ma, Harshit Verma, Qianru Zhang, Min Zhou, Irwin King, Rex Ying, 31 Dec 2024, Low-Rank Adaptation for Foundation Models: A Comprehensive Review, https://arxiv.org/abs/2501.00365 (Extensive survey of LoRA.)
- Q Wang, S Shen, Jan 2025, Activation-Guided Low-Rank Parameter Adaptation for Efficient Model Fine-Tuning, IEEE Access, https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10852296 (Modified LoRA algorithm using activations for weighting.)
- Yiping Ji, Hemanth Saratchandran, Cameron Gordon, Zeyu Zhang, Simon Lucey, 17 Mar 2025 (v5), Efficient Learning With Sine-Activated Low-rank Matrices, ICLR 2025, AIML, https://arxiv.org/abs/2403.19243
- Ray Zirui Zhang, Christopher E. Miles, Xiaohui Xie, John S. Lowengrub, 22 Jul 2025, BiLO: Bilevel Local Operator Learning for PDE Inverse Problems. Part II: Efficient Uncertainty Quantification with Low-Rank Adaptation, https://arxiv.org/abs/2507.17019
- Yao Wang, Jiannan Li, Yue Kang, Shanxing Gao, Zhenxin Xiao, 23 Jul 2025, Generalized Low-Rank Matrix Contextual Bandits with Graph Information, https://arxiv.org/abs/2507.17528
- Gabriel J. Perin, Runjin Chen, Xuxi Chen, Nina S. T. Hirata, Zhangyang Wang, Junyuan Hong, 23 Jul 2025, LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning, https://arxiv.org/abs/2506.15606
- Etienne Zeudong, Elsa Cardoso-Bihlo and Alex Bihlo, 24 Jul 2025, Low-rank adaptive physics-informed HyperDeepONets for solving differential equations, https://arxiv.org/abs/2507.18346
- Le-Trung Nguyen, Ael Quelennec, Van-Tam Nguyen, Enzo Tartaglione, 24 Jul 2025, Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning, https://arxiv.org/abs/2505.05086
- Constantin Philippenko, Kevin Scaman, Laurent Massoulié, 21 Jul 2025, In-depth Analysis of Low-rank Matrix Factorisation in a Federated Setting, https://arxiv.org/abs/2409.08771
- Jinyuan Feng and Zhiqiang Pu and Tianyi Hu and Dongmin Li and Xiaolin Ai and Huimu Wang, 21 Jul 2025, OMoE: Diversifying Mixture of Low-Rank Adaptation by Orthogonal Finetuning, https://arxiv.org/abs/2501.10062
- Chuyan Chen, Yutong He, Pengrui Li, Weichen Jia, Kun Yuan, 20 Jul 2025, Greedy Low-Rank Gradient Compression for Distributed Learning with Convergence Guarantees, https://arxiv.org/abs/2507.08784
- Sachin Garg, Michał Dereziński, 19 Jul 2025, Faster Low-Rank Approximation and Kernel Ridge Regression via the Block-Nyström Method, https://arxiv.org/abs/2506.17556
- Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Ziqiang Cui, Dugang Liu, Yuhua Li, Xiuqiang He, Ruixuan Li, 9 Aug 2025, BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity, https://arxiv.org/abs/2508.06953
- Nairouz Mrabah, Nicolas Richet, Ismail Ben Ayed, Éric Granger, 11 Aug 2025, Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation, https://arxiv.org/abs/2504.12436
- Shishir Muralidhara, Didier Stricker, René Schuster, 26 Jul 2025, CLoRA: Parameter-Efficient Continual Learning with Low-Rank Adaptation, https://arxiv.org/abs/2507.19887
- Yue Zhu, Haiwen Diao, Shang Gao, Jiazuo Yu, Jiawen Zhu, Yunzhi Zhuge, Shuai Hao, Xu Jia, Lu Zhang, Ying Zhang, Huchuan Lu, 28 Jul 2025, Regularizing Subspace Redundancy of Low-Rank Adaptation, https://arxiv.org/abs/2507.20745
- Zhan Zhuang, Xiequn Wang, Wei Li, Yulong Zhang, Qiushi Huang, Shuhao Chen, Xuehao Wang, Yanbin Wei, Yuhe Nie, Kede Ma, Yu Zhang, Ying Wei, 27 Jul 2025, Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation, https://arxiv.org/abs/2506.05713
- Ginés Carreto Picón, Illia Oleksiienko, Lukas Hedegaard, Arian Bakhtiarnia, Alexandros Iosifidis, 28 Jul 2025, Continual Low-Rank Scaled Dot-product Attention, https://arxiv.org/abs/2412.03214
- Zerui Tao, Yuhta Takida, Naoki Murata, Qibin Zhao, Yuki Mitsufuji, 31 Jul 2025, Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models, https://arxiv.org/abs/2501.08727
- Zishan Shao, Yixiao Wang, Qinsi Wang, Ting Jiang, Zhixu Du, Hancheng Ye, Danyang Zhuo, Yiran Chen, and Hai Li, 2 Aug 2025, FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models, https://arxiv.org/abs/2508.01506
- Jiaxi Li, Lu Yin, Li Shen, Jinjin Xu, Liwu Xu, Tianjin Huang, Wenwu Wang, Shiwei Liu, Xilu Wang, 4 Aug 2025, LOST: Low-rank and Sparse Pre-training for Large Language Models, https://arxiv.org/abs/2508.02668
- Peijia Qin, Ruiyi Zhang, Pengtao Xie, 3 Aug 2025, BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation, https://arxiv.org/abs/2410.09758
- Ayan Sengupta, Vaibhav Seth, Arinjay Pathak, Aastha Verma, Natraj Raman, Sriram Gopalakrishnan, Niladri Chatterjee, Tanmoy Chakraborty, 3 Aug 2025, Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation, https://arxiv.org/abs/2411.04358
- Juzheng Zhang, Jiacheng You, Ashwinee Panda, Tom Goldstein, 2 Aug 2025, LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation, https://arxiv.org/abs/2504.07448
- Wenwu Gong and Lili Yang, 4 Aug 2025, LRTuckerRep: Low-rank Tucker Representation Model for Multi-dimensional Data Completion, https://arxiv.org/abs/2508.03755
- Igor Sokolov, Abdurakhmon Sadiev, Yury Demidovich, Fawaz S Al-Qahtani, Peter Richtárik, 5 Aug 2025, Bernoulli-LoRA: A Theoretical Framework for Randomized Low-Rank Adaptation, https://arxiv.org/abs/2508.03820
- Yang Li, Daniel Agyei Asante, Changsheng Zhao, Ernie Chang, Yangyang Shi, Vikas Chandra, 6 Aug 2025, Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications, https://arxiv.org/abs/2405.15877
- Sajjad Ghiasvand and Haniyeh Ehsani Oskouie and Mahnoosh Alizadeh and Ramtin Pedarsani, 12 Aug 2025, Few-Shot Adversarial Low-Rank Fine-Tuning of Vision-Language Models, https://arxiv.org/abs/2505.15130
- Jialin Zhao, Yingtao Zhang, Carlo Vittorio Cannistraci, 13 Aug 2025, Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models, https://arxiv.org/abs/2501.19090
- Mohammad Mozaffari, Amir Yazdanbakhsh, Maryam Mehri Dehnavi, 14 Aug 2025, SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression, https://arxiv.org/abs/2410.09615
- Vedant Puri, Aditya Joglekar, Kevin Ferguson, Yu-hsuan Chen, Yongjie Jessica Zhang, Levent Burak Kara, 18 Aug 2025, FLARE: Fast Low-rank Attention Routing Engine, https://arxiv.org/abs/2508.12594
- Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li, 17 Aug 2025, The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning, https://arxiv.org/abs/2505.23176
- Liyi Zhang, Jake Snell, Thomas L. Griffiths, 19 Aug 2025, Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models, https://arxiv.org/abs/2508.14285
- Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor, 19 Aug 2025, LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters, https://arxiv.org/abs/2405.17604
- Ilja Kuzborskij, Yasin Abbasi Yadkori, 20 Aug 2025, Low-rank bias, weight decay, and model merging in neural networks, https://arxiv.org/abs/2502.17340
- Yajie Zhou and Xiaoyi Pang and Zhibo Wang, 20 Aug 2025, AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption, https://arxiv.org/abs/2505.24773
- Jacob Aguirre, Diego Cifuentes, Vincent Guigues, Renato D.C. Monteiro, Victor Hugo Nascimento, Arnesh Sujanani, 21 Aug 2025, A User Manual for cuHALLaR: A GPU Accelerated Low-Rank Semidefinite Programming Solver, https://arxiv.org/abs/2508.15951
- Sajjad Ghiasvand, Mahnoosh Alizadeh, Ramtin Pedarsani, 21 Aug 2025, Decentralized Low-Rank Fine-Tuning of Large Language Models, https://arxiv.org/abs/2501.15361
- Muchammad Daniyal Kautsar, Afra Majida Hariono, Widyawan, Syukron Abu Ishaq Alfarozi and Kuntpong Wararatpanya, 21 Aug 2025, CALR: Corrective Adaptive Low-Rank Decomposition for Efficient Large Language Model Layer Compression, https://arxiv.org/abs/2508.16680
- Haojie Zhang, 24 Aug 2025, DropLoRA: Sparse Low-Rank Adaptation for Parameter-Efficient Fine-Tuning, https://arxiv.org/abs/2508.17337
- Emanuele Zangrando, Piero Deidda, Simone Brugiapaglia, Nicola Guglielmi, Francesco Tudisco, 23 Aug 2025, Provable Emergence of Deep Neural Collapse and Low-Rank Bias in $L^2$-Regularized Nonlinear Networks, https://arxiv.org/abs/2402.03991
- Keisuke Kamahori, Jungo Kasai, Noriyuki Kojima, Baris Kasikci, 23 Aug 2025, LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation, https://arxiv.org/abs/2502.20583
- Bastien Dubail, Stefan Stojanovic, Alexandre Proutière, 5 Sep 2025, Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning, https://arxiv.org/abs/2509.05193
- Ricardo Borsoi, Konstantin Usevich, Marianne Clausel, 25 Aug 2025, Low-Rank Tensor Decompositions for the Theory of Neural Networks, https://arxiv.org/abs/2508.18408
- Tatyana Matveeva, Aleksandr Katrutsa, Evgeny Frolov, 28 Aug 2025, Dynamic Low-rank Approximation of Full-Matrix Preconditioner for Training Generalized Linear Models, https://arxiv.org/abs/2508.21106
- Jessica Liang, Anirudh Bharadwaj, 29 Aug 2025, QR-LoRA: QR-Based Low-Rank Adaptation for Efficient Fine-Tuning of Large Language Models, https://arxiv.org/abs/2508.21810
- Liangjing Shao, Benshuang Chen, Chenkang Du, Xueli Liu, Xinrong Chen, 1 Sep 2025, Generalizable Self-supervised Monocular Depth Estimation with Mixture of Low-Rank Experts for Diverse Endoscopic Scenes, https://arxiv.org/abs/2509.01206
- Tarhib Al Azad and Shahana Ibrahim, 8 Sep 2025, Tackling the Noisy Elephant in the Room: Label Noise-robust Out-of-Distribution Detection via Loss Correction and Low-rank Decomposition, https://arxiv.org/abs/2509.06918
- Himanshu Thakur, Eshani Agrawal, Smruthi Mukund, 18 Aug 2025, Personas within Parameters: Fine-Tuning Small Language Models with Low-Rank Adapters to Mimic User Behaviors, https://arxiv.org/abs/2509.09689
- Chen Li, Elena Ferro, Corey Lammie, Manuel Le Gallo, Irem Boybat, Bipin Rajendran, 11 Sep 2025, Efficient transformer adaptation for analog in-memory computing via low-rank adapters, https://arxiv.org/abs/2411.17367
- Bhoomit Vasani, Jack FitzGerald, Anjie Fang, Sushmit Vaish, 13 Sep 2025, PHLoRA: data-free Post-hoc Low-Rank Adapter extraction from full-rank checkpoint, https://arxiv.org/abs/2509.10971
- Chuan He, Zhanwang Deng, Zhaosong Lu, 15 Sep 2025, Low-rank Orthogonalization for Large-scale Matrix Optimization with Applications to Foundation Model Training, https://arxiv.org/abs/2509.11983
- Shengping Xie, Chuyan Chen, Kun Yuan, 14 Sep 2025, From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees, https://arxiv.org/abs/2509.11254
- Cooper Doyle, 15 Sep 2025, Low-rank variational dropout: Uncertainty and rank selection in adapters, https://arxiv.org/abs/2506.22809
- Yang Xu, Junpeng Li, Changchun Hua, and Yana Yang, 18 Sep 2025, Structure-Preserving Margin Distribution Learning for High-Order Tensor Data with Low-Rank Decomposition, https://arxiv.org/abs/2509.14577
- Zhiyuan Xue, Ben Yang, Xuetao Zhang, Fei Wang and Zhiping Lin, 18 Sep 2025, One-step Multi-view Clustering With Adaptive Low-rank Anchor-graph Learning, https://arxiv.org/abs/2509.14724
- Andrei Chertkov, Artem Basharin, Mikhail Saygin, Evgeny Frolov, Stanislav Straupe, Ivan Oseledets, 18 Sep 2025, Low-rank surrogate modeling and stochastic zero-order optimization for training of neural networks with black-box layers, https://arxiv.org/abs/2509.15113
- Xin Liao, Bing Yang, Cai Yu, 10 Sep 2025, A Nonlinear Low-rank Representation Model with Convolutional Neural Network for Imputing Water Quality Data, https://arxiv.org/abs/2506.23629
- Janne Laakkonen, Ivan Kukanov, Ville Hautamäki, 17 Sep 2025, Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection, https://arxiv.org/abs/2509.13878
- Zhizhong Li, Sina Sajadmanesh, Jingtao Li, Lingjuan Lyu, 2 Oct 2025, StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold, https://arxiv.org/abs/2510.01938
- Le-Tuan Nguyen, Minh-Duong Nguyen, Seon-Geun Jeong, Dung D. Le, Quoc-Viet Pham, 2 Oct 2025, Communication-Efficient and Accurate Approach for Aggregation in Federated Low-Rank Adaptation, https://arxiv.org/abs/2509.26399
- Ziyue Liu, Ruijie Zhang, Zhengyang Wang, Mingsong Yan, Zi Yang, Paul Hovland, Bogdan Nicolae, Franck Cappello, Sui Tang, Zheng Zhang, 1 Oct 2025, CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation, https://arxiv.org/abs/2502.10940
- Andres Fernandez, Felix Dangel, Philipp Hennig, Frank Schneider, 2 Oct 2025, Sketching Low-Rank Plus Diagonal Matrices, https://arxiv.org/abs/2509.23587
- Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horvath, Praneeth Vepakomma, 2 Oct 2025, Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning, https://arxiv.org/abs/2411.19557
- Chenliang Li, Junyu Leng, Jiaxiang Li, Youbang Sun, Shixiang Chen, Shahin Shahrampour, Alfredo Garcia, 13 Oct 2025, ADARL: Adaptive Low-Rank Structures for Robust Policy Learning under Uncertainty, https://arxiv.org/abs/2510.11899
- Ziqi Zhao and Vivek Sarin, 14 Oct 2025, nuGPR: GPU-Accelerated Gaussian Process Regression with Iterative Algorithms and Low-Rank Approximations, https://arxiv.org/abs/2510.12128
- Tao Yin, Xiaohong Zhang, Jiacheng Zhang, Li Huang, Zhibin Zhang, Yuansong Zeng, Jin Xie and Meng Yan, 14 Oct 2025, MoRA: On-the-fly Molecule-aware Low-Rank Adaptation Framework for LLM-based Multi-Modal Molecular Assistant, https://arxiv.org/abs/2510.12245
- Xin Yu, Cong Xie, Ziyu Zhao, Tiantian Fan, Lingzhou Xue, Zhi Zhang, 30 Sep 2025, PrunedLoRA: Robust Gradient-Based structured pruning for Low-rank Adaptation in Fine-tuning, https://arxiv.org/abs/2510.00192
- Federico Cinus, Yuko Kuroki, Atsushi Miyauchi, Francesco Bonchi, 1 Oct 2025, Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits, https://arxiv.org/abs/2510.00803
- Vladimir Bogachev, Vladimir Aletov, Alexander Molozhavenko, Denis Bobkov, Vera Soboleva, Aibek Alanov, Maxim Rakhuba, 1 Oct 2025, LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters, https://arxiv.org/abs/2507.12142
- Axel Marmoret, Reda Bensaid, Jonathan Lys, Vincent Gripon, François Leduc-Primeau, 22 Sep 2025, TensLoRA: Tensor Alternatives for Low-Rank Adaptation, https://arxiv.org/abs/2509.19391
- Prabhat Karmakar, Sayan Gupta, Ilaksh Adlakha, 24 Sep 2025, Extended Low-Rank Approximation Accelerates Learning of Elastic Response in Heterogeneous Materials, https://arxiv.org/abs/2509.20276
- Babak Barazandeh, Subhabrata Majumdar, Om Rajyaguru, George Michailidis, 23 Sep 2025, Localized LoRA: A Structured Low-Rank Approximation for Efficient Fine-Tuning, https://arxiv.org/abs/2506.00236
- Huancheng Chen and Jingtao Li and Weiming Zhuang and Chen Chen and Lingjuan Lyu, 24 Sep 2025, Replay-Free Continual Low-Rank Adaptation with Dynamic Memory, https://arxiv.org/abs/2411.00623
- Yilang Zhang, Xiaodong Yang, Yiwei Cai, Georgios B. Giannakis, 27 Oct 2025, ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning, https://arxiv.org/abs/2510.23818
- Qingyue Zhang, Chang Chu, Tianren Peng, Qi Li, Xiangyang Luo, Zhihao Jiang, Shao-Lun Huang, 28 Oct 2025, LoRA-DA: Data-Aware Initialization for Low-Rank Adaptation via Asymptotic Analysis, https://arxiv.org/abs/2510.24561
- Lukas Schynol, Marius Pesavento, 28 Oct 2025, Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling, https://arxiv.org/abs/2409.11529
- Sisipho Hamlomo, Marcellin Atemkeng, 28 Oct 2025, Clustering-Based Low-Rank Matrix Approximation for Medical Image Compression, https://arxiv.org/abs/2505.08256
- Jacob L. Block, Sundararajan Srinivasan, Liam Collins, Aryan Mokhtari, Sanjay Shakkottai, 22 Oct 2025, Provable Meta-Learning with Low-Rank Adaptations, https://arxiv.org/abs/2410.22264
- Zhuxuanzi Wang, Mingqiao Mo, Xi Xiao, Chen Liu, Chenrui Ma, Yunbei Zhang, Xiao Wang, Smita Krishnaswamy, Tianyang Wang, 11 Oct 2025, CTR-LoRA: Curvature-Aware and Trust-Region Guided Low-Rank Adaptation for Large Language Models, https://arxiv.org/abs/2510.15962
- Yutong Wang, Haiyu Wang, Sai Qian Zhang, 18 Oct 2025, QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models, https://arxiv.org/abs/2510.16292
- Rui Pan, Yang Luo, Yuxing Liu, Yang You, Tong Zhang, 20 Oct 2025, Unbiased Gradient Low-Rank Projection, https://arxiv.org/abs/2510.17802
- Aniello Panariello, Daniel Marczak, Simone Magistri, Angelo Porrello, Bartłomiej Twardowski, Andrew D. Bagdanov, Simone Calderara, Joost van de Weijer, 20 Oct 2025, Accurate and Efficient Low-Rank Model Merging in Core Space, https://arxiv.org/abs/2509.17786
- Ryan Cory-Wright, Jean Pauphilet, 17 Oct 2025, Improved Approximation Algorithms for Low-Rank Problems Using Semidefinite Optimization, https://arxiv.org/abs/2501.02942
- Jiahao Zhang, Shiheng Zhang, Guang Lin, 19 Sep 2025, Low-Rank Adaptation of Evolutionary Deep Neural Networks for Efficient Learning of Time-Dependent PDEs, https://arxiv.org/abs/2509.16395
- Steffen Schotthöfer, H. Lexie Yang, Stefan Schnake, 21 Sep 2025, Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks, https://arxiv.org/abs/2505.08022
- Yilang Zhang, Bingcong Li, Georgios B. Giannakis, 21 Sep 2025, RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models, https://arxiv.org/abs/2505.18877
- Steffen Schotthöfer, Timon Klein, Jonas Kusch, 21 Sep 2025, A geometric framework for momentum-based optimizers for low-rank training, https://arxiv.org/abs/2506.17475
- Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Ziqiang Cui, Dugang Liu, Yuhua Li, Xiuqiang He, Ruixuan Li, 27 Oct 2025, Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation, https://arxiv.org/abs/2510.23123
- Haochen Zhang, Junze Yin, Guanchu Wang, Zirui Liu, Lin F. Yang, Tianyi Zhang, Anshumali Shrivastava, Vladimir Braverman, 25 Oct 2025, Breaking the Frozen Subspace: Importance Sampling for Low-Rank Optimization in LLM Pretraining, https://arxiv.org/abs/2502.05790
- Ruijie Zhang, Ziyue Liu, Zhengyang Wang, Zheng Zhang, 24 Oct 2025, LaX: Boosting Low-Rank Training of Foundation Models via Latent Crossing, https://arxiv.org/abs/2505.21732
- Faraz Tahmasebi, Michael Pelluer, Hyoukjun Kwon, 15 Oct 2025, D-com: Accelerating Iterative Processing to Enable Low-rank Decomposition of Activations, https://arxiv.org/abs/2510.13147
- Krishu K Thapa and Reet Barik and Krishna Teja Chitty-Venkata and Murali Emani and Venkatram Vishwanath, 25 Sep 2025, PreLoRA: Hybrid Pre-training of Vision Transformers with Full Training and Low-Rank Adapters, https://arxiv.org/abs/2509.21619
- Yuxuan Zhu, David H. Yang, Mohammad Mohammadi Amiri, Keerthiram Murugesan, Tejaswini Pedapati, Pin-Yu Chen, 25 Sep 2025, OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja's Rule, https://arxiv.org/abs/2509.21623
- Guanzhi Deng, Mingyang Liu, Dapeng Wu, Yinqiao Li, Linqi Song, 26 Sep 2025, Enhancing Low-Rank Adaptation with Structured Nonlinear Transformations, https://arxiv.org/abs/2509.21870
- Ionut-Vlad Modoranu, Mher Safaryan, Erik Schultheis, Max Ryabinin, Artem Chumachenko, Dan Alistarh, 26 Sep 2025, FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models, https://arxiv.org/abs/2505.17967
- Haizhou Shi, Yibin Wang, Ligong Han, Huan Zhang, Hao Wang, 26 Sep 2025, Training-Free Bayesianization for Low-Rank Adapters of Large Language Models, https://arxiv.org/abs/2412.05723
- Peng Tang, Xiaoxiao Yan, Xiaobin Hu, Yuning Cui, Donghao Luo, Jiangning Zhang, Pengcheng Xu, Jinlong Peng, Qingdong He, Feiyue Huang, Song Xue, Tobias Lasser, 21 Oct 2025, ShortcutBreaker: Low-Rank Noisy Bottleneck with Global Perturbation Attention for Multi-Class Unsupervised Anomaly Detection, https://arxiv.org/abs/2510.18342
- Shihao Ji, Zihui Song, 19 Oct 2025, L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts, https://arxiv.org/abs/2510.17898
- Enes Yavuz Ugan, Ngoc-Quan Pham, Alexander Waibel, 21 Oct 2025, Bayesian Low-Rank Factorization for Robust Model Adaptation, https://arxiv.org/abs/2510.18723
- Niclas Pokel, Pehu\'en Moure, Roman Boehringer, Shih-Chii Liu, Yingqiang Gao, 23 Sep 2025, Variational Low-Rank Adaptation for Personalized Impaired Speech Recognition, https://arxiv.org/abs/2509.20397
- Tristan S.W. Stevens, Oisín Nolan, Jean-Luc Robert, Ruud J.G. van Sloun, 25 Sep 2025, Nuclear Diffusion Models for Low-Rank Background Suppression in Videos, https://arxiv.org/abs/2509.20886
- Ashkan Shahbazi, Chayne Thrash, Yikun Bai, Keaton Hamm, Navid NaderiAlizadeh, Soheil Kolouri, 27 Sep 2025, LOTFormer: Doubly-Stochastic Linear Attention via Low-Rank Optimal Transport, https://arxiv.org/abs/2509.23436
- Jiang-Xin Shi, Wen-Da Wei, Jin-Fei Qi, Xuanyu Chen, Tong Wei, Yu-Feng Li, 27 Sep 2025, Memory-Efficient Fine-Tuning via Low-Rank Activation Compression, https://arxiv.org/abs/2509.23472
- David González Martínez, 29 Sep 2025, BALF: Budgeted Activation-Aware Low-Rank Factorization for Fine-Tuning-Free Model Compression, https://arxiv.org/abs/2509.25136
- Zelin Liu, Sicheng Dong, Bocheng Li, Yixuan Yang, Jiacheng Ruan, Chenxu Zhou, Suncheng Xiang, 29 Sep 2025, BALR-SAM: Boundary-Aware Low-Rank Adaptation of SAM for Resource-Efficient Medical Image Segmentation, https://arxiv.org/abs/2509.24204
- Ruigang Wang, Krishnamurthy Dvijotham, Ian R. Manchester, 29 Sep 2025, Norm-Bounded Low-Rank Adaptation, https://arxiv.org/abs/2501.19050
- Zihuan Qiu, Yi Xu, Chiyuan He, Fanman Meng, Linfeng Xu, Qingbo Wu, Hongliang Li, 29 Sep 2025, MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging, https://arxiv.org/abs/2505.11883
- Xin Yu, Yujia Wang, Jinghui Chen, Lingzhou Xue, 27 Sep 2025, AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections, https://arxiv.org/abs/2505.12455
- Jian Liang, Wenke Huang, Xianda Guo, Guancheng Wan, Bo Du, Mang Ye, 29 Sep 2025, ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation, https://arxiv.org/abs/2505.18640
- Xianglong Yan, Zhiteng Li, Tianao Zhang, Haotong Qin, Linghe Kong, Yulun Zhang and Xiaokang Yang, 27 Sep 2025, ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration, https://arxiv.org/abs/2505.24357
- Anh Truong, Ahmed H. Mahmoud, Mina Konaković Luković, Justin Solomon, 16 Oct 2025, Low-Rank Adaptation of Neural Fields, https://arxiv.org/abs/2504.15933
- Yongfu Xue, 4 Oct 2025, Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation, https://arxiv.org/abs/2510.03731
- Nghiem T. Diep, Dung Le, Tuan Truong, Tan Dinh, Huy Nguyen, Nhat Ho, 5 Oct 2025, HoRA: Cross-Head Low-Rank Adaptation with Joint Hypernetworks, https://arxiv.org/abs/2510.04295
- Nghiem T. Diep, Hien Dang, Tuan Truong, Tan Dinh, Huy Nguyen, Nhat Ho, 5 Oct 2025, DoRAN: Stabilizing Weight-Decomposed Low-Rank Adaptation via Noise Injection and Auxiliary Networks, https://arxiv.org/abs/2510.04331
- Kaito Takanami, Takashi Takahashi, and Yoshiyuki Kabashima, 6 Oct 2025, Learning Linear Regression with Low-Rank Tasks in-Context, https://arxiv.org/abs/2510.04548
- Connall Garrod and Jonathan P. Keating, 5 Oct 2025, The Persistence of Neural Collapse Despite Low-Rank Bias, https://arxiv.org/abs/2410.23169
- Devleena Das, Rajeev Patwari, Ashish Sirasao, 6 Oct 2025, Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation, https://arxiv.org/abs/2510.08600
- Yu-Chen Lu, Chong-Yan Chen, Chi-Chih Chang, Yu-Fang Hu, Kai-Chiang Wu, 10 Oct 2025, FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference, https://arxiv.org/abs/2510.09332
- Xiequn Wang, Zhan Zhuang, Yu Zhang, 24 Oct 2025, PLAN: Proactive Low-Rank Allocation for Continual Learning, https://arxiv.org/abs/2510.21188
- Wei Shen, Zhang Yaxiang, Minhui Huang, Mengfan Xu, Jiawei Zhang, Cong Shen, 12 Oct 2025, MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation, https://arxiv.org/abs/2506.01897
- Jitai Hao, Qiang Huang, Hao Liu, Xinyan Xiao, Zhaochun Ren, Jun Yu, 11 Oct 2025, A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone, https://arxiv.org/abs/2505.12781
- Xiaoshuang Ji, Zhendong Zhao, Xiaoyan Gu, Xiaojun Chen, Xin Zhao and Zeyao Liu, 9 Oct 2025, AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models, https://arxiv.org/abs/2510.08034
- Timon Klein, Piotr Minakowski and Sebastian Sager, 9 Oct 2025, Mitigating Subject Dependency in EEG Decoding with Subject-Specific Low-Rank Adapters, https://arxiv.org/abs/2510.08059
- Debsurya De, Dmitriy Kunisky, 9 Oct 2025, Computational and statistical lower bounds for low-rank estimation under general inhomogeneous noise, https://arxiv.org/abs/2510.08541
- Xiao Han, Zimo Zhao, Wanyu Wang, Maolin Wang, Zitao Liu, Yi Chang, Xiangyu Zhao, 23 Sep 2025, Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning, https://arxiv.org/abs/2509.18942
- Boao Kong, Junzhu Liang, Yuxi Liu, Renjia Deng, Kun Yuan, 23 Sep 2025, CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure, https://arxiv.org/abs/2509.18993
- Yu Chen, Yifei Han, Long Zhang, Yue Du, Bin Li, 23 Sep 2025, TsqLoRA: Towards Sensitivity and Quality Low-Rank Adaptation for Efficient Fine-Tuning, https://arxiv.org/abs/2509.18585
- Evgenia Shustova, Marina Sheshukova, Sergey Samsonov and Evgeny Frolov, 22 Oct 2025, Scalable LinUCB: Low-Rank Design Matrix Updates for Recommenders with Large Action Spaces, https://arxiv.org/abs/2510.19349
- Jin Li, Zhebo Wang, Tianliang Lu, Mohan Li, Wenpeng Xing, Meng Han, 19 Sep 2025, Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation, https://arxiv.org/abs/2509.25204
- Harry Dong, Bilge Acun, Beidi Chen, Yuejie Chi, 30 Sep 2025, Scalable LLM Math Reasoning Acceleration with Low-rank Distillation, https://arxiv.org/abs/2505.07861
- Luke McDermott, Robert W. Heath Jr., Rahul Parhi, 30 Sep 2025, LoLA: Low-Rank Linear Attention With Sparse Caching, https://arxiv.org/abs/2505.23666
- Simon Segert and Nathan Wycoff, 6 Oct 2025, A Probabilistic Basis for Low-Rank Matrix Learning, https://arxiv.org/abs/2510.05447
- Sam Sartor and Pieter Peers, 7 Oct 2025, Teamwork: Collaborative Diffusion with Low-rank Coordination and Adaptation, https://arxiv.org/abs/2510.05532
- Ryan Solgi, Parsa Madinei, Jiayi Tian, Rupak Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang, 7 Oct 2025, Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM, https://arxiv.org/abs/2510.05544
- Munsif Ali, Leonardo Rossi, Massimo Bertozzi, 13 Oct 2025, CoLoR-GAN: Continual Few-Shot Learning with Low-Rank Adaptation in Generative Adversarial Networks, https://arxiv.org/abs/2510.13869
- Mohammadsajad Alipour, Mohammad Mohammadi Amiri, 15 Oct 2025, Towards Reversible Model Merging For Low-rank Weights, https://arxiv.org/abs/2510.14163
More AI Research
Read more about:
- Sparsity
- Magnitude pruning
- Matrix algebra
- Layer pruning
- Token pruning
- Attention head pruning
- Embeddings pruning
- FFN pruning
- Shallow decoder architecture
- Normalization pruning
- Length pruning
- Width pruning
- Channel pruning