Aussie AI

Parameter-Efficient Fine-Tuning (PEFT)

  • Last Updated 17 November, 2025
  • by David Spuler, Ph.D.

Parameter-Efficient Fine-Tuning (PEFT) is any fine-tuning method that updates only a small fraction of a model's weights. Instead of updating all of the model's parameters, which is slow and costly, only a subset of parameters is trained. The rest of the model parameters are "frozen" during the fine-tuning procedure.
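As a minimal sketch of the freeze-most, train-few idea (toy NumPy example with invented shapes, not the API of any particular PEFT library):

```python
import numpy as np

# Toy "model": two weight matrices. In PEFT, the pretrained weights are
# frozen and only a small added subset (here, one adapter matrix) is trained.
rng = np.random.default_rng(0)
frozen_weights = rng.normal(size=(4, 4))   # pretrained, never updated
adapter_weights = np.zeros((4, 4))         # small trainable subset

frozen_before = frozen_weights.copy()

# One mock gradient-descent step: only the adapter receives the update.
grad = rng.normal(size=(4, 4))
lr = 0.1
adapter_weights -= lr * grad               # trained
# frozen_weights is deliberately left untouched ("frozen")
```

In a real framework, freezing is typically done by disabling gradient tracking on the base model's parameters so the optimizer only ever sees the small trainable set.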

Various types of PEFT have been examined, such as LoRA, multi-LoRA, QLoRA, and prompt tuning, which are covered in the sections below.

The alternatives to using PEFT to train additional intelligence into a model include full-parameter fine-tuning and in-context techniques such as RAG and few-shot prompting.

LoRA

The idea behind LoRA (Low-Rank Adaptation) is to train "low-rank" matrices, which are much smaller than the full weight matrices and thus far cheaper to fine-tune. These low-rank matrices can be multiplied together to produce a weight delta that is added to the original model's weights.
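The low-rank idea can be sketched as follows (toy NumPy example; the hidden size, rank, and initialization scale are invented for illustration):

```python
import numpy as np

d, r = 1024, 8          # hidden size and LoRA rank (toy values)
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))            # frozen pretrained weight matrix
A = rng.normal(size=(r, d)) * 0.01     # trainable low-rank factor
B = np.zeros((d, r))                   # initialized to zero, so the delta starts at 0

# Effective weight: the low-rank product is added to the frozen base.
W_effective = W + B @ A

full_params = d * d                    # parameters in the full matrix
lora_params = d * r + r * d            # parameters in the two LoRA factors
```

With these toy sizes the LoRA factors hold 64x fewer parameters than the full matrix, which is the source of the training-cost savings; initializing B to zero means fine-tuning starts from exactly the pretrained model's behavior.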

Multi-LoRA

The use of multiple LoRA adapters got a boost when Apple chose this method for its Apple Intelligence platform. Several other platforms use multi-LoRA as an efficiency gain for both training and inference.
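The multi-LoRA pattern can be sketched as one shared frozen base model plus a dictionary of small per-task adapters (toy NumPy example; task names and shapes are invented):

```python
import numpy as np

d, r = 64, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d))                     # one shared frozen base model

# Several small task-specific adapters share the same base weights.
adapters = {
    task: (rng.normal(size=(d, r)), rng.normal(size=(r, d)))
    for task in ("summarize", "translate", "classify")
}

def forward(x, task):
    """Apply the base model plus the selected task's LoRA delta."""
    B, A = adapters[task]
    return W @ x + B @ (A @ x)

x = rng.normal(size=d)
outputs = {task: forward(x, task) for task in adapters}
```

The efficiency gain comes from keeping only one copy of the large base weights in memory while each task adds just the two tiny adapter matrices.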

Research papers on multi-LoRA include many of those listed under LoRA inference optimizations below.

LoRA Inference Optimizations

The popularity of LoRA as an efficient training method has also spawned research on maximizing its inference efficiency. Loading and unloading LoRA adapters can be quite expensive, and methods of optimizing multi-LoRA serving platforms have been examined in various research papers.
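One basic inference optimization is to merge an adapter's delta directly into the base weights, so serving requires only a single matmul with zero per-token adapter overhead, and to unmerge it again when swapping adapters. A minimal sketch (toy NumPy example, not any particular serving system's implementation):

```python
import numpy as np

d, r = 64, 4
rng = np.random.default_rng(1)
W = rng.normal(size=(d, d))                       # base weights
B, A = rng.normal(size=(d, r)), rng.normal(size=(r, d))

W_orig = W.copy()

# Merge: fold the adapter into the base weight in place, so inference
# is one matmul with no separate adapter computation.
W += B @ A
x = rng.normal(size=d)
y_merged = W @ x

# Unmerge: subtract the same delta to restore the base weights,
# e.g. before swapping in a different adapter.
W -= B @ A
```

Systems such as Punica and S-LoRA (listed below) instead batch many adapters' computations together, which avoids merge/unmerge churn when serving thousands of concurrent adapters.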

Research papers on LoRA and multi-LoRA inference optimization include:

  • Lequn Chen, Zihao Ye, Yongji Wu, Danyang Zhuo, Luis Ceze, Arvind Krishnamurthy, 28 Oct 2023, Punica: Multi-Tenant LoRA Serving https://arxiv.org/abs/2310.18547 Code: https://github.com/punica-ai/punica
  • Jingwei Xu, Junyu Lai, Yunpeng Huang, 24 May 2024 (v2), MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models, https://arxiv.org/abs/2405.13053
  • Tuna Han Salih Meral, Enis Simsar, Federico Tombari, Pinar Yanardag, 28 Mar 2024, CLoRA: A Contrastive Approach to Compose Multiple LoRA Models, https://arxiv.org/abs/2403.19776
  • Suyi Li, Hanfeng Lu, Tianyuan Wu, Minchen Yu, Qizhen Weng, Xusheng Chen, Yizhou Shan, Binhang Yuan, Wei Wang, 20 Jan 2024, CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference, https://arxiv.org/abs/2401.11240 (Multi-LoRA inference where it starts running prefill computations in the CPU while loading the LoRA weights into the GPU.)
  • Rui Kong, Qiyang Li, Xinyu Fang, Qingtian Feng, Qingfeng He, Yazhu Dong, Weijun Wang, Yuanchun Li, Linghe Kong, Yunxin Liu, 28 May 2024, LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design, https://arxiv.org/abs/2405.17741
  • Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica, 5 Jun 2024 (v3), S-LoRA: Serving Thousands of Concurrent LoRA Adapters, https://arxiv.org/abs/2311.03285 Code: https://github.com/S-LoRA/S-LoRA
  • Chen, Lequn, 2024, Multi-tenant Machine Learning Model Serving Systems on GPU Clusters, PhD Thesis, University of Washington, https://digital.lib.washington.edu/researchworks/items/13e14599-b4ee-4fbb-86bf-e58a4118d0f9
  • Bingyang Wu, Ruidong Zhu, and Zili Zhang, Peng Sun, Shanghai AI Lab; Xuanzhe Liu, Xin Jin, 2024, dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving, https://www.usenix.org/conference/osdi24/presentation/wu-bingyang
  • Jing Liu, Ruihao Gong, Mingyang Zhang, Yefei He, Jianfei Cai, Bohan Zhuang, 13 Jun 2024, ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models, https://arxiv.org/abs/2406.09041 (How to load multiple experts for MoE in a memory-efficient way using mixed-precision quantization based on identifying the few salient channels that need higher precision, as an alternative to multi-LoRA.)
  • Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao 12 Aug 2024 (v3), A Survey on LoRA of Large Language Models, https://arxiv.org/abs/2407.11046 https://github.com/ZJU-LLMs/Awesome-LoRAs.git
  • Yuxuan Zhang, Ruizhe Li, 2 Oct 2024, DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models, https://arxiv.org/abs/2410.01497 https://github.com/MeCuping/DLP-LoRA (Merging multiple LoRA adapters for parallel inference.)
  • Liang Mi, Weijun Wang, Wenming Tu, Qingfeng He, Rui Kong, Xinyu Fang, Yazhu Dong, Yikang Zhang, Yunchun Li, Meng Li, Haipeng Dai, Guihai Chen, Yunxin Liu, 1 Nov 2024, V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM, https://arxiv.org/abs/2411.00915
  • Nikoleta Iliakopoulou, Jovan Stojkovic, Chloe Alverti, Tianyin Xu, Hubertus Franke, Josep Torrellas, 24 Nov 2024, Chameleon: Adaptive Caching and Scheduling for Many-Adapter LLM Inference Environments, https://arxiv.org/abs/2411.17741
  • Jiaxuan Chen. 2024. Comparative Analysis and Optimization of LoRA Adapter Co-serving for Large Language Models. In Proceedings of the 25th International Middleware Conference: Demos, Posters and Doctoral Symposium (Middleware '24). Association for Computing Machinery, New York, NY, USA, 27–28. https://doi.org/10.1145/3704440.3704777 https://dl.acm.org/doi/abs/10.1145/3704440.3704777 https://dl.acm.org/doi/pdf/10.1145/3704440.3704777 (Serving multiple LoRA adapters while maintaining a single backbone LLM model in memory.)
  • https://arxiv.org/abs/2505.14468
  • Ranran Zhen, Juntao Li, Yixin Ji, Zhenlin Yang, Tong Liu, Qingrong Xia, Xinyu Duan, Zhefeng Wang, Baoxing Huai, Min Zhang, 28 Apr 2025, Taming the Titans: A Survey of Efficient LLM Inference Serving, https://arxiv.org/abs/2504.19720 (Survey of various inference and serving optimizations, such as parallelism, offloading, scheduling, length prediction, KV cache compression, and prefill-decode phase disaggregation.)

QLoRA

QLoRA is quantized LoRA. Quantization is now fairly standard in LoRA fine-tuning, to the point that many research papers no longer use the term "QLoRA." For example, Apple Intelligence uses QLoRA in its multi-LoRA architecture with 4-bit quantization.
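The core QLoRA idea can be sketched as quantizing the frozen base weights to 4 bits while keeping the trainable LoRA factors in higher precision (toy NumPy example using simple per-tensor absmax quantization; real QLoRA uses the NF4 data type with per-block scales):

```python
import numpy as np

d, r = 64, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d)).astype(np.float32)

# 4-bit absmax quantization of the frozen base weights.
scale = np.abs(W).max() / 7.0
W_q = np.clip(np.round(W / scale), -8, 7).astype(np.int8)   # values fit in 4 bits

# LoRA factors stay in higher precision and are trained as usual.
B = np.zeros((d, r), dtype=np.float32)
A = rng.normal(size=(r, d)).astype(np.float32) * 0.01

def forward(x):
    # Dequantize the base on the fly, then add the high-precision LoRA delta.
    return (W_q.astype(np.float32) * scale) @ x + B @ (A @ x)

x = rng.normal(size=d).astype(np.float32)
y = forward(x)
max_err = np.abs(W - W_q.astype(np.float32) * scale).max()
```

Storing the frozen base at 4 bits cuts its memory by roughly 4x versus FP16, while gradients only ever flow through the small full-precision LoRA matrices.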

Research papers on QLoRA include:

Prompt Tuning (Extended Vocabulary PEFT)

Prompt tuning is a PEFT variant that works along the sequence dimension: it creates new tokens to extend the vocabulary, rather than retraining the parameters for existing tokens. Since these tokens are new, their embeddings start out untrained and obviously aren't frozen; the weights for the original tokens, which make up most of the model, remain frozen. As an example, this type of PEFT can be useful when extending the LLM via fine-tuning on a specially curated data set, so as to create particular "trigger tokens" that launch integrated tools or perform other advanced capabilities. For example, new "tool launch" tokens can be added to the vocabulary, with fine-tuning applied only to those tokens.
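The soft-prompt variant of this idea can be sketched as a frozen embedding table plus a handful of new trainable embedding vectors prepended to every input (toy NumPy example; vocabulary size, embedding width, and prompt length are invented):

```python
import numpy as np

vocab, d, n_soft = 100, 16, 4
rng = np.random.default_rng(0)

embeddings = rng.normal(size=(vocab, d))            # frozen token embedding table
soft_prompt = rng.normal(size=(n_soft, d)) * 0.01   # the only trainable weights

def embed(token_ids):
    """Prepend the trainable soft-prompt vectors to the frozen token embeddings."""
    return np.vstack([soft_prompt, embeddings[token_ids]])

seq = embed([5, 17, 42])   # 3 real tokens -> sequence of length n_soft + 3
```

During fine-tuning, gradients update only `soft_prompt`; the original vocabulary's embeddings and the rest of the model are untouched, which is why prompt tuning trains so few parameters.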

  • IBM, 2024, What is prompt-tuning?, https://research.ibm.com/blog/what-is-ai-prompt-tuning
  • Abhinav Jain, Swarat Chaudhuri, Thomas Reps, Chris Jermaine, 24 May 2024, Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation, https://arxiv.org/abs/2405.15282
  • MohammadAli SadraeiJavaeri, Ehsaneddin Asgari, Alice Carolyn McHardy, Hamid Reza Rabiee, 7 Jun 2024, SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings, https://arxiv.org/abs/2406.05279
  • Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella, 5 Jun 2024, Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need, https://arxiv.org/abs/2406.03216
  • Xuyang Wu, Zhiyuan Peng, Sravanthi Rajanala, Hsin-Tai Wu, Yi Fang, 31 May 2024, Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models, https://arxiv.org/abs/2405.20654
  • Wei Zhu, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie, 7 Jun 2024 (v2), IAPT: Instruction-Aware Prompt Tuning for Large Language Models, https://arxiv.org/abs/2405.18203
  • Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, Qipeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu, 16 Jan 2024, A Survey of Resource-efficient LLM and Multimodal Foundation Models, https://arxiv.org/abs/2401.08092 Project: https://github.com/UbiquitousLearning/Efficient_Foundation_Model_Survey
  • 18 Apr 2024 (v2), The Efficiency Spectrum of Large Language Models: An Algorithmic Survey, Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang, https://arxiv.org/abs/2312.00678
  • M Xu, D Cai, W Yin, S Wang, X Jin, X Liu - ACM Computing Surveys, 2024, Resource-efficient Algorithms and Systems of Foundation Models: A Survey, https://dl.acm.org/doi/pdf/10.1145/3706418
  • Brian Lester, Rami Al-Rfou, and Noah Constant. 2021. The Power of Scale for Parameter-Efficient Prompt Tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3045–3059, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics. https://aclanthology.org/2021.emnlp-main.243/
  • Shreyansh Shah, Oct 18, 2023, Prompt Tuning: A Powerful Technique for Adapting LLMs to New Tasks, https://medium.com/@shahshreyansh20/prompt-tuning-a-powerful-technique-for-adapting-llms-to-new-tasks-6d6fd9b83557
  • Data Camp, May 19, 2024, Understanding Prompt Tuning: Enhance Your Language Models with Precision, https://www.datacamp.com/tutorial/understanding-prompt-tuning
  • Sergey Sedov, Sumanth Bharadwaj Hachalli Karanam, Venu Gopal Kadamba, 24 Dec 2024, Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control, https://arxiv.org/abs/2412.18582
  • Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang, 20 Mar 2022 (v3), P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks, Proceedings of the 60th Annual Meeting of the Association of Computational Linguistics, 2022, https://arxiv.org/abs/2110.07602 https://github.com/THUDM/P-tuning-v2 (Extends prompt tuning with extra soft prompt tokens at every layer, not just at the start of the input.)
  • Haowei Zhu, Fangyuan Zhang, Rui Qin, Tianxiang Pan, Junhai Yong, Bin Wang, 24 Dec 2024 (v2), Semantic Hierarchical Prompt Tuning for Parameter-Efficient Fine-Tuning, https://arxiv.org/abs/2412.16956
  • Xiang Lisa Li, Percy Liang, 1 Jan 2021, Prefix-Tuning: Optimizing Continuous Prompts for Generation, https://arxiv.org/abs/2101.00190 (Precursor to prompt tuning.)
  • Andrea Matarazzo, Riccardo Torlone, 3 Jan 2025, A Survey on Large Language Models with some Insights on their Capabilities and Limitations, https://arxiv.org/abs/2501.04040 (Broad survey with many LLM topics covered from history to architectures to optimizations.)
  • Qi Sun, Edoardo Cetin, Yujin Tang, 14 Jan 2025 (v2), Transformer2: Self-adaptive LLMs, https://arxiv.org/abs/2501.06252 (Using a vector to fine-tune dynamically.)
  • Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak, 16 Jan 2025, Task Vectors in In-Context Learning: Emergence, Formation, and Benefit, https://arxiv.org/abs/2501.09240
  • Dan Zhang, Tao Feng, Lilong Xue, Yuandong Wang, Yuxiao Dong, Jie Tang, 23 Jan 2025, Parameter-Efficient Fine-Tuning for Foundation Models, https://arxiv.org/abs/2501.13787
  • X Li, C Jiang, Nov 2024, Optimizing Prompt Engineering Methods for Enhanced Logical Reasoning in Transformer Models, RMEL ’24, November 4–7, 2024, Hangzhou, China, https://www.researchgate.net/profile/Xiaoyan-Li-42/publication/389182048_Optimizing_Prompt_Engineering_Methods_for_Enhanced_Logical_Reasoning_in_Transformer_Models/links/67b82fa9461fb56424e3fc72/Optimizing-Prompt-Engineering-Methods-for-Enhanced-Logical-Reasoning-in-Transformer-Models.pdf https://github.com/xiaoyanLi629/RMELS2024
  • Anushka Tiwari, Sayantan Pal, Rohini K. Srihari, Kaiyi Ji, 19 Jul 2025, Task-Agnostic Continual Prompt Tuning with Gradient-Based Selection and Decoding, https://arxiv.org/abs/2507.14725
  • Lingyun Huang, Jianxu Mao, Junfei Yi, Ziming Tao, Yaonan Wang, 19 Jul 2025, CVPT: Cross Visual Prompt Tuning, https://arxiv.org/abs/2408.14961
  • Ruijun Feng, Hammond Pearce, Pietro Liguori, Yulei Sui, 21 Jul 2025, CGP-Tuning: Structure-Aware Soft Prompt Tuning for Code Vulnerability Detection, https://arxiv.org/abs/2501.04510
  • Jiong Yin, Liang Li, Jiehua Zhang, Yuhan Gao, Chenggang Yan, Xichun Sheng, 29 Jul 2025, Progressive Homeostatic and Plastic Prompt Tuning for Audio-Visual Multi-Task Incremental Learning, https://arxiv.org/abs/2507.21588
  • Fei Zhang, Tianfei Zhou, Jiangchao Yao, Ya Zhang, Ivor W. Tsang, Yanfeng Wang, 1 Aug 2025, Decouple before Align: Visual Disentanglement Enhances Prompt Tuning, https://arxiv.org/abs/2508.00395
  • Haitong Luo, Suhang Wang, Weiyao Zhang, Ruiqi Meng, Xuying Meng, Yujun Zhang, 15 Aug 2025, Generalize across Homophily and Heterophily: Hybrid Spectral Graph Pre-Training and Prompt Tuning, https://arxiv.org/abs/2508.11328
  • Zian Zhai, Sima Qing, Xiaoyang Wang, Wenjie Zhang, 17 Aug 2025, SGPT: Few-Shot Prompt Tuning for Signed Graphs, https://arxiv.org/abs/2412.12155
  • Pi-Wei Chen, Jerry Chun-Wei Lin, Wei-Han Chen, Jia Ji, Zih-Ching Chen, Feng-Hao Yeh, Chao-Chun Chen, 22 Aug 2025, Beyond Human-prompting: Adaptive Prompt Tuning with Semantic Alignment for Anomaly Detection, https://arxiv.org/abs/2508.16157
  • Finn Rietz, Oleg Smirnov, Sara Karimi, Lele Cao, 18 Jul 2025, Prompt-Tuning Bandits: Enabling Few-Shot Generalization for Efficient Multi-Task Offline RL, https://arxiv.org/abs/2502.06358
  • Ivan Zhang, 10 Aug 2025, A Real-Time, Self-Tuning Moderator Framework for Adversarial Prompt Detection, https://arxiv.org/abs/2508.07139
  • Ali Shakeri, Wei Emma Zhang, Amin Beheshti, Weitong Chen, Jian Yang and Lishan Yang, 22 Jul 2025, FedDPG: An Adaptive Yet Efficient Prompt-tuning Approach in Federated Learning Settings, https://arxiv.org/abs/2507.19534
  • Xinxu Wei, Kanhao Zhao, Yong Jiao, Lifang He and Yu Zhang, 3 Aug 2025, A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning for Any Atlas and Disorder, https://arxiv.org/abs/2506.02044
  • Han Gao, Timo Hartmann, Botao Zhong, Kai Lia, Hanbin Luo, 5 Aug 2025, Domain-Specific Fine-Tuning and Prompt-Based Learning: A Comparative Study for developing Natural Language-Based BIM Information Retrieval Systems, https://arxiv.org/abs/2508.05676
  • Ryo Takahashi, Naoki Saito, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama, 30 Aug 2025, Discrete Prompt Tuning via Recursive Utilization of Black-box Multimodal Large Language Model for Personalized Visual Emotion Recognition, https://arxiv.org/abs/2509.04480
  • Tiandi Ye, Wenyan Liu, Kai Yao, Lichun Li, Shangchao Su, Cen Chen, Xiang Li, Shan Yin, Ming Gao, 27 Aug 2025, Towards Instance-wise Personalized Federated Learning via Semi-Implicit Bayesian Prompt Tuning, https://arxiv.org/abs/2508.19621
  • Lijun Sheng, Jian Liang, Zilei Wang, Ran He, 27 Aug 2025, R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning, https://arxiv.org/abs/2504.11195
  • Maxime Meyer, Mario Michelessa, Caroline Chaux, Vincent Y. F. Tan, 30 Aug 2025, Memory Limitations of Prompt Tuning in Transformers, https://arxiv.org/abs/2509.00421
  • Runjia Zeng, Guangyan Sun, Qifan Wang, Tong Geng, Sohail Dianat, Xiaotian Han, Raghuveer Rao, Xueling Zhang, Cheng Han, Lifu Huang, Dongfang Liu, 31 Aug 2025, MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper, https://arxiv.org/abs/2509.00996
  • Yutong Gao, Maoyuan Shao, Xinyang Huang, Chuang Zhu, Lijuan Sun, Yu Weng, Xuan Liu, Guoshun Nan, 31 Aug 2025, Spotlighter: Revisiting Prompt Tuning from a Representative Mining View, https://arxiv.org/abs/2509.00905
  • Ahmad Pouramini, Hesham Faili, 11 Sep 2025, Enhancing Few-Shot Transfer Learning with Optimized Multi-Task Prompt Tuning through Modular Prompt Composition, https://arxiv.org/abs/2408.13227
  • Xu Li and Fan Lyu, 11 Sep 2025, MM-Prompt: Cross-Modal Prompt Tuning for Continual Visual Question Answering, https://arxiv.org/abs/2505.19455
  • Rodrigo M Carrillo-Larco, 16 Sep 2025, LLMs for energy and macronutrients estimation using only text data from 24-hour dietary recalls: a parameter-efficient fine-tuning experiment using a 10-shot prompt, https://arxiv.org/abs/2509.13268
  • Ahmad Pouramini and Hesham Faili, 11 Sep 2025, CrossPT: Exploring Cross-Task Transferability through Multi-Task Prompt Tuning, https://arxiv.org/abs/2509.14253
  • Gustavo Sandoval, Denys Fenchenko and Junyao Chen, 15 Sep 2025, Early Approaches to Adversarial Fine-Tuning for Prompt Injection Defense: A 2022 Study of GPT-3 and Contemporary Models, https://arxiv.org/abs/2509.14271
  • Pittawat Taveekitworachai, Potsawee Manakul, Sarana Nutanong, Kunat Pipatanakul, 10 Sep 2025, Prior Prompt Engineering for Reinforcement Fine-Tuning, https://arxiv.org/abs/2505.14157
  • Finn Rietz, Oleg Smirnov, Sara Karimi, Lele Cao, 1 Oct 2025, Prompt Tuning Decision Transformers with Structured and Scalable Bandits, https://arxiv.org/abs/2502.04979
  • Matteo Fuoli, Weihang Huang, Jeannette Littlemore, Sarah Turner, Ellen Wilding, 1 Oct 2025, Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning, https://arxiv.org/abs/2509.24866
  • Yiyang Liu, James C. Liang, Heng Fan, Wenhao Yang, Yiming Cui, Xiaotian Han, Lifu Huang, Dongfang Liu, Qifan Wang, Cheng Han, 19 Oct 2025, All You Need is One: Capsule Prompt Tuning with a Single Vector, https://arxiv.org/abs/2510.16670
  • Tim Genewein, Li Kevin Wenliang, Jordi Grau-Moya, Anian Ruoss, Laurent Orseau, Marcus Hutter, 17 Oct 2025, Understanding Prompt Tuning and In-Context Learning via Meta-Learning, https://arxiv.org/abs/2505.17010
  • Zesheng Ye, Chengyi Cai, Ruijiang Dong, Jianzhong Qi, Lei Feng, Pin-Yu Chen, Feng Liu, 20 Oct 2025, Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction, https://arxiv.org/abs/2506.04650
  • Jinglong Luo, Zhuo Zhang, Yehong Zhang, Shiyu Liu, Ye Dong, Hui Wang, Yue Yu, Xun Zhou, Zenglin Xu, 26 Sep 2025, SecP-Tuning: Efficient Privacy-Preserving Prompt Tuning for Large Language Models via MPC, https://arxiv.org/abs/2506.15307
  • Hang Hua, Yolo Yunlong Tang, Chenliang Xu, Jiebo Luo, 8 Oct 2025, V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning, https://arxiv.org/abs/2404.12353
  • Nayeong Kim, Seong Joon Oh, Suha Kwak, 28 Sep 2025, GroupCoOp: Group-robust Fine-tuning via Group Prompt Learning, https://arxiv.org/abs/2509.23781
  • Chenxing Wei, Yao Shu, Mingwen Ou, Ying Tiffany He, Fei Richard Yu, 27 Sep 2025, PAFT: Prompt-Agnostic Fine-Tuning, https://arxiv.org/abs/2502.12859
  • Jisu Han, Wonjun Hwang, 17 Oct 2025, D-TPT: Dimensional Entropy Maximization for Calibrating Test-Time Prompt Tuning in Vision-Language Models, https://arxiv.org/abs/2510.09473
  • Chikai Shang, Mengke Li, Yiqun Zhang, Zhen Chen, Jinlin Wu, Fangqing Gu, Yang Lu, Yiu-ming Cheung, 6 Oct 2025, PRO-VPT: Distribution-Adaptive Visual Prompt Tuning via Prompt Relocation, https://arxiv.org/abs/2503.06901
  • Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan, Jing Liu, Nathan Susanj, Zheng Zhang, 24 Oct 2025, SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes, https://arxiv.org/abs/2506.20990
  • Xiaoyu Xue, Yuni Lai, Chenxi Huang, Yulin Zhu, Gaolei Li, Xiaoge Zhang, Kai Zhou, 16 Oct 2025, Stealthy Dual-Trigger Backdoors: Attacking Prompt Tuning in LM-Empowered Graph Foundation Models, https://arxiv.org/abs/2510.14470
  • Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye, 16 Oct 2025, Checkpoint-GCG: Auditing and Attacking Fine-Tuning-Based Prompt Injection Defenses, https://arxiv.org/abs/2505.15738

Research Papers on PEFT

PEFT is a popular technique that receives a lot of research attention.

AI Books from Aussie AI



The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson



RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures:
  • Smarter RAG
  • Faster RAG
  • Cheaper RAG
  • Agentic RAG
  • RAG reasoning

Get your copy from Amazon: RAG Optimization



Generative AI Applications book:
  • Deciding on your AI project
  • Planning for success and safety
  • Designs and LLM architectures
  • Expediting development
  • Implementation and deployment

Get your copy from Amazon: Generative AI Applications



Generative AI in C++: generative AI programming book:
  • Generative AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++



CUDA C++ Optimization book:
  • Faster CUDA C++ kernels
  • Optimization tools & techniques
  • Compute optimization
  • Memory optimization

Get your copy from Amazon: CUDA C++ Optimization



CUDA C++ Debugging book:
  • Debugging CUDA C++ kernels
  • Tools & techniques
  • Self-testing & reliability
  • Common GPU kernel bugs

Get your copy from Amazon: CUDA C++ Debugging
