Aussie AI

Attention Steering

  • Last Updated 22 October, 2025
  • by David Spuler, Ph.D.

What is Attention Steering?

Attention steering is a method of "steering" or focusing the LLM's attention mechanism onto a particular subset of the input tokens, with the aim of making attention computations both more accurate and faster.
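
As a simple illustration, the sketch below shows one way attention scores can be steered: an additive bias is applied to the pre-softmax attention scores of a chosen subset of token positions, so those tokens receive a larger share of the attention weight. This is a minimal, hypothetical Python/NumPy example; the function and parameter names are illustrative only and not taken from any specific paper listed below.

    # Minimal sketch of attention steering: bias the pre-softmax attention
    # scores so that a chosen subset of key positions receives more weight.
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def steered_attention(Q, K, V, steer_positions, steer_bias=2.0):
        """Scaled dot-product attention with an extra additive bias on the
        scores of the steered key positions (hypothetical parameters)."""
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)             # (num_queries, num_keys)
        scores[:, steer_positions] += steer_bias  # emphasize chosen tokens
        weights = softmax(scores, axis=-1)        # each row sums to 1
        return weights @ V                        # (num_queries, d)

    # Toy usage: 4 query tokens, 6 key/value tokens, steering attention
    # toward key positions 2 and 3 (e.g., an important context span).
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
    out = steered_attention(Q, K, V, steer_positions=[2, 3])
    print(out.shape)  # (4, 8)

Using a large negative bias instead would suppress attention to those positions, which is the other direction in which steering is sometimes applied.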

Research on Attention Steering

Research papers on attention steering:
  • Zhuohan Gu, Jiayi Yao, Kuntai Du, Junchen Jiang, 21 Nov 2024 (v2), LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts, https://arxiv.org/abs/2411.13009
  • Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao, 1 Oct 2024 (v2), Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs, https://arxiv.org/abs/2311.02262 https://github.com/QingruZhang/PASTA
  • Baifeng Shi, Siyu Gai, Trevor Darrell, Xin Wang, 11 Jul 2023 (v2), TOAST: Transfer Learning via Attention Steering, https://arxiv.org/abs/2305.15542 https://github.com/bfshi/TOAST
  • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin, 20 Aug 2024 (v3), PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering, https://arxiv.org/abs/2403.05053 https://github.com/CodeGoat24/PrimeComposer
  • Haoran Wang, Kai Shu, Jan 2025, Make Every Token Count: A Systematic Survey on Decoding Methods for Foundation Models, https://www.researchgate.net/profile/Haoran-Wang-96/publication/387703971_Make_Every_Token_Count_A_Systematic_Survey_on_Decoding_Methods_for_Foundation_Models/links/67784c8ce74ca64e1f49eb15/Make-Every-Token-Count-A-Systematic-Survey-on-Decoding-Methods-for-Foundation-Models.pdf https://github.com/wang2226/Awesome-LLM-Decoding
  • Kyle O'Brien, David Majercak, Xavier Fernandes, Richard Edgar, Jingya Chen, Harsha Nori, Dean Carignan, Eric Horvitz, Forough Poursabzi-Sangdeh, 18 Nov 2024, Steering Language Model Refusal with Sparse Autoencoders, https://arxiv.org/abs/2411.11296
  • Xintong Wang, Jingheng Pan, Longqin Jiang, Liang Ding, Xingshan Li, Chris Biemann, 23 Oct 2024, CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models, https://arxiv.org/abs/2410.17714
  • Neel Nanda, 8th Jul 2024, An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2, https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite
  • Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
  • Hanyu Zhang, Xiting Wang, Chengao Li, Xiang Ao, Qing He, 10 Jan 2025, Controlling Large Language Models Through Concept Activation Vectors, https://arxiv.org/abs/2501.05764 (Training a vector used to control the model on certain attributes.)
  • Qi Sun, Edoardo Cetin, Yujin Tang, 14 Jan 2025 (v2), Transformer2: Self-adaptive LLMs, https://arxiv.org/abs/2501.06252 (Using a vector to fine-tune the model dynamically.)
  • Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak, 16 Jan 2025, Task Vectors in In-Context Learning: Emergence, Formation, and Benefit, https://arxiv.org/abs/2501.09240
  • Dan Zhang, Tao Feng, Lilong Xue, Yuandong Wang, Yuxiao Dong, Jie Tang, 23 Jan 2025, Parameter-Efficient Fine-Tuning for Foundation Models, https://arxiv.org/abs/2501.13787
  • Xinyu Ma, Yifeng Xu, Yang Lin, Tianlong Wang, Xu Chu, Xin Gao, Junfeng Zhao, Yasha Wang, 24 Jan 2025, DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing, https://arxiv.org/abs/2501.14371 https://github.com/ArthurLeoM/DRESS-LLM
  • Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji, 4 Feb 2025 (v2), Internal Activation as the Polar Star for Steering Unsafe LLM Behavior, https://arxiv.org/abs/2502.01042
  • Daniel Beaglehole, Adityanarayanan Radhakrishnan, Enric Boix-Adserà, Mikhail Belkin, 6 Feb 2025, Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers, https://arxiv.org/abs/2502.03708 https://github.com/dmbeaglehole/neural_controllers
  • Nikhil Anand, Dec 20, 2024, Understanding “steering” in LLMs And how simple math can solve global problems. https://ai.gopubby.com/understanding-steering-in-llms-96faf6e0bee7
  • Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra, 16 Feb 2025, Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment, https://arxiv.org/abs/2502.11244
  • Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple, 24 Feb 2025, Representation Engineering for Large-Language Models: Survey and Research Challenges, https://arxiv.org/abs/2502.17601
  • Yingbing Huang, Deming Chen, Abhishek K. Umrawal, 28 Feb 2025, JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector Manipulation, https://arxiv.org/abs/2502.20684
  • Seongheon Park, Xuefeng Du, Min-Hsuan Yeh, Haobo Wang, Yixuan Li, 1 Mar 2025, How to Steer LLM Latents for Hallucination Detection? https://arxiv.org/abs/2503.01917
  • Marco Scialanga, Thibault Laugel, Vincent Grari, Marcin Detyniecki, 3 Mar 2025, SAKE: Steering Activations for Knowledge Editing, https://arxiv.org/abs/2503.01751
  • Kenneth J. K. Ong, Lye Jia Jun, Hieu Minh "Jord" Nguyen, Seong Hah Cho, Natalia Pérez-Campanero Antolín, 17 Mar 2025, Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering, https://arxiv.org/abs/2503.12722
  • Moreno D'Incà, Elia Peruzzo, Xingqian Xu, Humphrey Shi, Nicu Sebe, Massimiliano Mancini, 14 Mar 2025, Safe Vision-Language Models via Unsafe Weights Manipulation, https://arxiv.org/abs/2503.11742
  • Changho Shin, Xinya Yan, Suenggwan Jo, Sungjun Cho, Shourjo Aditya Chaudhuri, Frederic Sala, 25 Mar 2025 (v2), TARDIS: Mitigating Temporal Misalignment via Representation Steering, https://arxiv.org/abs/2503.18693
  • Jingcheng Niu, Xingdi Yuan, Tong Wang, Hamidreza Saghir, Amir H. Abdi, 14 May 2025, Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs, https://arxiv.org/abs/2505.09338
  • Yao Huang, Huanran Chen, Shouwei Ruan, Yichi Zhang, Xingxing Wei, Yinpeng Dong, 28 May 2025, Mitigating Overthinking in Large Reasoning Models via Manifold Steering, https://arxiv.org/abs/2505.22411 https://github.com/Aries-iai/Manifold_Steering
  • Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, Jack Lindsey, 29 Jul 2025, Persona Vectors: Monitoring and Controlling Character Traits in Language Models, https://arxiv.org/abs/2507.21509
  • Zhang, Qingru, May 2025, On the Efficiency and Steerability of Self-Attention Mechanism of Large Language Models, Ph.D. Thesis, Georgia Institute of Technology, https://hdl.handle.net/1853/77839 https://repository.gatech.edu/entities/publication/d14aeab0-0189-42cb-9cbb-36eeb4434dcb (Coverage of efficiency via mixed attention spans, KV cache compression, and attention steering.)
  • Yichen Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu, 5 Jul 2025 (v2), FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering, https://arxiv.org/abs/2504.14492
  • Xinyan Jiang, Lin Zhang, Jiayi Zhang, Qingsong Yang, Guimin Hu, Di Wang, Lijie Hu, 14 Aug 2025, MSRS: Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models, https://arxiv.org/abs/2508.10599
  • Helena Casademunt, Caden Juang, Adam Karvonen, Samuel Marks, Senthooran Rajamanoharan, Neel Nanda, 22 Jul 2025, Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning, https://arxiv.org/abs/2507.16795
  • Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal, 24 Jul 2025, GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs, https://arxiv.org/abs/2507.18043
  • Cheng-Ting Chou, George Liu, Jessica Sun, Cole Blondin, Kevin Zhu, Vasu Sharma, Sean O'Brien, 17 Jul 2025, Causal Language Control in Multilingual Transformers via Sparse Feature Steering, https://arxiv.org/abs/2507.13410
  • Raghav Singhal, Zachary Horvitz, Ryan Teehan, Mengye Ren, Zhou Yu, Kathleen McKeown, Rajesh Ranganath, 18 Jul 2025, A General Framework for Inference-time Scaling and Steering of Diffusion Models, https://arxiv.org/abs/2501.06848
  • Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda, 17 Jul 2025, Understanding Reasoning in Thinking Language Models via Steering Vectors, https://arxiv.org/abs/2506.18167
  • Zhi Zhong, Akira Takahashi, Shuyang Cui, Keisuke Toyama, Shusuke Takahashi, Yuki Mitsufuji, 17 Jul 2025, SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet, https://arxiv.org/abs/2505.16195
  • Simon Kohaut, Felix Divo, Navid Hamid, Benedict Flade, Julian Eggert, Devendra Singh Dhami, Kristian Kersting, 21 Jul 2025, The Constitutional Controller: Doubt-Calibrated Steering of Compliant Agents, https://arxiv.org/abs/2507.15478
  • Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina, 21 Jul 2025, Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models, https://arxiv.org/abs/2502.15639
  • Taewook Kim, Dhruv Agarwal, Jordan Ackerman, Manaswi Saha, 10 Aug 2025, Steering AI-Driven Personalization of Scientific Text for General Audiences, https://arxiv.org/abs/2411.09969
  • Haiyan Zhao, Xuansheng Wu, Fan Yang, Bo Shen, Ninghao Liu, Mengnan Du, 29 Jul 2025, Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering, https://arxiv.org/abs/2505.15038
  • Sunghyun Park, Seokeon Choi, Hyoungwoo Park, Sungrack Yun, 1 Aug 2025, Steering Guidance for Personalized Text-to-Image Diffusion Models, https://arxiv.org/abs/2508.00319
  • Tianxin Xie, Shan Yang, Chenxing Li, Dong Yu, Li Liu, 5 Aug 2025, EmoSteer-TTS: Fine-Grained and Training-Free Emotion-Controllable Text-to-Speech via Activation Steering, https://arxiv.org/abs/2508.03543
  • Renmiao Chen, Shiyao Cui, Xuancheng Huang, Chengwei Pan, Victor Shea-Jay Huang, QingLin Zhang, Xuan Ouyang, Zhexin Zhang, Hongning Wang, and Minlie Huang, 7 Aug 2025, JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering, https://arxiv.org/abs/2508.05087
  • Haruto Nakashima, Siddhartha Ganguly, Kenji Kashima, 8 Aug 2025, Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance, https://arxiv.org/abs/2508.06052
  • Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, Jacob Andreas, 8 Aug 2025, Self-Steering Language Models, https://arxiv.org/abs/2504.07081
  • Shivam Dubey, 12 Aug 2025, Activation Steering for Bias Mitigation: An Interpretable Approach to Safer LLMs, https://arxiv.org/abs/2508.09019
  • Mansi Phute (Georgia Tech), Ravikumar Balakrishnan (HiddenLayer), 11 Aug 2025, VISOR: Visual Input-based Steering for Output Redirection in Vision-Language Models, https://arxiv.org/abs/2508.08521
  • Afrozah Nadeem, Mark Dras, Usman Naseem, 12 Aug 2025, Steering Towards Fairness: Mitigating Political Bias in LLMs, https://arxiv.org/abs/2508.08846
  • Pegah Khayatan, Mustafa Shukor, Jayneel Parekh, Arnaud Dapogny, Matthieu Cord, 13 Aug 2025, Analyzing Finetuning Representation Shift for Multimodal LLMs Steering, https://arxiv.org/abs/2501.03012
  • Jacob Dunefsky, Arman Cohan, 12 Aug 2025, One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs, https://arxiv.org/abs/2502.18862
  • Zara Siddique, Irtaza Khalid, Liam D. Turner, Luis Espinosa-Anke, 13 Aug 2025, Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs, https://arxiv.org/abs/2503.05371
  • Jayneel Parekh, Pegah Khayatan, Mustafa Shukor, Arnaud Dapogny, Alasdair Newson, Matthieu Cord, 18 Aug 2025, Learning to Steer: Input-dependent Steering for Multimodal LLMs, https://arxiv.org/abs/2508.12815
  • Seonglae Cho, Zekun Wu, Adriano Koshiyama, 18 Aug 2025, CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection, https://arxiv.org/abs/2508.12535
  • Guillermo Sarasa Durán, Ana Granados Fontecha, Francisco de Borja Rodríguez Ortíz, 20 Aug 2025, Context Steering: A New Paradigm for Compression-based Embeddings by Synthesizing Relevant Information Features, https://arxiv.org/abs/2508.14780
  • Yizhi Wang, Degang Xu, Yongfang Xie, Shuzhong Tan, Xianan Zhou, and Peng Chen, 22 Aug 2025, Hierarchical Decision-Making for Autonomous Navigation: Integrating Deep Reinforcement Learning and Fuzzy Logic in Four-Wheel Independent Steering and Driving Systems, https://arxiv.org/abs/2508.16574
  • Jinwei Gan, Zifeng Cheng, Zhiwei Jiang, Cong Wang, Yafeng Yin, Xiang Luo, Yuchen Fu, Qing Gu, 25 Aug 2025, Steering When Necessary: Flexible Steering Large Language Models with Backtracking, https://arxiv.org/abs/2508.17621
  • Hanjiang Hu, Alexander Robey, Changliu Liu, 25 Aug 2025, Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks, https://arxiv.org/abs/2503.00187
  • Rushi Wang, Jiateng Liu, Cheng Qian, Yifan Shen, Yanzhou Pan, Zhaozhuo Xu, Ahmed Abbasi, Heng Ji, Denghui Zhang, 2 Sep 2025, Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts, https://arxiv.org/abs/2509.04500
  • Asrin Efe Yorulmaz, Raj Kiriti Velicheti, Melih Bastopcu, Tamer Başar, 29 Aug 2025, A Soft Inducement Framework for Incentive-Aided Steering of No-Regret Players, https://arxiv.org/abs/2508.21672
  • Konstantin Mark, Leonard Galustian, Maximilian P.-P. Kovar, Esther Heid, 1 Sep 2025, Feynman-Kac-Flow: Inference Steering of Conditional Flow Matching to an Energy-Tilted Posterior, https://arxiv.org/abs/2509.01543
  • Sihao Wu, Gaojie Jin, Wei Huang, Jianhong Wang, Xiaowei Huang, 30 Aug 2025, Activation Steering Meets Preference Optimization: Defense Against Jailbreaks in Vision Language Models, https://arxiv.org/abs/2509.00373
  • Bear Häon, Kaylene Stocking, Ian Chuang, and Claire Tomlin, 30 Aug 2025, Mechanistic interpretability for steering vision-language-action models, https://arxiv.org/abs/2509.00328
  • Diego Di Carlo (RIKEN AIP), Koyama Shoichi (UTokyo), Nugraha Aditya Arie (RIKEN AIP), Fontaine Mathieu (LTCI, S2A), Bando Yoshiaki (AIST), Yoshii Kazuyoshi (RIKEN AIP), 20 Aug 2025, Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening, https://arxiv.org/abs/2509.02571
  • Viacheslav Sinii, Nikita Balagansky, Yaroslav Aksenov, Vadim Kurochkin, Daniil Laptev, Gleb Gerasimov, Alexey Gorbatovski, Boris Shaposhnikov, Daniil Gavrilov, 8 Sep 2025, Small Vectors, Big Effects: A Mechanistic Study of RL-Induced Reasoning via Steering Vectors, https://arxiv.org/abs/2509.06608
  • Viacheslav Sinii, Alexey Gorbatovski, Artem Cherepanov, Boris Shaposhnikov, Nikita Balagansky, Daniil Gavrilov, 8 Sep 2025, Steering LLM Reasoning Through Bias-Only Adaptation, https://arxiv.org/abs/2505.18706
  • Long-Kai Huang, Rongyi Zhu, Bing He, Jianhua Yao, 12 Sep 2025, Steering Protein Language Models, https://arxiv.org/abs/2509.07983
  • Mohsen Fayyaz, Ali Modarressi, Hanieh Deilamsalehy, Franck Dernoncourt, Ryan Rossi, Trung Bui, Hinrich Schütze, Nanyun Peng, 11 Sep 2025, Steering MoE LLMs via Expert (De)Activation, https://arxiv.org/abs/2509.09660
  • Mohit Sharma, Amit Jayant Deshpande, Chiranjib Bhattacharyya, Rajiv Ratn Shah, 19 Sep 2025, On Optimal Steering to Achieve Exact Fairness, https://arxiv.org/abs/2509.15759
  • Caitlin Cisar, Emily Sheffield, Joshua Drake, Alden Harrell, Subramanian Chidambaram, Nikita Nangia, Vinayak Arannil, Alex Williams, 18 Sep 2025, PILOT: Steering Synthetic Data Generation with Psychological & Linguistic Output Targeting, https://arxiv.org/abs/2509.15447
  • Narmeen Oozeer, Luke Marks, Fazl Barez, Amirali Abdullah, 19 Sep 2025, Beyond Linear Steering: Unified Multi-Attribute Control for Language Models, https://arxiv.org/abs/2505.24535
  • Jeremias Ferrao, Matthijs van der Lende, Ilija Lichkovski, Clement Neo, 16 Sep 2025, The Anatomy of Alignment: Decomposing Preference Optimization by Steering Sparse Features, https://arxiv.org/abs/2509.12934
  • Ziwen Xu, Shuxun Wang, Kewei Xu, Haoming Xu, Mengru Wang, Xinle Deng, Yunzhi Yao, Guozhou Zheng, Huajun Chen, Ningyu Zhang, 14 Sep 2025, EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models, https://arxiv.org/abs/2504.15133
  • Zhenglin Hua, Jinghan He, Zijun Yao, Tianxu Han, Haiyun Guo, Yuheng Jia, Junfeng Fang, 15 Sep 2025, Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation, https://arxiv.org/abs/2505.16146
  • Neale Ratzlaff, Matthew Lyle Olson, Musashi Hinck, Estelle Aflalo, Shao-Yen Tseng, Vasudev Lal, Phillip Howard, 18 Sep 2025, Debias your Large Multi-Modal Model at Test-Time via Non-Contrastive Visual Attribute Steering, https://arxiv.org/abs/2411.12590
  • Vincent Siu, Nicholas Crispino, David Park, Nathan W. Henry, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang, 16 Sep 2025, SteeringControl: Holistic Evaluation of Alignment Steering in LLMs, https://arxiv.org/abs/2509.13450

More Attention Research Topics

Related LLM research areas for long-context optimization of attention methods include:

Other topics in attention research:

AI Books from Aussie AI



The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson



RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures:
  • Smarter RAG
  • Faster RAG
  • Cheaper RAG
  • Agentic RAG
  • RAG reasoning

Get your copy from Amazon: RAG Optimization



Generative AI Applications book:
  • Deciding on your AI project
  • Planning for success and safety
  • Designs and LLM architectures
  • Expediting development
  • Implementation and deployment

Get your copy from Amazon: Generative AI Applications



Generative AI in C++: Generative AI programming book:
  • Generative AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++



CUDA C++ Optimization book:
  • Faster CUDA C++ kernels
  • Optimization tools & techniques
  • Compute optimization
  • Memory optimization

Get your copy from Amazon: CUDA C++ Optimization



CUDA C++ Debugging book:
  • Debugging CUDA C++ kernels
  • Tools & techniques
  • Self-testing & reliability
  • Common GPU kernel bugs

Get your copy from Amazon: CUDA C++ Debugging

More AI Research

Read more about: