Aussie AI
Attention Steering
-
Last Updated 27 August, 2025
-
by David Spuler, Ph.D.
What is Attention Steering?
Attention steering is a method of "steering," or focusing, the LLM's attention mechanism onto a particular subset of the input tokens. The aim is both more accurate and faster attention computation, since the model concentrates on the tokens that matter most.
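As a rough illustration of the idea, one simple form of attention steering (in the spirit of post-hoc methods like PASTA, listed below) reweights the attention distribution so that a chosen subset of token positions receives most of the attention mass. This is a minimal single-head sketch with toy sizes; the function name, the downweighting factor `alpha`, and all dimensions are illustrative assumptions, not taken from any particular paper.

```python
import numpy as np

def steered_attention(Q, K, V, steer_idx, alpha=0.01):
    """Compute attention, then downweight all key positions NOT in
    steer_idx, which redistributes attention mass onto the steered
    subset of tokens."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # (n, n) raw attention scores
    # numerically stable softmax over key positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # scale down the non-steered columns, then renormalize each row
    mask = np.full(K.shape[0], alpha)
    mask[steer_idx] = 1.0
    steered = weights * mask
    steered /= steered.sum(axis=-1, keepdims=True)
    return steered @ V                            # (n, d) steered output

# toy example: 6 tokens, head dimension 8
rng = np.random.default_rng(0)
n, d = 6, 8
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = steered_attention(Q, K, V, steer_idx=[1, 2])
```

With `alpha=0` this collapses attention entirely onto the steered positions; in practice a small nonzero `alpha` softly emphasizes them while preserving some attention to the rest of the context.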
Research on Attention Steering
Research papers on attention steering:
- Zhuohan Gu, Jiayi Yao, Kuntai Du, Junchen Jiang, 21 Nov 2024 (v2), LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts, https://arxiv.org/abs/2411.13009
- Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao, 1 Oct 2024 (v2), Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs, https://arxiv.org/abs/2311.02262 https://github.com/QingruZhang/PASTA
- Baifeng Shi, Siyu Gai, Trevor Darrell, Xin Wang, 11 Jul 2023 (v2), TOAST: Transfer Learning via Attention Steering, https://arxiv.org/abs/2305.15542 https://github.com/bfshi/TOAST
- Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin, 20 Aug 2024 (v3), PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering, https://arxiv.org/abs/2403.05053 https://github.com/CodeGoat24/PrimeComposer
- Haoran Wang, Kai Shu, Jan 2025, Make Every Token Count: A Systematic Survey on Decoding Methods for Foundation Models, https://www.researchgate.net/profile/Haoran-Wang-96/publication/387703971_Make_Every_Token_Count_A_Systematic_Survey_on_Decoding_Methods_for_Foundation_Models/links/67784c8ce74ca64e1f49eb15/Make-Every-Token-Count-A-Systematic-Survey-on-Decoding-Methods-for-Foundation-Models.pdf https://github.com/wang2226/Awesome-LLM-Decoding
- Kyle O'Brien, David Majercak, Xavier Fernandes, Richard Edgar, Jingya Chen, Harsha Nori, Dean Carignan, Eric Horvitz, Forough Poursabzi-Sangdeh, 18 Nov 2024, Steering Language Model Refusal with Sparse Autoencoders, https://arxiv.org/abs/2411.11296
- Xintong Wang, Jingheng Pan, Longqin Jiang, Liang Ding, Xingshan Li, Chris Biemann, 23 Oct 2024, CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models, https://arxiv.org/abs/2410.17714
- Neel Nanda, 8th Jul 2024, An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2, https://www.alignmentforum.org/posts/NfFST5Mio7BCAQHPA/an-extremely-opinionated-annotated-list-of-my-favourite
- Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang, 5 Jan 2025, Test-time Computing: from System-1 Thinking to System-2 Thinking, https://arxiv.org/abs/2501.02497
- Hanyu Zhang, Xiting Wang, Chengao Li, Xiang Ao, Qing He, 10 Jan 2025, Controlling Large Language Models Through Concept Activation Vectors, https://arxiv.org/abs/2501.05764 (Training a vector used to control the model on certain attributes.)
- Qi Sun, Edoardo Cetin, Yujin Tang, 14 Jan 2025 (v2), Transformer2: Self-adaptive LLMs, https://arxiv.org/abs/2501.06252 (Using a vector to fine-tune the model dynamically.)
- Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak, 16 Jan 2025, Task Vectors in In-Context Learning: Emergence, Formation, and Benefit, https://arxiv.org/abs/2501.09240
- Dan Zhang, Tao Feng, Lilong Xue, Yuandong Wang, Yuxiao Dong, Jie Tang, 23 Jan 2025, Parameter-Efficient Fine-Tuning for Foundation Models, https://arxiv.org/abs/2501.13787
- Xinyu Ma, Yifeng Xu, Yang Lin, Tianlong Wang, Xu Chu, Xin Gao, Junfeng Zhao, Yasha Wang, 24 Jan 2025, DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing, https://arxiv.org/abs/2501.14371 https://github.com/ArthurLeoM/DRESS-LLM
- Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji, 4 Feb 2025 (v2), Internal Activation as the Polar Star for Steering Unsafe LLM Behavior, https://arxiv.org/abs/2502.01042
- Daniel Beaglehole, Adityanarayanan Radhakrishnan, Enric Boix-Adserà, Mikhail Belkin, 6 Feb 2025, Aggregate and conquer: detecting and steering LLM concepts by combining nonlinear predictors over multiple layers, https://arxiv.org/abs/2502.03708 https://github.com/dmbeaglehole/neural_controllers
- Nikhil Anand, Dec 20, 2024, Understanding “steering” in LLMs And how simple math can solve global problems. https://ai.gopubby.com/understanding-steering-in-llms-96faf6e0bee7
- Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra, 16 Feb 2025, Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment, https://arxiv.org/abs/2502.11244
- Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple, 24 Feb 2025, Representation Engineering for Large-Language Models: Survey and Research Challenges, https://arxiv.org/abs/2502.17601
- Yingbing Huang, Deming Chen, Abhishek K. Umrawal, 28 Feb 2025, JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector Manipulation, https://arxiv.org/abs/2502.20684
- Seongheon Park, Xuefeng Du, Min-Hsuan Yeh, Haobo Wang, Yixuan Li, 1 Mar 2025, How to Steer LLM Latents for Hallucination Detection? https://arxiv.org/abs/2503.01917
- Marco Scialanga, Thibault Laugel, Vincent Grari, Marcin Detyniecki, 3 Mar 2025, SAKE: Steering Activations for Knowledge Editing, https://arxiv.org/abs/2503.01751
- Kenneth J. K. Ong, Lye Jia Jun, Hieu Minh "Jord" Nguyen, Seong Hah Cho, Natalia Pérez-Campanero Antolín, 17 Mar 2025, Identifying Cooperative Personalities in Multi-agent Contexts through Personality Steering with Representation Engineering, https://arxiv.org/abs/2503.12722
- Moreno D'Incà, Elia Peruzzo, Xingqian Xu, Humphrey Shi, Nicu Sebe, Massimiliano Mancini, 14 Mar 2025, Safe Vision-Language Models via Unsafe Weights Manipulation, https://arxiv.org/abs/2503.11742
- Changho Shin, Xinya Yan, Suenggwan Jo, Sungjun Cho, Shourjo Aditya Chaudhuri, Frederic Sala, 25 Mar 2025 (v2), TARDIS: Mitigating Temporal Misalignment via Representation Steering, https://arxiv.org/abs/2503.18693
- Jingcheng Niu, Xingdi Yuan, Tong Wang, Hamidreza Saghir, Amir H. Abdi, 14 May 2025, Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs, https://arxiv.org/abs/2505.09338
- Yao Huang, Huanran Chen, Shouwei Ruan, Yichi Zhang, Xingxing Wei, Yinpeng Dong, 28 May 2025, Mitigating Overthinking in Large Reasoning Models via Manifold Steering, https://arxiv.org/abs/2505.22411 https://github.com/Aries-iai/Manifold_Steering
- Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, Jack Lindsey, 29 Jul 2025, Persona Vectors: Monitoring and Controlling Character Traits in Language Models, https://arxiv.org/abs/2507.21509
- Zhang, Qingru, May 2025, On the Efficiency and Steerability of Self-Attention Mechanism of Large Language Models, Ph.D. Thesis, Georgia Institute of Technology, https://hdl.handle.net/1853/77839 https://repository.gatech.edu/entities/publication/d14aeab0-0189-42cb-9cbb-36eeb4434dcb (Coverage of efficiency via mixed attention spans and KV cache compression, and of attention steering.)
- Yichen Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu, 5 Jul 2025 (v2), FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering, https://arxiv.org/abs/2504.14492
- Xinyan Jiang, Lin Zhang, Jiayi Zhang, Qingsong Yang, Guimin Hu, Di Wang, Lijie Hu, 14 Aug 2025, MSRS: Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models, https://arxiv.org/abs/2508.10599
- Helena Casademunt, Caden Juang, Adam Karvonen, Samuel Marks, Senthooran Rajamanoharan, Neel Nanda, 22 Jul 2025, Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning, https://arxiv.org/abs/2507.16795
- Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal, 24 Jul 2025, GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs, https://arxiv.org/abs/2507.18043
- Cheng-Ting Chou, George Liu, Jessica Sun, Cole Blondin, Kevin Zhu, Vasu Sharma, Sean O'Brien, 17 Jul 2025, Causal Language Control in Multilingual Transformers via Sparse Feature Steering, https://arxiv.org/abs/2507.13410
- Raghav Singhal, Zachary Horvitz, Ryan Teehan, Mengye Ren, Zhou Yu, Kathleen McKeown, Rajesh Ranganath, 18 Jul 2025, A General Framework for Inference-time Scaling and Steering of Diffusion Models, https://arxiv.org/abs/2501.06848
- Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda, 17 Jul 2025, Understanding Reasoning in Thinking Language Models via Steering Vectors, https://arxiv.org/abs/2506.18167
- Zhi Zhong, Akira Takahashi, Shuyang Cui, Keisuke Toyama, Shusuke Takahashi, Yuki Mitsufuji, 17 Jul 2025, SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet, https://arxiv.org/abs/2505.16195
- Simon Kohaut, Felix Divo, Navid Hamid, Benedict Flade, Julian Eggert, Devendra Singh Dhami, Kristian Kersting, 21 Jul 2025, The Constitutional Controller: Doubt-Calibrated Steering of Compliant Agents, https://arxiv.org/abs/2507.15478
- Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina, 21 Jul 2025, Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models, https://arxiv.org/abs/2502.15639
- Taewook Kim, Dhruv Agarwal, Jordan Ackerman, Manaswi Saha, 10 Aug 2025, Steering AI-Driven Personalization of Scientific Text for General Audiences, https://arxiv.org/abs/2411.09969
- Haiyan Zhao, Xuansheng Wu, Fan Yang, Bo Shen, Ninghao Liu, Mengnan Du, 29 Jul 2025, Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering, https://arxiv.org/abs/2505.15038
- Sunghyun Park, Seokeon Choi, Hyoungwoo Park, Sungrack Yun, 1 Aug 2025, Steering Guidance for Personalized Text-to-Image Diffusion Models, https://arxiv.org/abs/2508.00319
- Tianxin Xie, Shan Yang, Chenxing Li, Dong Yu, Li Liu, 5 Aug 2025, EmoSteer-TTS: Fine-Grained and Training-Free Emotion-Controllable Text-to-Speech via Activation Steering, https://arxiv.org/abs/2508.03543
- Renmiao Chen, Shiyao Cui, Xuancheng Huang, Chengwei Pan, Victor Shea-Jay Huang, QingLin Zhang, Xuan Ouyang, Zhexin Zhang, Hongning Wang, and Minlie Huang, 7 Aug 2025, JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering, https://arxiv.org/abs/2508.05087
- Haruto Nakashima, Siddhartha Ganguly, Kenji Kashima, 8 Aug 2025, Data-Driven Density Steering via the Gromov-Wasserstein Optimal Transport Distance, https://arxiv.org/abs/2508.06052
- Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, Jacob Andreas, 8 Aug 2025, Self-Steering Language Models, https://arxiv.org/abs/2504.07081
- Shivam Dubey, 12 Aug 2025, Activation Steering for Bias Mitigation: An Interpretable Approach to Safer LLMs, https://arxiv.org/abs/2508.09019
- Mansi Phute (Georgia Tech), Ravikumar Balakrishnan (HiddenLayer), 11 Aug 2025, VISOR: Visual Input-based Steering for Output Redirection in Vision-Language Models, https://arxiv.org/abs/2508.08521
- Afrozah Nadeem, Mark Dras, Usman Naseem, 12 Aug 2025, Steering Towards Fairness: Mitigating Political Bias in LLMs, https://arxiv.org/abs/2508.08846
- Pegah Khayatan, Mustafa Shukor, Jayneel Parekh, Arnaud Dapogny, Matthieu Cord, 13 Aug 2025, Analyzing Finetuning Representation Shift for Multimodal LLMs Steering, https://arxiv.org/abs/2501.03012
- Jacob Dunefsky, Arman Cohan, 12 Aug 2025, One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs, https://arxiv.org/abs/2502.18862
- Zara Siddique, Irtaza Khalid, Liam D. Turner, Luis Espinosa-Anke, 13 Aug 2025, Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs, https://arxiv.org/abs/2503.05371
- Jayneel Parekh, Pegah Khayatan, Mustafa Shukor, Arnaud Dapogny, Alasdair Newson, Matthieu Cord, 18 Aug 2025, Learning to Steer: Input-dependent Steering for Multimodal LLMs, https://arxiv.org/abs/2508.12815
- Seonglae Cho, Zekun Wu, Adriano Koshiyama, 18 Aug 2025, CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection, https://arxiv.org/abs/2508.12535
- Guillermo Sarasa Durán, Ana Granados Fontecha, Francisco de Borja Rodríguez Ortíz, 20 Aug 2025, Context Steering: A New Paradigm for Compression-based Embeddings by Synthesizing Relevant Information Features, https://arxiv.org/abs/2508.14780
- Yizhi Wang, Degang Xu, Yongfang Xie, Shuzhong Tan, Xianan Zhou, Peng Chen, 22 Aug 2025, Hierarchical Decision-Making for Autonomous Navigation: Integrating Deep Reinforcement Learning and Fuzzy Logic in Four-Wheel Independent Steering and Driving Systems, https://arxiv.org/abs/2508.16574
- Jinwei Gan, Zifeng Cheng, Zhiwei Jiang, Cong Wang, Yafeng Yin, Xiang Luo, Yuchen Fu, Qing Gu, 25 Aug 2025, Steering When Necessary: Flexible Steering Large Language Models with Backtracking, https://arxiv.org/abs/2508.17621
- Hanjiang Hu, Alexander Robey, Changliu Liu, 25 Aug 2025, Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks, https://arxiv.org/abs/2503.00187
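Many of the papers above steer a model not through its attention scores but through its hidden activations, by adding a learned "steering vector" to a layer's residual stream at inference time. The following is a hedged, minimal sketch of that idea with toy sizes; the function name, the scale, and the way the vector is obtained are illustrative assumptions (in practice the vector is often a difference of mean activations between contrasting prompt sets).

```python
import numpy as np

def apply_steering(hidden, v, scale=4.0):
    """Add a scaled steering direction to every token's hidden state,
    nudging the representation toward a target attribute or behavior."""
    v_unit = v / np.linalg.norm(v)       # normalize the steering direction
    return hidden + scale * v_unit       # same shift applied at each position

# toy example: hidden states for 5 tokens, model width 16
rng = np.random.default_rng(1)
hidden = rng.standard_normal((5, 16))
v = rng.standard_normal(16)              # placeholder for a learned direction
steered = apply_steering(hidden, v)
```

The `scale` hyperparameter controls steering strength: too small and the behavior barely changes, too large and generation quality degrades, which is why several of the papers above study adaptive or input-dependent scaling.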
More Attention Research Topics
Related LLM research areas for long context optimization of the attention methods include:
- Attention optimization (main page)
- Local attention
- Linear attention
- Sparse attention
- Multi-Head Attention (MHA)
- Multi-Query Attention (MQA)
- Group-Query Attention (GQA)
- Flash attention
- Paged attention
Other topics in attention research:
- Low-rank matrix attention
- Medusa attention
- Block attention
- Cross attention
- Fused head attention
- Hybrid local-global attention
- FFT attention
- QKV computation optimizations
- Additive attention
- Multiplicative attention
- Graph attention
- Chunked attention
- Attention sink
- Attention steering
- Bilinear attention
- Attention-free methods
- Mixture-of-Heads (MoH) Attention (MoE + MHA)
- Star attention
- Ring attention
AI Books from Aussie AI
- The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory. Get your copy from Amazon: The Sweetest Lesson
- RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures. Get your copy from Amazon: RAG Optimization
- Generative AI Applications book. Get your copy from Amazon: Generative AI Applications
- Generative AI programming book. Get your copy from Amazon: Generative AI in C++
- CUDA C++ Optimization book. Get your copy from Amazon: CUDA C++ Optimization
- CUDA C++ Debugging book. Get your copy from Amazon: CUDA C++ Debugging
More AI Research
Read more about: