Aussie AI
LLM Memory Architectures
-
Last Updated 29 August, 2025
-
by David Spuler, Ph.D.
What are LLM Memory Architectures?
LLM memory is the use of additional information storage so that AI engines can retain facts across interactions. For example, if you interact repeatedly with a model, you want it to remember your name. By default, LLMs lack this kind of persistent memory: they are stateless architectures.
This page is about allowing AIs to have memory for facts. A separate page examines the other kind of "memory" (the hardware chips inside the computers that run AI): memory optimizations for LLM backend coding.
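Because the model itself is stateless, long-term memory is typically bolted on outside the model: facts are stored between sessions and injected back into the prompt on each call. Below is a minimal sketch of that pattern; all names here are illustrative (real systems such as Mem0 or LangMem add retrieval, summarization, and persistent storage):

```python
class SimpleMemoryStore:
    """Keyed fact store that survives across chat turns.

    The LLM retains nothing between calls, so anything it should
    "remember" must be stored externally and re-supplied as context.
    """

    def __init__(self):
        self.facts = {}  # e.g. {"user_name": "Alice"}

    def remember(self, key, value):
        self.facts[key] = value

    def recall_all(self):
        # Render stored facts as text to prepend to the prompt.
        return "\n".join(f"{k}: {v}" for k, v in self.facts.items())


def build_prompt(memory, user_message):
    """Inject long-term memory into each (stateless) model call."""
    context = memory.recall_all()
    return f"Known facts:\n{context}\n\nUser: {user_message}"


# Usage: facts learned in one session are replayed in the next.
memory = SimpleMemoryStore()
memory.remember("user_name", "Alice")
prompt = build_prompt(memory, "What's my name?")
```

This "prompt stuffing" approach is the simplest memory architecture; the research below covers more sophisticated designs, such as memory layers inside the model, vector-database retrieval, and hierarchical or recursive summarization.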
Research on LLM Memory Architectures
Research papers include:
- Shenggang Li, Jul 30, 2024, Mem0: Is This the Future of AI Memory Management? https://ai.gopubby.com/mem0-is-this-the-future-of-ai-memory-management-1e228dc8220a
- Aurimas Griciūnas, Oct 30, 2024, Memory in Agent Systems, https://www.newsletter.swirlai.com/p/memory-in-agent-systems
- Zihong He, Weizhe Lin, Hao Zheng, Fan Zhang, Matt Jones, Laurence Aitchison, Xuhai Xu, Miao Liu, Per Ola Kristensson, Junxiao Shen, 1 Nov 2024, Human-inspired Perspectives: A Survey on AI Long-term Memory, https://arxiv.org/abs/2411.00489
- Debmalya Biswas, Dec 2024, Long-term Memory for AI Agents: Why Vector Databases are not sufficient for Memory Management of Agentic AI Systems? https://ai.gopubby.com/long-term-memory-for-agentic-ai-systems-4ae9b37c6c0f
- Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Sun, Luke Zettlemoyer, Gargi Gosh, Wen-tau Yih, 24 Dec 2024, Improving Factuality with Explicit Working Memory, https://arxiv.org/abs/2412.18069
- Ben Dickson, December 13, 2024, New LLM optimization technique slashes memory costs up to 75%, https://venturebeat.com/ai/new-llm-optimization-technique-slashes-memory-costs-up-to-75/
- Edoardo Cetin, Qi Sun, Tianyu Zhao, Yujin Tang, 6 Dec 2024 (v3), An Evolved Universal Transformer Memory, https://arxiv.org/abs/2410.13166
- Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang, 21 Nov 2024 (v2), Disentangling Memory and Reasoning Ability in Large Language Models, https://arxiv.org/abs/2411.13504 https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning
- Alhassan Mumuni, Fuseini Mumuni, 6 Jan 2025, Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches, https://arxiv.org/abs/2501.03151
- Andrea Matarazzo, Riccardo Torlone, 3 Jan 2025, A Survey on Large Language Models with some Insights on their Capabilities and Limitations, https://arxiv.org/abs/2501.04040 (Broad survey with many LLM topics covered from history to architectures to optimizations.)
- Ben Dickson, January 16, 2025, Google’s new neural-net LLM architecture separates memory components to control exploding costs of capacity and compute, https://venturebeat.com/ai/googles-new-neural-net-architecture-separates-memory-components-to-control-exploding-costs/
- Mohamed A. Taha, 14 Jan 2025, Logarithmic Memory Networks (LMNs): Efficient Long-Range Sequence Modeling for Resource-Constrained Environments, https://arxiv.org/abs/2501.07905
- Ali Behrouz, Peilin Zhong, Vahab Mirrokni, 31 Dec 2024, Titans: Learning to Memorize at Test Time, https://arxiv.org/abs/2501.00663
- Tong Xiao, Jingbo Zhu, 16 Jan 2025, Foundations of Large Language Models, https://arxiv.org/abs/2501.09223 (Huge 230 page paper on many topics such as training, prompting, alignment, and long context.)
- Sergey Legtchenko, Ioan Stefanovici, Richard Black, Antony Rowstron, Junyi Liu, Paolo Costa, Burcu Canakci, Dushyanth Narayanan, Xingbo Wu, 16 Jan 2025, Managed-Retention Memory: A New Class of Memory for the AI Era, https://arxiv.org/abs/2501.09605
- Dr. Ashish Bamania, Jan 2025, Memory Layers Are Supercharging LLMs Like Never Before, https://levelup.gitconnected.com/memory-layers-are-supercharging-llms-like-never-before-056b99ea75cd
- Vincent-Pierre Berges, Barlas Oğuz, Daniel Haziza, Wen-tau Yih, Luke Zettlemoyer, Gargi Ghosh, 20 Dec 2024 (v2), Memory Layers at Scale, https://arxiv.org/abs/2412.09764 https://github.com/facebookresearch/memory
- Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou, 16 Dec 2019 (v2), Large Memory Layers with Product Keys, https://arxiv.org/abs/1907.05242
- Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus, 24 Nov 2015 (v5), End-To-End Memory Networks, https://arxiv.org/abs/1503.08895 (Early paper as precursor to memory layers.)
- Paul Sawers, January 23, 2025, Meta’s Yann LeCun predicts a ‘new AI architectures paradigm’ within 5 years and ‘decade of robotics’, https://techcrunch.com/2025/01/23/metas-yann-lecun-predicts-a-new-ai-architectures-paradigm-within-5-years-and-decade-of-robotics/
- Haomiao Xiong, Zongxin Yang, Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Jiawen Zhu, Huchuan Lu, 23 Jan 2025, Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge, https://arxiv.org/abs/2501.13468 https://github.com/hmxiong/StreamChat
- Libo Wang, 24 Jan 2025, Wormhole Memory: A Rubik's Cube for Cross-Dialogue Retrieval, https://arxiv.org/abs/2501.14846
- Avinash Patil, 5 Feb 2025, Advancing Reasoning in Large Language Models: Promising Methods and Approaches, https://arxiv.org/abs/2502.03671
- Wujiang Xu, Zujie Liang, Kai Mei, Hang Gao, Juntao Tan, Yongfeng Zhang, 17 Feb 2025, A-MEM: Agentic Memory for LLM Agents, https://arxiv.org/abs/2502.12110 https://github.com/WujiangXu/AgenticMemory
- Xiaoran Liu, Ruixiao Li, Mianqiu Huang, Zhigeng Liu, Yuerong Song, Qipeng Guo, Siyang He, Qiqi Wang, Linlin Li, Qun Liu, Yaqian Zhou, Xuanjing Huang, Xipeng Qiu, 24 Feb 2025, Thus Spake Long-Context Large Language Model, https://arxiv.org/abs/2502.17129 (Impressive survey of many techniques to improve efficiency and accuracy of long context processing in both inference and training, covering text, video and multimodal models.)
- Emilia David, March 5, 2025, Enhancing AI agents with long-term memory: Insights into LangMem SDK, Memobase and the A-MEM Framework, https://venturebeat.com/ai/enhancing-ai-agents-with-long-term-memory-insights-into-langmem-sdk-memobase-and-the-a-mem-framework/
- Asif Razzaq, March 8, 2025, Meet Manus: A New AI Agent from China with Deep Research + Operator + Computer Use + Lovable + Memory, https://www.marktechpost.com/2025/03/08/meet-manus-a-new-ai-agent-from-china-with-deep-research-operator-computer-use-lovable-memory/
- Mingyue Cheng, Yucong Luo, Jie Ouyang, Qi Liu, Huijie Liu, Li Li, Shuo Yu, Bohou Zhang, Jiawei Cao, Jie Ma, Daoyu Wang, Enhong Chen, 17 Mar 2025 (v2), A Survey on Knowledge-Oriented Retrieval-Augmented Generation, https://arxiv.org/abs/2503.10677
- Character.AI, May 19, 2025, Helping Characters Remember What Matters Most, https://blog.character.ai/helping-characters-remember-what-matters-most/
- Nir Diamant, Jun 29, 2025, Memory Optimization Strategies in AI Agents: Building Smarter Agents That Learn and Remember, https://diamantai.substack.com/p/memory-optimization-strategies-in
- Zeyu Zhang, Quanyu Dai, Xu Chen, Rui Li, Zhongyang Li, Zhenhua Dong, 4 May 2025, MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents, https://arxiv.org/abs/2505.02099 https://github.com/nuster1128/MemEngine
- Yiming Du, Wenyu Huang, Danna Zheng, Zhaowei Wang, Sebastien Montella, Mirella Lapata, Kam-Fai Wong, Jeff Z. Pan, 27 May 2025 (v2), Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions, https://arxiv.org/abs/2505.00675 https://github.com/Elvin-Yiming-Du/Survey_Memory_in_AI
- Yaxiong Wu, Sheng Liang, Chen Zhang, Yichao Wang, Yongyue Zhang, Huifeng Guo, Ruiming Tang, Yong Liu, 23 Apr 2025 (v2), From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs, https://arxiv.org/abs/2504.15965
- Ningning Wang, Xavier Hu, Pai Liu, He Zhu, Yue Hou, Heyuan Huang, Shengyu Zhang, Jian Yang, Jiaheng Liu, Ge Zhang, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou, 24 Jul 2025, Efficient Agents: Building Effective Agents While Reducing Cost, https://arxiv.org/pdf/2508.02694 https://github.com/OPPO-PersonalAI/OAgents
- Emilia David, August 13, 2025, Google adds limited chat personalization to Gemini, trails Anthropic and OpenAI in memory features, https://venturebeat.com/ai/google-adds-limited-chat-personalization-to-gemini-trails-anthropic-and-openai-in-memory-features/
- Nathan Lambert, Aug 15, 2025, Contra Dwarkesh on Continual Learning: Don't try to make your airplane too much like a bird, https://www.interconnects.ai/p/contra-dwarkesh-on-continual-learning
- MacKenzie Sigalos, Aug 19, 2025, Sam Altman on GPT-6: ‘People want memory’, https://www.cnbc.com/2025/08/19/sam-altman-on-gpt-6-people-want-memory.html
- Parsa Omidi, Xingshuai Huang, Axel Laborieux, Bahareh Nikpour, Tianyu Shi, and Armaghan Eshaghi, 14 Aug 2025, Memory-Augmented Transformers: A Systematic Review from Neuroscience Principles to Technical Solutions, https://arxiv.org/abs/2508.10824
- Daniel Szelogowski, 29 Jul 2025, Hebbian Memory-Augmented Recurrent Networks: Engram Neurons in Deep Learning, https://arxiv.org/abs/2507.21474
- Yi Kong, Dianxi Shi, Guoli Yang, Zhang ke-di, Chenlin Huang, Xiaopeng Li, Songchang Jin, 29 Jul 2025, MapAgent: Trajectory-Constructed Memory-Augmented Planning for Mobile Task Automation, https://arxiv.org/abs/2507.21953
- Leyi Ouyang, 2 Aug 2025, Can Memory-Augmented LLM Agents Aid Journalism in Interpreting and Framing News for Diverse Audiences?, https://arxiv.org/abs/2507.21055
- Yongyi Wang, Lingfeng Li, Bozhou Chen, Ang Li, Hanyu Liu, Qirui Zheng, Xionghui Yang, Wenxin Li, 6 Aug 2025, Synthetic POMDPs to Challenge Memory-Augmented RL: Memory Demand Structure Modeling, https://arxiv.org/abs/2508.04282
- Dongwook Choi, Taeyoon Kwon, Dongil Yang, Hyojun Kim, Jinyoung Yeo, 12 Aug 2025, Designing Memory-Augmented AR Agents for Spatiotemporal Reasoning in Personalized Task Assistance, https://arxiv.org/abs/2508.08774
- Gaoke Zhang, Bo Wang, Yunlong Ma, Dongming Zhao, Zifei Yu, 21 Aug 2025, Multiple Memory Systems for Enhancing the Long-term Memory of Agent, https://arxiv.org/abs/2508.15294
- Haoran Sun, Shaoning Zeng, 23 Jul 2025, Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents, https://arxiv.org/abs/2507.22925
- Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister, 28 Jul 2025, In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents, https://arxiv.org/abs/2503.08026
- Xiao Wang, Xufeng Lou, Shiao Wang, Ju Huang, Lan Chen, Bo Jiang, 6 Aug 2025, Long-Term Visual Object Tracking with Event Cameras: An Associative Memory Augmented Tracker and A Benchmark Dataset, https://arxiv.org/abs/2403.05839
- Qingyue Wang, Yanhe Fu, Yanan Cao, Shuai Wang, Zhiliang Tian, Liang Ding, 25 Aug 2025, Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models, https://arxiv.org/abs/2308.15022
AI Books from Aussie AI
The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory.
Get your copy from Amazon: The Sweetest Lesson

RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures.
Get your copy from Amazon: RAG Optimization

Generative AI Applications book:
Get your copy from Amazon: Generative AI Applications

Generative AI programming book:
Get your copy from Amazon: Generative AI in C++

CUDA C++ Optimization book:
Get your copy from Amazon: CUDA C++ Optimization

CUDA C++ Debugging book:
Get your copy from Amazon: CUDA C++ Debugging
More AI Research Topics
Read more about:
- 500+ LLM Inference Optimization Techniques
- What's Hot in LLM Inference Optimization in 2025?
- Inference Optimization Research
- « Research Home