Aussie AI

Vector Database Optimizations

  • Last Updated 20 August, 2025
  • by David Spuler, Ph.D.

Survey papers on Vector Databases

Review papers on vector databases:

Research on Vector Databases

Research papers on vector databases:

Vector Database Optimizations

Research papers on vector database optimizations:

  • Dr. Ashish Bamania, Jun 18, 2024, Google’s New Algorithms Just Made Searching Vector Databases Faster Than Ever: A Deep Dive into how Google’s ScaNN and SOAR Search algorithms supercharge the performance of Vector Databases, https://levelup.gitconnected.com/googles-new-algorithms-just-made-searching-vector-databases-faster-than-ever-36073618d078
  • James Jie Pan, Jianguo Wang, Guoliang Li, 21 Oct 2023, Survey of Vector Database Management Systems, https://arxiv.org/abs/2310.14021 https://link.springer.com/article/10.1007/s00778-024-00864-x
  • Michael Shen, Muhammad Umar, Kiwan Maeng, G. Edward Suh, Udit Gupta, 16 Dec 2024, Towards Understanding Systems Trade-offs in Retrieval-Augmented Generation Model Inference, https://arxiv.org/abs/2412.11854
  • Derrick Quinn, Mohammad Nouri, Neel Patel, John Salihu, Alireza Salemi, Sukhan Lee, Hamed Zamani, Mohammad Alian, 14 Dec 2024, Accelerating Retrieval-Augmented Generation, https://arxiv.org/abs/2412.15246 (Speeding up vector databases using either approximate or exact nearest neighbor search.)
  • Harvey Bower, 2024, Debugging RAG Pipelines: Best Practices for High-Performance LLMs, https://www.amazon.com/dp/B0DNWN5RB1
  • Shige Liu, Zhifang Zeng, Li Chen, Adil Ainihaer, Arun Ramasami, Songting Chen, Yu Xu, Mingxi Wu, Jianguo Wang, 20 Jan 2025, TigerVector: Supporting Vector Search in Graph Databases for Advanced RAGs, https://arxiv.org/abs/2501.11216
  • Vasilis Mageirakos, Bowen Wu, Gustavo Alonso, 3 Mar 2025, Cracking Vector Search Indexes, https://arxiv.org/abs/2503.01823
  • Nitish Upreti, Krishnan Sundaram, Hari Sudan Sundar, Samer Boshra, Balachandar Perumalswamy, Shivam Atri, Martin Chisholm, Revti Raman Singh, Greg Yang, Subramanyam Pattipaka, Tamara Hass, Nitesh Dudhey, James Codella, Mark Hildebrand, Magdalen Manohar, Jack Moffitt, Haiyang Xu, Naren Datha, Suryansh Gupta, Ravishankar Krishnaswamy, Prashant Gupta, Abhishek Sahu, Ritika Mor, Santosh Kulkarni, Hemeswari Varada, Sudhanshu Barthwal, Amar Sagare, Dinesh Billa, Zishan Fu, Neil Deshpande, Shaun Cooper, Kevin Pilch, Simon Moreno, Aayush Kataria, Vipul Vishal, Harsha Vardhan Simhadri, 9 May 2025, Cost-Effective, Low Latency Vector Search with Azure Cosmos DB, https://arxiv.org/abs/2505.05885
  • Adel Ammar, Anis Koubaa, Omer Nacar, Wadii Boulila, 13 May 2025, Optimizing Retrieval-Augmented Generation: Analysis of Hyperparameter Impact on Performance and Efficiency, https://arxiv.org/abs/2505.08445
  • Leonardo Kuffo, Peter Boncz, 12 May 2025, Bang for the Buck: Vector Search on Cloud CPUs, https://arxiv.org/abs/2505.07621
  • Yaoqi Chen, Jinkai Zhang, Baotong Lu, Qianxi Zhang, Chengruidong Zhang, Jingjia Luo, Di Liu, Huiqiang Jiang, Qi Chen, Jing Liu, Bailu Ding, Xiao Yan, Jiawei Jiang, Chen Chen, Mingxing Zhang, Yuqing Yang, Fan Yang, Mao Yang, 5 May 2025, RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference, https://arxiv.org/abs/2505.02922
  • Michael Ryaboy, May 16, 2025, The Most Common Vector Search Mistake Is Costing Enterprises Hundreds of Thousands, https://medium.com/kx-systems/the-most-common-vector-search-mistake-is-costing-enterprises-hundreds-of-thousands-dd1ffd0b976d
  • Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri, 28 Apr 2025, MINT: Multi-Vector Search Index Tuning, https://arxiv.org/abs/2504.20018

Vector Database Caching

Research papers on the use of caching to optimize vector databases:

Vector Search Optimizations

  • Chips Ahoy Capital, Jul 02, 2024, Evolution of Databases in the World of AI Apps, https://chipsahoycapital.substack.com/p/evolution-of-databases-in-the-world
  • Chirag Agrawal, Sep 20, 2024, Unlocking the Power of Efficient Vector Search in RAG Applications, https://pub.towardsai.net/unlocking-the-power-of-efficient-vector-search-in-rag-applications-c2e3a0c551d5
  • Pierre-Emmanuel Mazaré, Gergely Szilvasy, Maria Lomeli, Francisco Massa, Naila Murray, Hervé Jégou, Matthijs Douze, 12 Feb 2025, Inference-time sparse attention with asymmetric indexing, https://arxiv.org/abs/2502.08246
  • Nitish Upreti, Krishnan Sundaram, Hari Sudan Sundar, Samer Boshra, Balachandar Perumalswamy, Shivam Atri, Martin Chisholm, Revti Raman Singh, Greg Yang, Subramanyam Pattipaka, Tamara Hass, Nitesh Dudhey, James Codella, Mark Hildebrand, Magdalen Manohar, Jack Moffitt, Haiyang Xu, Naren Datha, Suryansh Gupta, Ravishankar Krishnaswamy, Prashant Gupta, Abhishek Sahu, Ritika Mor, Santosh Kulkarni, Hemeswari Varada, Sudhanshu Barthwal, Amar Sagare, Dinesh Billa, Zishan Fu, Neil Deshpande, Shaun Cooper, Kevin Pilch, Simon Moreno, Aayush Kataria, Vipul Vishal, Harsha Vardhan Simhadri, 9 May 2025, Cost-Effective, Low Latency Vector Search with Azure Cosmos DB, https://arxiv.org/abs/2505.05885
  • Leonardo Kuffo, Peter Boncz, 12 May 2025, Bang for the Buck: Vector Search on Cloud CPUs, https://arxiv.org/abs/2505.07621
  • Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, Vahab Mirrokni, 29 May 2024, MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings, https://arxiv.org/abs/2405.19504 (Multi-vector search optimization.)
  • Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri, 28 Apr 2025, MINT: Multi-Vector Search Index Tuning, https://arxiv.org/abs/2504.20018
