Aussie AI

Reranker Optimizations

  • Last Updated 22 October, 2025
  • by David Spuler, Ph.D.

Research on Reranker Optimizations

Research papers include:

  • Vahe Aslanyan, June 11, 2024, Next-Gen Large Language Models: The Retrieval-Augmented Generation (RAG) Handbook, https://www.freecodecamp.org/news/retrieval-augmented-generation-rag-handbook/
  • Benjamin Clavié, 30 Aug 2024, rerankers: A Lightweight Python Library to Unify Ranking Methods, https://arxiv.org/abs/2408.17344 https://arxiv.org/pdf/2408.17344
  • Vivedha Elango, Sep 2024, Search in the age of AI- Retrieval methods for Beginners, https://ai.gopubby.com/search-in-the-age-of-ai-retrieval-methods-for-beginners-557621e12ded
  • Zhangchi Feng, Dongdong Kuang, Zhongyuan Wang, Zhijie Nie, Yaowei Zheng, Richong Zhang, 15 Oct 2024 (v2), EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations, https://arxiv.org/abs/2410.10315 https://github.com/BUAADreamer/EasyRAG
  • Rama Akkiraju, Anbang Xu, Deepak Bora, Tan Yu, Lu An, Vishal Seth, Aaditya Shukla, Pritam Gundecha, Hridhay Mehta, Ashwin Jha, Prithvi Raj, Abhinav Balasubramanian, Murali Maram, Guru Muthusamy, Shivakesh Reddy Annepally, Sidney Knowles, Min Du, Nick Burnett, Sean Javiya, Ashok Marannan, Mamta Kumari, Surbhi Jha, Ethan Dereszenski, Anupam Chakraborty, Subhash Ranjan, Amina Terfai, Anoop Surya, Tracey Mercer, Vinodh Kumar Thanigachalam, Tamar Bar, Sanjana Krishnan, Samy Kilaru, Jasmine Jaksic, Nave Algarici, Jacob Liberman, Joey Conway, Sonu Nayyar, Justin Boitano, 10 Jul 2024, FACTS About Building Retrieval Augmented Generation-based Chatbots, NVIDIA Research, https://arxiv.org/abs/2407.07858
  • Andrea Matarazzo, Riccardo Torlone, 3 Jan 2025, A Survey on Large Language Models with some Insights on their Capabilities and Limitations, https://arxiv.org/abs/2501.04040 (Broad survey with many LLM topics covered from history to architectures to optimizations.)
  • Y Huang, T Gao, J Zhang, X Liu, G Wang, 2024, Adapting Large Language Models for Biomedicine though Retrieval-Augmented Generation with Documents Scoring, 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024, pages 5770-5775, DOI: 10.1109/BIBM62325.2024.10822725, https://www.computer.org/csdl/proceedings-article/bibm/2024/10822725/23oodpoidfq (Using an LLM-based reranker for medical research documents.)
  • MS Tamber, R Pradeep, J Lin, Jan 2025, LiT and Lean: Distilling Listwise Rerankers into Encoder-Decoder Models, https://cs.uwaterloo.ca/~jimmylin/publications/Tamber_Lin_ECIR2025.pdf
  • Bharani Subramaniam, 13 February 2025, Emerging Patterns in Building GenAI Products, https://martinfowler.com/articles/gen-ai-patterns/
  • Tanay Varshney, Annie Surla, Nave Algarici, Isabel Hulseman and Cherie Wang, Mar 06, 2025, How Using a Reranking Microservice Can Improve Accuracy and Costs of Information Retrieval, https://developer.nvidia.com/blog/how-using-a-reranking-microservice-can-improve-accuracy-and-costs-of-information-retrieval/
  • Ghadir Alselwi, Hao Xue, Shoaib Jameel, Basem Suleiman, Flora D. Salim, Imran Razzak, 19 Mar 2025, Long Context Modeling with Ranked Memory-Augmented Retrieval, https://arxiv.org/abs/2503.14800
  • Jiashuo Sun, Xianrui Zhong, Sizhe Zhou, Jiawei Han, 16 May 2025 (v2), DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation, https://arxiv.org/abs/2505.07233 https://github.com/GasolSun36/DynamicRAG
  • Chaitanya Sharma, 28 May 2025, Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers, https://arxiv.org/abs/2506.00054
  • Andrew Brown, Muhammad Roman, Barry Devereux, 8 Aug 2025, A Systematic Literature Review of Retrieval-Augmented Generation: Techniques, Metrics, and Challenges, https://arxiv.org/abs/2508.06401
  • Latent Space, Aug 20, 2025, "RAG is Dead, Context Engineering is King" — with Jeff Huber of Chroma: What actually matters in vector databases in 2025, why “modern search for AI” is different, and how to ship systems that don’t rot as context grows, https://www.latent.space/p/chroma
  • Zekun Xu, Yudi Zhang, 22 Jul 2025, LLM-Enhanced Reranking for Complementary Product Recommendation, https://arxiv.org/abs/2507.16237
  • Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang, 1 Jul 2024, Searching for Best Practices in Retrieval-Augmented Generation, https://arxiv.org/abs/2407.01219 Project: https://github.com/FudanDNN-NLP/RAG (Attempts to optimize the entire RAG system, including the various options for different RAG modules in the RAG pipeline, such as optimal methods for chunking, retrieval, embedding models, vector databases, prompt compression, reranking, repacking, summarizers, and other components.)
  • Atharva Nijasure, Tanya Chowdhury, James Allan, 10 Aug 2025, How Relevance Emerges: Interpreting LoRA Fine-Tuning in Reranking LLMs, https://arxiv.org/abs/2504.08780
  • Orion Weller, Kathryn Ricci, Eugene Yang, Andrew Yates, Dawn Lawrie, Benjamin Van Durme, 8 Aug 2025, Rank1: Test-Time Compute for Reranking in Information Retrieval, https://arxiv.org/abs/2502.18418
  • Bongsu Kim, 7 Aug 2025, RRRA: Resampling and Reranking through a Retriever Adapter, https://arxiv.org/abs/2508.11670
  • Haotian Chen, Qingqing Long, Meng Xiao, Xiao Luo, Wei Ju, Chengrui Wang, Xuezhi Wang, Yuanchun Zhou, Hengshu Zhu, 12 Aug 2025, SciRerankBench: Benchmarking Rerankers Towards Scientific Retrieval-Augmented Generated LLMs, https://arxiv.org/abs/2508.08742
  • Haike Xu, Tong Chen, 8 Sep 2025, Beyond Sequential Reranking: Reranker-Guided Search Improves Reasoning Intensive Retrieval, https://arxiv.org/abs/2509.07163
  • Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu, 9 Sep 2025, GRADA: Graph-based Reranking against Adversarial Documents Attack, https://arxiv.org/abs/2505.07546
  • Phuong-Nam Dang, Kieu-Linh Nguyen and Thanh-Hieu Pham, 11 Sep 2025, ViRanker: A BGE-M3 & Blockwise Parallel Transformer Cross-Encoder for Vietnamese Reranking, https://arxiv.org/abs/2509.09131
  • Nicholas Pipitone, Ghita Houir Alami, Advaith Avadhanam, Anton Kaminskyi, Ashley Khoo, 16 Sep 2025, zELO: ELO-inspired Training Method for Rerankers and Embedding Models, https://arxiv.org/abs/2509.12541
  • Zihan Wang, Zihan Liang, Zhou Shao, Yufei Ma, Huangyu Dai, Ben Chen, Lingtao Mao, Chenyi Lei, Yuqing Ding, Han Li, 16 Sep 2025, InfoGain-RAG: Boosting Retrieval-Augmented Generation via Document Information Gain-based Reranking and Filtering, https://arxiv.org/abs/2509.12765
