Aussie AI
Compound AI Architectures
-
Last Updated 30 August, 2025
-
by David Spuler, Ph.D.
Research on Compound AI Architectures
Research papers include:
- Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini, 31 Jul 2024, Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, https://arxiv.org/abs/2407.21787 (Generating multiple answers by repeated inference queries, and then using a verifier to choose the best one, which is shown to greatly increase overall accuracy.)
- Xi Wang, Procheta Sen, Ruizhe Li, Emine Yilmaz, 31 Jul 2024, Adaptive Retrieval-Augmented Generation for Conversational Systems, https://arxiv.org/abs/2407.21712 (Deciding whether or not to include a RAG external data request in the inference of a chatbot in a multi-turn conversation.)
- Matei Zaharia, Omar Khattab, Lingjiao Chen, Jared Quincy Davis, Heather Miller, Chris Potts, James Zou, Michael Carbin, Jonathan Frankle, Naveen Rao, Ali Ghodsi, Feb 18, 2024, The Shift from Models to Compound AI Systems, https://bair.berkeley.edu/blog/2024/02/18/compound-ai-systems/
- Jared Quincy Davis, Boris Hanin, Lingjiao Chen, Peter Bailis, Ion Stoica, Matei Zaharia, 23 Jul 2024, Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design, https://www.arxiv.org/abs/2407.16831
- Sherry Ruan, Tian Zhao, 28 May 2024, JungleGPT: Designing and Optimizing Compound AI Systems for E-Commerce, https://arxiv.org/abs/2407.00038
- Cognine, 2024, Why 2024 is the Year of AI Agents and Compound AI Systems? https://cognine.com/why-2024-is-the-year-of-ai-agents-and-compound-ai-systems/
- Sean Sheng and Sherlock Xu, August 15, 2024, A Guide to Compound AI Systems, https://www.bentoml.com/blog/a-guide-to-compound-ai-systems
- Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng, 6 Jun 2024 (v2), SGLang: Efficient Execution of Structured Language Model Programs, https://arxiv.org/abs/2312.07104 https://github.com/sgl-project/sglang
- An Efficient Network Orchestrator for Distributed Compound Language Model Systems Muhammad Shahir Abdurrahman, Stanford University, Stanford, California, USA, https://www.scs.stanford.edu/24sp-cs244b/projects/An_Efficient_Network_Orchestrator_for_Distributed_Compound_Language_Model_Systems.pdf
- Melissa Malec, June 5, 2024, AI Orchestration Explained: The What, Why & How for 2024, https://hatchworks.com/blog/gen-ai/ai-orchestration/
- Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou, 20 Jul 2024, On the Design and Analysis of LLM-Based Algorithms, https://arxiv.org/abs/2407.14788 https://github.com/modelscope/agentscope/tree/main/examples/paper_llm_based_algorithm
- Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou, 4 Jun 2024 (v2), Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems, https://arxiv.org/abs/2403.02419
- Latent Space, Nov 2024, Why Compound AI + Open Source will beat Closed AI, https://www.latent.space/p/fireworks
- Gohar Irfan Chaudhry, Esha Choukse, Íñigo Goiri, Rodrigo Fonseca, Adam Belay, Ricardo Bianchini, 29 Jan 2025 (v2), Towards Resource-Efficient Compound AI Systems, https://arxiv.org/abs/2501.16634
- Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Matei Zaharia, James Zou, Ion Stoica, 20 Feb 2025, Optimizing Model Selection for Compound AI Systems, https://arxiv.org/abs/2502.14815
- Rajeshkumar Bambhaniya, Abhimanyu ; Wu, Hanjiang ; Subramanian, Suvinay ; Srinivasan, Sudarshan ; Kundu, Souvik ; Yazdanbakhsh, Amir ; Elavazhagan, Midhilesh ; Kumar, Madhu ; Krishna, Tushar, April 2025, Understanding and Optimizing Multi-Stage AI Inference Pipelines, https://ui.adsabs.harvard.edu/abs/2025arXiv250409775R/abstract https://arxiv.org/abs/2504.09775
- OnlyCFO, Apr 29, 2025, Bullish: Vertical & Compound Software: In a world of AI, companies need to be more multi-product and vertical to win, https://www.onlycfo.io/p/bullish-vertical-and-compound-software
- Tomasz Tunguz, Jul 17, 2025, Hidden Technical Debt in AI, https://tomtunguz.com/hidden-technical-debt-in-ai/
- Yang Liu, Bingjie Yan, Tianyuan Zou, Jianqing Zhang, Zixuan Gu, Jianbing Ding, Xidong Wang, Jingyi Li, Xiaozhou Ye, Ye Ouyang, Qiang Yang, Ya-Qin Zhang, 24 Apr 2025, Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks, https://arxiv.org/abs/2504.17421
- Marc Brooker, Aug 2025, LLMs as Parts of Systems, https://brooker.co.za/blog/2025/08/12/llms-as-components.html
- Deepti Raghavan, Keshav Santhanam, Muhammad Shahir Rahman, Nayani Modugula, Luis Gaspar Schroeder, Maximilien Cura, Houjun Liu, Pratiksha Thaker, Philip Levis, Matei Zaharia, 22 Jul 2025, Alto: Orchestrating Distributed Compound AI Systems with Nested Ancestry, https://arxiv.org/abs/2403.04311
- Soheil Radfar, Faezeh Maghsoodifar, Hamed Moftakhari and Hamid Moradkhani, 20 Jul 2025, Integrating Newton's Laws with deep learning for enhanced physics-informed compound flood modelling, https://arxiv.org/abs/2507.15021
- Hongzhi Zhang, Zhonglie Liu, Kun Meng, Jiameng Chen, Jia Wu, Bo Du, Di Lin, Yan Che, Wenbin Hu, 28 Jul 2025, Zero-Shot Learning with Subsequence Reordering Pretraining for Compound-Protein Interaction, https://arxiv.org/abs/2507.20925
- Nguyen Manh Son, Pham Huu Vang, Nguyen Thi Dung, Nguyen Manh Ha. Ta Thi Thao, Tran Thi Thu Thuy, Phan Minh Giang, 13 Aug 2025, In silico study on the cytotoxicity against Hela cancer cells of xanthones bioactive compounds from Garcinia cowa: QSAR based on Graph Deep Learning, Network Pharmacology, and Molecular Docking, https://arxiv.org/abs/2508.10117
AI Books from Aussie AI
![]() |
The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
Get your copy from Amazon: The Sweetest Lesson |
![]() |
RAG Optimization: Accurate and Efficient LLM Applications:
new book on RAG architectures:
Get your copy from Amazon: RAG Optimization |
![]() |
Generative AI Applications book:
Get your copy from Amazon: Generative AI Applications |
![]() |
Generative AI programming book:
Get your copy from Amazon: Generative AI in C++ |
![]() |
CUDA C++ Optimization book:
Get your copy from Amazon: CUDA C++ Optimization |
![]() |
CUDA C++ Debugging book:
Get your copy from Amazon: CUDA C++ Debugging |
More AI Research Topics
Read more about:
- 500+ LLM Inference Optimization Techniques
- What's Hot in LLM Inference Optimization in 2025?
- Inference Optimization Research
- « Research Home