Aussie AI
Function Calling
Last Updated 30 August, 2025
by David Spuler, Ph.D.
Function calling is where an LLM invokes an external module via a function call to retrieve extra data, perform calculations, or trigger an action. It is also known as "tool usage" by the LLM, which is distinct from the reverse meaning, where AI developers use LLM-based tools.
Types of function calling include external integrations for features such as:
- Data retrieval (e.g., searching the internet or a real estate listings database)
- Computation tools (e.g., clocks, calculators, arithmetic)
- Action tools (e.g., the LLM calling out to "agents" that can send an email, make a booking, etc.)
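As a rough illustration of the three tool types above, here is a minimal Python sketch of the application-side dispatcher. It assumes the model emits a JSON object with a "name" field and an "arguments" field; that shape, the tool names (search_listings, add, send_email), and the in-memory data are all hypothetical and not taken from any particular vendor API. No real LLM is called: the JSON strings stand in for model output.

```python
import json

# Hypothetical outbox so the "action" tool has an observable side effect.
OUTBOX = []

def search_listings(city: str, max_price: int) -> list:
    """Data retrieval tool: filter a tiny in-memory stand-in for a database."""
    listings = [
        {"city": "Sydney", "price": 900000},
        {"city": "Sydney", "price": 1500000},
        {"city": "Perth", "price": 650000},
    ]
    return [x for x in listings if x["city"] == city and x["price"] <= max_price]

def add(a: float, b: float) -> float:
    """Computation tool: arithmetic the LLM should not attempt itself."""
    return a + b

def send_email(to: str, subject: str) -> dict:
    """Action tool: a stub that records the request instead of sending mail."""
    OUTBOX.append({"to": to, "subject": subject})
    return {"status": "queued", "to": to}

# Registry mapping tool names (as the model would emit them) to functions.
TOOLS = {
    "search_listings": search_listings,
    "add": add,
    "send_email": send_email,
}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call, run the tool, and return a JSON reply.

    The reply string is what would be appended to the conversation so the
    model can use the result in its next generation step.
    """
    call = json.loads(tool_call_json)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**args)
    except TypeError as exc:
        # Wrong or missing arguments: report back rather than crash.
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})

if __name__ == "__main__":
    # Simulated model output requesting a computation tool.
    print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
    # Simulated model output requesting a data retrieval tool.
    print(dispatch('{"name": "search_listings", '
                   '"arguments": {"city": "Sydney", "max_price": 1000000}}'))
```

Returning errors as JSON rather than raising lets the model see the failure and retry with corrected arguments, which is one common design choice; production systems would typically also validate arguments against a schema before executing anything.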
Research on Function Calling
- Yechen Xu, Xinhao Kong, Tingjun Chen, Danyang Zhuo, 4 Jun 2024 (v2), Conveyor: Efficient Tool-aware LLM Serving with Tool Partial Execution, https://arxiv.org/abs/2406.00059 Code: https://github.com/conveyor-sys/conveyor (Speeding up inference by partially running tools in parallel with the LLM query processing, rather than sequentially after the LLM request, by detecting tool requests deep inside the decoding algorithm and starting them immediately, before the LLM has finished generating the fully decoded output.)
- Pan Lu, 2024, Advancing Mathematical Reasoning with Language Models: A Multimodal and Knowledge-Intensive Perspective, Ph.D. Thesis, Computer Science, University of California, Los Angeles, https://escholarship.org/content/qt678864d8/qt678864d8.pdf
- Junzhi Chen, Juhao Liang, Benyou Wang, 9 May 2024, Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning, https://arxiv.org/abs/2405.05955
- Jonas Wallat, Adam Jatowt, Avishek Anand, March 2024, Temporal Blind Spots in Large Language Models, WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Pages 683–692, https://arxiv.org/abs/2401.12078, https://doi.org/10.1145/3616855.3635818, https://dl.acm.org/doi/abs/10.1145/3616855.3635818
- Nate Kushman, Yoav Artzi, Luke Zettlemoyer, Regina Barzilay, June 2014, Learning to Automatically Solve Algebra Word Problems, P14-1026 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), https://aclanthology.org/P14-1026/ PDF: https://aclanthology.org/P14-1026.pdf
- Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning, https://proceedings.neurips.cc/paper_files/paper/2023/file/4a47dd69242d5af908cdd5d51c971cbf-Paper-Datasets_and_Benchmarks.pdf
- Subhro Roy, Dan Roth, 20 Aug 2016 (v2), Solving General Arithmetic Word Problems, https://arxiv.org/abs/1608.01413
- Subhro Roy, Shyam Upadhyay, Dan Roth, 28 Sep 2016, Equation Parsing: Mapping Sentences to Grounded Equations, https://arxiv.org/abs/1609.08824
- Yan Wang, Xiaojiang Liu, Shuming Shi, September 2017, Deep Neural Solver for Math Word Problems D17-1088, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing Copenhagen, Denmark, https://aclanthology.org/D17-1088/ PDF: https://aclanthology.org/D17-1088.pdf
- reiinakano, November 12, 2019, Teaching a neural network to use a calculator, https://reiinakano.com/2019/11/12/solving-probability.html (Integrate SymPy calculator into the results of a neural network, by looking for the '=' sign.)
- Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan, 6 May 2024, AlphaMath Almost Zero: Process Supervision without Process, https://arxiv.org/abs/2405.03553 https://github.com/MARIO-Math-Reasoning/Super_MARIO
- Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Wenyi Wang, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zongze Xu, Chenglin Wu, 12 Mar 2024 (v3), Data Interpreter: An LLM Agent For Data Science, https://arxiv.org/abs/2402.18679 Code: https://github.com/geekan/MetaGPT
- Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang, 4 Feb 2024 (v2), Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents, https://arxiv.org/abs/2402.00798 Code: https://github.com/agiresearch/Formal-LLM
- Qiusi Zhan, Zhixiang Liang, Zifan Ying, Daniel Kang, 25 Mar 2024 (v2), InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents, https://arxiv.org/abs/2403.02691
- Wenhu Chen, Xueguang Ma, Xinyi Wang, and William W Cohen. Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588, 2022. https://arxiv.org/abs/2211.12588 (Integrate a Python interpreter to execute the code generated by the LLM to answer the query.)
- Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, and Graham Neubig. Pal: Program-aided language models. In International Conference on Machine Learning, pages 10764–10799. PMLR, 2023. https://arxiv.org/abs/2211.10435 Code: http://reasonwithpal.com/ (Python interpreter integrated as a tool for LLMs.)
- Intel, April 2024, Intel® Compiler First to Achieve SYCL* 2020 Conformance, https://www.intel.com/content/www/us/en/developer/articles/technical/compiler-first-full-sycl2020-conformance.html
- Long Hei Matthew Lam, Ehsan Shareghi, 1 Jun 2024, A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters, https://arxiv.org/abs/2406.00284 (Using symbolic solvers with LLMs.)
- M Keber, I Grubišic, A Barešic, A Jovic, 2024, A Review on Neuro-symbolic AI Improvements to Natural Language Processing, https://www.researchgate.net/profile/Alan-Jovic/publication/380911364_A_Review_on_Neuro-symbolic_AI_Improvements_to_Natural_Language_Processing/links/6655c0ec22a7f16b4f51fb2f/A-Review-on-Neuro-symbolic-AI-Improvements-to-Natural-Language-Processing.pdf
- Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen, 21 Feb 2024 (v2), SciAgent: Tool-augmented Language Models for Scientific Reasoning, https://arxiv.org/abs/2402.11451
- Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu, 2023, ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track, https://proceedings.neurips.cc/paper_files/paper/2023/hash/8fd1a81c882cd45f64958da6284f4a3f-Abstract-Conference.html
- Mengkang Hu, Yao Mu, Xinmiao Yu, Mingyu Ding, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, and Ping Luo. 2023a. Tree-planner: Efficient close-loop task planning with large language models. arXiv preprint arXiv:2310.08582. https://arxiv.org/abs/2310.08582
- Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik R Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems. https://arxiv.org/abs/2303.11366
- Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. 2023b. ToolLLM: Facilitating large language models to master 16000+ real-world apis. arXiv preprint arXiv:2307.16789. https://arxiv.org/abs/2307.16789
- Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. 2023c. A survey on large language model based autonomous agents. arXiv preprint arXiv:2308.11432. https://arxiv.org/abs/2308.11432
- Aaron Parisi, Yao Zhao, and Noah Fiedel. Talm: Tool augmented language models. arXiv preprint arXiv:2205.12255, 2022. https://arxiv.org/abs/2205.12255
- Joy He-Yueya, Gabriel Poesia, Rose E. Wang, and Noah D. Goodman. Solving math word problems by combining language models with symbolic solvers. ArXiv, abs/2304.09102, 2023. https://arxiv.org/abs/2304.09102
- Shima Imani, Liang Du, and H. Shrivastava. Mathprompter: Mathematical reasoning using large language models. ArXiv, abs/2303.05398, 2023. https://arxiv.org/abs/2303.05398
- Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, Dacheng Tao, 5 Feb 2024. A Survey on Transformer Compression. https://arxiv.org/abs/2402.05964 (Model compression survey paper with focus on pruning, quantization, knowledge distillation, and efficient architecture design.)
- Simranjit Singh, Andreas Karatzas, Michael Fore, Iraklis Anagnostopoulos, Dimitrios Stamoulis, 7 May 2024, An LLM-Tool Compiler for Fused Parallel Function Calling, https://arxiv.org/abs/2405.17438
- Julian Yip, Apr 2, 2024, Build Autonomous AI Agents with Function Calling: Transform your chatbot into an agent that can interact with external APIs, https://towardsdatascience.com/build-autonomous-ai-agents-with-function-calling-0bb483753975 (Implement agents via models that output a JSON object that describes the API to call and the parameters to send.)
- Adva Nakash Peleg, May 30, 2024, An LLM Journey: From POC to Production, https://medium.com/cyberark-engineering/an-llm-journey-from-poc-to-production-6c5ec6a172fb
- Yu Gu, Yiheng Shu, Hao Yu, Xiao Liu, Yuxiao Dong, Jie Tang, Jayanth Srinivasa, Hugo Latapie, Yu Su, 22 Feb 2024, Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments, https://arxiv.org/abs/2402.14672
- Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan, March 2023, TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs, https://arxiv.org/pdf/2303.16434.pdf
- kipply's blog, 2023-03-30, Transformer Taxonomy (the last lit review), https://kipp.ly/transformer-taxonomy/ (Papers for all the Transformer architectures and milestone papers for the major optimization improvements on them.)
- Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman, 1 Jun 2022 (v3), WebGPT: Browser-assisted question-answering with human feedback, https://arxiv.org/abs/2112.09332
- Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, Percy Liang, 2017, World of Bits: An Open-Domain Platform for Web-Based Agents, Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3135-3144, https://proceedings.mlr.press/v70/shi17a.html
- Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap, 11 Nov 2022 (v2), A data-driven approach for learning to control computers, https://arxiv.org/abs/2202.08137
- Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom, 9 Feb 2023, Toolformer: Language Models Can Teach Themselves to Use Tools, https://arxiv.org/abs/2302.04761
- OpenAI, 2024, Function calling, https://platform.openai.com/docs/guides/function-calling
- Cobus Greyling, June 16, 2023, Practical Examples of OpenAI Function Calling, https://cobusgreyling.medium.com/practical-examples-of-openai-function-calling-a6419dc38775
- University of California, Berkeley, 2024, Berkeley Function-Calling Leaderboard, https://gorilla.cs.berkeley.edu/leaderboard.html https://huggingface.co/datasets/gorilla-llm/Berkeley-Function-Calling-Leaderboard
- Wes Brewer, Ana Gainaru, Frédéric Suter, Feiyi Wang, Murali Emani, Shantenu Jha, 20 Jun 2024, AI-coupled HPC Workflow Applications, Middleware and Performance, (Examines integrations of various workflows into LLMs.) https://arxiv.org/abs/2406.14315
- Aarushi Kansal, Chapter 3: Chains, Tools and Agents Building Generative AI-Powered Apps: A Hands-on Guide for Developers, Apress, https://www.amazon.com/Building-Generative-AI-Powered-Apps-Hands-ebook/dp/B0CTXXP1S4/
- Vishal Rajput, Apr 11, 2024, What’s next for AI: AI agentic workflows? https://medium.com/aiguys/next-for-llms-and-rag-ai-agentic-workflows-1869ba0a6796
- Shishir Patil, May 10, 2024, Teaching Large Language Models to Use Tools at Scale, Ph.D. Thesis, Electrical Engineering and Computer Sciences, University of California, Berkeley, Technical Report No. UCB/EECS-2024-85, http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-85.html https://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-85.pdf
- Xi Wang, Procheta Sen, Ruizhe Li, Emine Yilmaz, 31 Jul 2024, Adaptive Retrieval-Augmented Generation for Conversational Systems, https://arxiv.org/abs/2407.21712 (Deciding whether or not to include a RAG external data request in the inference of a chatbot in a multi-turn conversation.)
- Michael Nuñez, July 18, 2024, Groq’s open-source Llama AI model tops leaderboard, outperforming GPT-4o and Claude in function calling, https://venturebeat.com/ai/groq-open-source-llama-ai-model-tops-leaderboard-outperforming-gpt-4o-and-claude-in-function-calling/
- Thomas Reid, Jul 31, 2024, Ollama’s Latest Update: Tool Use: Everything you need to know about function calling in Ollama https://ai.gopubby.com/ollamas-latest-update-tool-use-7b809e15be5c
- Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang, 8 Aug 2024, ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities, https://arxiv.org/abs/2408.04682 Code: https://github.com/apple/ToolSandbox
- Reyna Abhyankar, Zijian He, Vikranth Srivatsa, Hao Zhang, Yiying Zhang, July 2024, InferCept: Efficient Intercept Support for Augmented Large Language Model Inference, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:81-95, 2024, https://proceedings.mlr.press/v235/abhyankar24a.html PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/abhyankar24a/abhyankar24a.pdf
- Yu Du, Fangyun Wei, Hongyang Zhang, July 2024, AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:11812-11829, 2024, https://proceedings.mlr.press/v235/du24h.html PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/du24h/du24h.pdf
- MemGPT, Aug 2024, Adding custom tools to MemGPT, https://memgpt.readme.io/docs/adding-custom-tools-to-memgpt
- Asim Biswal, Liana Patel, Siddarth Jha, Amog Kamsetty, Shu Liu, Joseph E. Gonzalez, Carlos Guestrin, Matei Zaharia, 27 Aug 2024, Text2SQL is Not Enough: Unifying AI and Databases with TAG, https://arxiv.org/abs/2408.14717 https://github.com/TAG-Research/TAG-Bench
- Yaroslav Zharov, Yury Khudyakov, Evgeniia Fedotova, Evgeny Grigorenko, Egor Bogomolov, 18 Feb 2024, Tool-Augmented LLMs as a Universal Interface for IDEs, https://arxiv.org/abs/2402.11635
- Lutfi Eren Erdogan, Nicholas Lee, Siddharth Jha, Sehoon Kim, Ryan Tabrizi, Suhong Moon, Coleman Hooper, Gopala Anumanchipalli, Kurt Keutzer, Amir Gholami, 1 Sep 2024, TinyAgent: Function Calling at the Edge, https://arxiv.org/abs/2409.00608 https://github.com/SqueezeAILab/TinyAgent
- Suhong Moon, Siddharth Jha, Lutfi Eren Erdogan, Sehoon Kim, Woosang Lim, Kurt Keutzer, Amir Gholami, 2 Sep 2024, Efficient and Scalable Estimation of Tool Representations in Vector Space, https://arxiv.org/abs/2409.02141 https://github.com/SqueezeAILab/Tool2Vec (Using synthetic data to train tool usage decision models.)
- Xiaoxia Liu, Jingyi Wang, Jun Sun, Xiaohan Yuan, Guoliang Dong, Peng Di, Wenhai Wang, Dongxia Wang, 21 Nov 2023, Prompting Frameworks for Large Language Models: A Survey, https://arxiv.org/abs/2311.12785
- Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, Aman Chadha, 5 Feb 2024, A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications, https://arxiv.org/abs/2402.07927
- Yupu Hao, Pengfei Cao, Zhuoran Jin, Huanxuan Liao, Yubo Chen, Kang Liu, Jun Zhao, 23 Sep 2024 (v2), CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance, https://arxiv.org/abs/2409.13202
- Carl Franzen, September 27, Cohere updates APIs to make it easier for devs to switch from other models, https://venturebeat.com/ai/cohere-updates-apis-to-make-it-easier-for-devs-to-switch-from-other-models/
- Renxi Wang, Xudong Han, Lei Ji, Shu Wang, Timothy Baldwin, Haonan Li, 8 Oct 2024 (v2), ToolGen: Unified Tool Retrieval and Calling via Generation, https://arxiv.org/abs/2410.03439
- Ke Wang, Jiahui Zhu, Minjie Ren, Zeming Liu, Shiwei Li, Zongye Zhang, Chenkai Zhang, Xiaoyu Wu, Qiqi Zhan, Qingjie Liu, Yunhong Wang, 16 Oct 2024, A Survey on Data Synthesis and Augmentation for Large Language Models, https://arxiv.org/abs/2410.12896
- Yakun Zhu, Shaohang Wei, Xu Wang, Kui Xue, Xiaofan Zhang, Shaoting Zhang, 17 Oct 2024, MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling, https://arxiv.org/abs/2410.13610
- Elias Lumer, Vamse Kumar Subbiah, James A. Burke, Pradeep Honaganahalli Basavaraju, Austin Huber, 22 Oct 2024 (v2), Toolshed: Scale Tool-Equipped Agents with Advanced RAG-Tool Fusion and Tool Knowledge Bases, https://arxiv.org/abs/2410.14594
- A. Singh, A. Ehtesham, S. Kumar and T. T. Khoei, "Enhancing AI Systems with Agentic Workflows Patterns in Large Language Model," 2024 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 2024, pp. 527-532, doi: 10.1109/AIIoT61789.2024.10578990. https://ieeexplore.ieee.org/abstract/document/10578990
- Chawla, Chhavi; Chatterjee, Siddharth; Gadadinni, Sanketh Siddanna; Verma, Pulkit; Banerjee, Sourav, 2024, Agentic AI: The building blocks of sophisticated AI business applications, Journal of AI, Robotics & Workplace Automation, Volume 3 / Number 3 / Summer 2024, pp. 1-15(15), Henry Stewart Publications, DOI: https://doi.org/10.69554/XEHZ1946 https://www.ingentaconnect.com/content/hsp/airwa/2024/00000003/00000003/art00001
- Dawei Gao, Zitao Li, Xuchen Pan, Weirui Kuang, Zhijian Ma, Bingchen Qian, Fei Wei, Wenhao Zhang, Yuexiang Xie, Daoyuan Chen, Liuyi Yao, Hongyi Peng, Zeyu Zhang, Lin Zhu, Chen Cheng, Hongzhu Shi, Yaliang Li, Bolin Ding, Jingren Zhou, 20 May 2024 (v2), AgentScope: A Flexible yet Robust Multi-Agent Platform, https://arxiv.org/abs/2402.14034 https://github.com/modelscope/agentscope
- Michael Nuñez, November 4, 2024, UC San Diego, Tsinghua University researchers just made AI way better at knowing when to ask for help, https://venturebeat.com/ai/uc-san-diego-tsinghua-university-researchers-just-made-ai-way-better-at-knowing-when-to-ask-for-help/
- Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Sarath Chandar, 14 Apr 2024, Towards Practical Tool Usage for Continually Learning LLMs, https://arxiv.org/abs/2404.09339
- Amy Marks, Jun 11, 2024, Clarifying Function Calling / Tool Use in LLMs, https://medium.com/@aevalone/clarifying-function-calling-tool-use-in-llms-6511af510f99
- Bohan Lyu, Yadi Cao, Duncan Watson-Parris, Leon Bergen, Taylor Berg-Kirkpatrick, Rose Yu, 1 Nov 2024, Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation, https://arxiv.org/abs/2411.00412
- Anthropic, 26 Nov 2024, Introducing the Model Context Protocol, https://www.anthropic.com/news/model-context-protocol
- Varatheepan Paramanayakam, Andreas Karatzas, Iraklis Anagnostopoulos, Dimitrios Stamoulis, 23 Nov 2024, Less is More: Optimizing Function Calling for LLM Execution on Edge Devices, https://arxiv.org/abs/2411.15399
- Soh, J., Singh, P. (2024). Semantic Kernel, Plugins, and Function Calling. In: Data Science Solutions on Azure. Apress, Berkeley, CA. https://doi.org/10.1007/979-8-8688-0914-9_7 https://link.springer.com/chapter/10.1007/979-8-8688-0914-9_7
- Chris Sypherd, Vaishak Belle, 5 Dec 2024, Practical Considerations for Agentic LLM Systems, https://arxiv.org/abs/2412.04093
- Zhi-Yuan Chen, Shiqi Shen, Guangyao Shen, Gong Zhi, Xu Chen, and Yankai Lin. 2024. Towards Tool Use Alignment of Large Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1382–1400, Miami, Florida, USA. Association for Computational Linguistics. https://aclanthology.org/2024.emnlp-main.82/ https://aclanthology.org/2024.emnlp-main.82.pdf
- Damien de Mijolla, Wen Yang, Philippa Duckett, Christopher Frye, Mark Worrall, 8 Dec 2024, Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt, https://arxiv.org/abs/2412.05967
- In Gim, Seung-seob Lee, Lin Zhong, 9 Dec 2024, Asynchronous LLM Function Calling, https://arxiv.org/abs/2412.07017 (Overlap LLM computations and tool execution.)
- Outlore, Dec 14, 2024, Reflections on building with Model Context Protocol (MCP), https://outlore.dev/blog/model-context-protocol/
- Andrew Zuo, Dec 13, 2024, AI Assistants Are Going To Get Really Good, https://andrewzuo.com/ai-assistants-are-going-to-get-really-good-d6e6a026e588
- Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen, 18 Dec 2024, Deploying Foundation Model Powered Agent Services: A Survey, https://arxiv.org/abs/2412.13437 (A survey of not just deployment, but many inference optimization techniques.)
- Qwen: An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tingyu Xia, Xingzhang Ren, Xuancheng Ren, Yang Fan, Yang Su, Yichang Zhang, Yu Wan, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zihan Qiu (additional authors not shown), 19 Dec 2024, Qwen2.5 Technical Report, https://arxiv.org/abs/2412.15115
- Dian Yu, Yuheng Zhang, Jiahao Xu, Tian Liang, Linfeng Song, Zhaopeng Tu, Haitao Mi, Dong Yu, 22 Dec 2024, Teaching LLMs to Refine with Tools, https://arxiv.org/abs/2412.16871
- Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement (Broad survey of reasoning improvement methods from multi-step inference to RALM to decoding algorithms.)
- Florian Dietz, Dietrich Klakow, 1 Jan 2025, IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently, https://arxiv.org/abs/2501.00684
- Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
- Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic, Sep 2024, Agents, Google Whitepaper, https://www.kaggle.com/whitepaper-agents
- S. Song et al., 2025, "How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2025.3527978. https://ieeexplore.ieee.org/abstract/document/10841938/
- Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen, 21 Feb 2024 (v4), ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, https://arxiv.org/abs/2309.17452
- Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Yujia Qin, Yining Ye, Yaxi Lu, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun, 28 Dec 2023, GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension, https://arxiv.org/abs/2312.17294
- Tong Xiao, Jingbo Zhu, 16 Jan 2025, Foundations of Large Language Models, https://arxiv.org/abs/2501.09223 (Huge 230 page paper on many topics such as training, prompting, alignment, and long context.)
- Xinzhe Li, Jan 2025, A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning, Proceedings of the 31st International Conference on Computational Linguistics, pages 9760–9779, January 19–24, 2025. ©2025 Association for Computational Linguistics, https://aclanthology.org/2025.coling-main.652.pdf https://github.com/xinzhel/LLM-Agent-Survey
- Connor Shorten, Charles Pierse, Thomas Benjamin Smith, Karel D'Oosterlinck, Tuana Celik, Erika Cardenas, Leonie Monigatti, Mohd Shukri Hasan, Edward Schmuhl, Daniel Williams, Aravind Kesiraju, Bob van Luijt, 23 Jan 2025, Querying Databases with Function Calling, https://arxiv.org/abs/2502.00032
- Jiali Cheng, Hadi Amiri, 3 Feb 2025. Tool Unlearning for Tool-Augmented LLMs, https://arxiv.org/abs/2502.01083 (Unlearning theory applied to tool usage.)
- Wenjun Li, Dexun Li, Kuicai Dong, Cong Zhang, Hao Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Liu, 18 Feb 2025, Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger, https://arxiv.org/abs/2502.12961 (Examining the decision whether or not to launch a tool, and the inefficiency of non-needed tool calls.)
- C Winston, R Just, Feb 2025, A Taxonomy of Failures in Tool-Augmented LLMs, https://homes.cs.washington.edu/~rjust/publ/tallm_testing_ast_2025.pdf
- Xuan Zhang, Yongliang Shen, Zhe Zheng, Linjuan Wu, Wenqi Zhang, Yuchen Yan, Qiuying Peng, Jun Wang, Weiming Lu, 3 Mar 2025, AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification, https://arxiv.org/abs/2503.01940
- Hongshen Xu, Zihan Wang, Zichen Zhu, Lei Pan, Xingyu Chen, Lu Chen, Kai Yu, 9 Mar 2025, Alignment for Efficient Tool Calling of Large Language Models, https://arxiv.org/abs/2503.06708
- Anthropic, 14 Mar 2025, Token-saving updates on the Anthropic API, https://www.anthropic.com/news/token-saving-updates (Prompt caching, excluding cached responses from rate limits, and token-efficient tool calling.)
- Mengsong Wu, Tong Zhu, Han Han, Xiang Zhang, Wenbiao Shao, Wenliang Chen, 21 Mar 2025, Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models, https://arxiv.org/abs/2503.16779 https://github.com/fairyshine/Chain-of-Tools
- Ali Forootani, 22 Mar 2025, A Survey on Mathematical Reasoning and Optimization with Large Language Models, https://arxiv.org/abs/2503.17726
- Aiyao He, Sijia Cui, Shuai Xu, Yanna Wang, Bo Xu, 13 May 2025, TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers, https://arxiv.org/abs/2505.08402
- Xu Huang, Yuefeng Huang, Weiwen Liu, Xingshan Zeng, Yasheng Wang, Ruiming Tang, Hong Xie, Defu Lian, 7 May 2025, Advancing and Benchmarking Personalized Tool Invocation for LLMs, https://arxiv.org/abs/2505.04072 https://github.com/hyfshadow/PTBench
- Wang et al., 2025, Function Calling in Large Language Models: Industrial Practices, Challenges, and Future Directions, https://openreview.net/pdf?id=LNxVGPedFW
- Cameron R. Wolfe, Ph.D., Jun 09, 2025, AI Agents from First Principles: Understanding AI agents by building upon the most basic concepts of LLMs, https://cameronrwolfe.substack.com/p/ai-agents
- Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo, 29 May 2025, ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions, https://arxiv.org/abs/2505.23662 https://github.com/bwookwak/ToolHaystack
- Dex Horthy, June 2025 (accessed), 12-Factor Agents - Principles for building reliable LLM applications, https://github.com/humanlayer/12-factor-agents?tab=readme-ov-file
- Bohan Yao, Vikas Yadav, 25 Jul 2025, A Toolbox, Not a Hammer -- Multi-TAG: Scaling Math Reasoning with Multi-Tool Aggregation, https://arxiv.org/abs/2507.18973 (Launch multiple tools and aggregate the results.)
- Bin Wu, Edgar Meij, Emine Yilmaz, Aug 2025, A Joint Optimization Framework for Enhancing Efficiency of Tool Utilization in LLM Agents, Findings of the Association for Computational Linguistics: ACL 2025, pages 22361–22373 July 27- August 1, 2025, https://aclanthology.org/2025.findings-acl.1149.pdf
- Zijing Zhang, Zhanpeng Chen, He Zhu, Ziyang Chen, Nan Du, Xiaolong Li, Aug 2025, ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks, Findings of the Association for Computational Linguistics: ACL 2025, pages 15706–15722 July 27- August 1, 2025, https://aclanthology.org/2025.findings-acl.811.pdf
- Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, Shenghua Liu, 21 Jul 2025 (v2), A Survey of Context Engineering for Large Language Models, https://arxiv.org/abs/2507.13334
- Xu, W., Huang, C., Gao, S. et al. LLM-Based Agents for Tool Learning: A Survey. Data Sci. Eng. (2025). https://doi.org/10.1007/s41019-025-00296-9 https://link.springer.com/article/10.1007/s41019-025-00296-9
- Yan Jiang, Hao Zhou, LiZhong GU, Ai Han, TianLong Li, 24 Jun 2025, NaviAgent: Bilevel Planning on Tool Dependency Graphs for Function Calling, https://arxiv.org/abs/2506.19500
- Xiaoyu Tan, Bin Li, Xihe Qiu, Chao Qu, Wei Chu, Yinghui Xu, and Yuan Qi. 2025. Meta-Agent-Workflow: Streamlining Tool Usage in LLMs through Workflow Construction, Retrieval, and Refinement. In Companion Proceedings of the ACM on Web Conference 2025 (WWW '25). Association for Computing Machinery, New York, NY, USA, 458–467. https://doi.org/10.1145/3701716.3715247 https://dl.acm.org/doi/abs/10.1145/3701716.3715247
- Sebastian Nicolas Müller, May 23, 2025, Infinite tool use, https://snimu.github.io/2025/05/23/infinite-tool-use.html
- J Vigel, R Cai, ECA Neema, A Liao, K Zhu, S O'Brien, 2025, Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use, NAACL2025, The 5th Workshop on Insights from Negative Results in NLP, https://aclanthology.org/anthology-files/anthology-files/pdf/insights/2025.insights-1.pdf#page=155
- Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji, 16 Apr 2025, ToolRL: Reward is All Tool Learning Needs, https://arxiv.org/abs/2504.13958
- Geoffrey Huntley, July 2025, AGENT.md: The Universal Agent Configuration File, Request for Comments, https://ampcode.com/AGENT.md
- Ningning Wang, Xavier Hu, Pai Liu, He Zhu, Yue Hou, Heyuan Huang, Shengyu Zhang, Jian Yang, Jiaheng Liu, Ge Zhang, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou, 24 Jul 2025, Efficient Agents: Building Effective Agents While Reducing Cost, https://arxiv.org/pdf/2508.02694 https://github.com/OPPO-PersonalAI/OAgents
- Peter Wildeford, Aug 08, 2025, GPT-5: a small step for intelligence, a giant leap for normal people: GPT-5 focuses on where the money is - everyday users, not AI elites, https://peterwildeford.substack.com/p/gpt-5-a-small-step-for-intelligence
- Anoop Kotha, Julian Lee, Eric Zakariasson, OpenAI, Aug 2025, GPT-5 prompting guide, https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide
- Will Fein, Ryan J. Horwitz, John E. Brown III, Amit Misra, Felipe Oviedo, Kevin White, Juan M. Lavista Ferres, Samuel K. Wasser, 13 Aug 2025, AI-Driven Detection and Analysis of Handwriting on Seized Ivory: A Tool to Uncover Criminal Networks in the Illicit Wildlife Trade, https://arxiv.org/abs/2508.10219
- Muhammad Ahmad, Fida Ullah, Muhammad Usman, Ildar Batyrshin, Grigori Sidorov, 12 Aug 2025, SABIA: An AI-Powered Tool for Detecting Opioid-Related Behaviors on Social Media, https://arxiv.org/abs/2508.10046
- Xingshan Zeng, Weiwen Liu, Xu Huang, Zezhong Wang, Lingzhi Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruiming Tang, Qun Liu, 14 Aug 2025, ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning, https://arxiv.org/abs/2504.01400
- Athanasios Davvetas, Xenia Ziouvelou, Ypatia Dami, Alexis Kaponis, Konstantina Giouvanopoulou, Michael Papademas, 23 Jul 2025, TAI Scan Tool: A RAG-Based Tool With Minimalistic Input for Trustworthy AI Self-Assessment, https://arxiv.org/abs/2507.17514
- Zhao Song, Song Yue, Jiahao Zhang, 23 Jul 2025, Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations, https://arxiv.org/abs/2507.17699
- Arduin Findeis, Floris Weers, Guoli Yin, Ke Ye, Ruoming Pang, Tom Gunter, 22 Jul 2025, Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?, https://arxiv.org/abs/2507.17015
- Po-Yen Wu, Cheng-Yu Kuo, Yuki Kadokawa, and Takamitsu Matsubara, 23 Jul 2025, Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning, https://arxiv.org/abs/2507.17275
- Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo, Jiahao Li, Bin Li, Houqiang Li, Yan Lu, 23 Jul 2025, Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding, https://arxiv.org/abs/2505.18079
- Timothy Tin Long Yu, Mahdi Mostajabdaveh, Jabo Serge Byusa, Rindra Ramamonjison, Giuseppe Carenini, Kun Mao, Zirui Zhou, Yong Zhang, 23 Jul 2025, SMARTAPS: Tool-augmented LLMs for Operations Management, https://arxiv.org/abs/2507.17927
- Alex Liu, Lief Esbenshade, Shawon Sarkar, Victor Tian, Zachary Zhang, Kevin He, Min Sun, 23 Jul 2025, Decoding Instructional Dialogue: Human-AI Collaborative Analysis of Teacher Use of AI Tool at Scale, https://arxiv.org/abs/2507.17985
- Haozhe Wang, Long Li, Chao Qu, Fengming Zhu, Weidi Xu, Wei Chu, Fangzhen Lin, 18 Jul 2025, To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization, https://arxiv.org/abs/2502.00691
- Jiale Liu, Huan Wang, Yue Zhang, Xiaoyu Luo, Jiaxiang Hu, Zhiliang Liu, Min Xie, 20 Jul 2025, InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis, https://arxiv.org/abs/2507.14899
- Richard M. Charles, James H. Curry and Richard B. Charles, 15 Jul 2025, Mitigating Trojanized Prompt Chains in Educational LLM Use Cases: Experimental Findings and Detection Tool Design, https://arxiv.org/abs/2507.14207
- Qian Xiong, Yuekai Huang, Ziyou Jiang, Zhiyuan Chang, Yujia Zheng, Tianhao Li, Mingyang Li, 21 Jul 2025, Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems, https://arxiv.org/abs/2507.15296
- Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey, Aniket Abhishek Soni, 19 Jul 2025, Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation, https://arxiv.org/abs/2506.11092
- Shiqing Fan, Xichen Ding, Liang Zhang, Linjian Mo, 11 Aug 2025, MCPToolBench++: A Large Scale AI Agent Model Context Protocol MCP Tool Use Benchmark, https://arxiv.org/abs/2508.07575
- Wenpeng Xing, Zhipeng Chen, Changting Lin, Meng Han, 11 Aug 2025, HGMF: A Hierarchical Gaussian Mixture Framework for Scalable Tool Invocation within the Model Context Protocol, https://arxiv.org/abs/2508.07602
- Luyao Zhuang, Qinggang Zhang, Huachi Zhou, Juhua Liu, Qing Li, Xiao Huang, 11 Aug 2025, LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval, https://arxiv.org/abs/2508.07690
- Keyan Ding, Jing Yu, Junjie Huang, Yuchen Yang, Qiang Zhang, Huajun Chen, 27 Jul 2025, SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration, https://arxiv.org/abs/2507.20280
- Vicente Ramos, Sundous Hussein, Mohamed Abdel-Hafiz, Arunangshu Sarkar, Weixuan Liu, Katerina J. Kechris, Russell P. Bowler, Leslie Lange, Farnoush Banaei-Kashani, 27 Jul 2025, BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool, https://arxiv.org/abs/2507.20440
- Nicola Croce, Tobin South, 26 Jul 2025, Trivial Trojans: How Minimal MCP Servers Enable Cross-Tool Exfiltration of Sensitive Data, https://arxiv.org/abs/2507.19880
- Harsh Purohit, Tomoya Nishida, Kota Dohi, Takashi Endo, and Yohei Kawaguchi, 28 Jul 2025, MIMII-Agent: Leveraging LLMs with Function Calling for Relative Evaluation of Anomalous Sound Detection, https://arxiv.org/abs/2507.20666
- Nicholas Botti, Flora Haberkorn, Charlotte Hoopes, Shaun Khan (Federal Reserve Board), 28 Jul 2025, Efficacy of AI RAG Tools for Complex Information Extraction and Data Annotation Tasks: A Case Study Using Banks Public Disclosures, https://arxiv.org/abs/2507.21360
- Sergio Rojas-Galeano, 26 Jun 2025, Tool or Trouble? Exploring Student Attitudes Toward AI Coding Assistants, https://arxiv.org/abs/2507.22900
- Yiya Diao, Changhe Li, Sanyou Zeng, Xinye Cai, Wenjian Luo, Shengxiang Yang, and Carlos A. Coello Coello, 30 Jul 2025, Nearest-Better Network for Visualizing and Analyzing Combinatorial Optimization Problems: A Unified Tool, https://arxiv.org/abs/2507.22440
- Hongjin Qian, Zheng Liu, 1 Aug 2025, MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning, https://arxiv.org/abs/2508.00271
- Amur Ghose, Andrew B. Kahng, Sayak Kundu, and Zhiang Wang, 1 Aug 2025, ORFS-agent: Tool-Using Agents for Chip Design Optimization, https://arxiv.org/abs/2506.08332
- Eric Hirsch and Christian Friedrich, 31 Jul 2025, Data-driven tool wear prediction in milling, based on a process-integrated single-sensor approach, https://arxiv.org/abs/2412.19950
- Michelle S. Lam, Fred Hohman, Dominik Moritz, Jeffrey P. Bigham, Kenneth Holstein, Mary Beth Kery, 1 Aug 2025, Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors, https://arxiv.org/abs/2409.18203
- Guozhao Mo, Wenliang Zhong, Jiawei Chen, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun, 3 Aug 2025, LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?, https://arxiv.org/abs/2508.01780
- Kanghua Mo, Li Hu, Yucheng Long, Zhihao Li, 4 Aug 2025, Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools, https://arxiv.org/abs/2508.02110
- Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz, 4 Aug 2025, Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application, https://arxiv.org/abs/2508.02560
- Ashutosh Hathidara, Julien Yu, Sebastian Schreiber, 4 Aug 2025, Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky, https://arxiv.org/abs/2507.03336
- Sunil Kumar, Bowen Zhao, Leo Dirac, Paulina Varshavskaya, 2 Aug 2025, Reinforcing VLMs to Use Tools for Detailed Visual Reasoning Under Resource Constraints, https://arxiv.org/abs/2506.14821
- Peng Ding, Rick Stevens, 5 Aug 2025, Unified Tool Integration for LLMs: A Protocol-Agnostic Approach to Function Calling, https://arxiv.org/abs/2508.02979
- Zikun Cui, Tianyi Huang, Chia-En Chiang, Cuiqianhe Du, 5 Aug 2025, Toward Verifiable Misinformation Detection: A Multi-Tool LLM Agent Framework, https://arxiv.org/abs/2508.03092
- Shaofeng Yin, Ting Lei, Yang Liu, 5 Aug 2025, ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools, https://arxiv.org/abs/2508.03284
- Khaled Bachir Delassi, Lakhdar Zeggane, Hadda Cherroun, Abdelhamid Haouhat, Kaoutar Bouzouad, 5 Aug 2025, VQA support to Arabic Language Learning Educational Tool, https://arxiv.org/abs/2508.03488
- Zexiong Ma, Chao Peng, Qunhong Zeng, Pengfei Gao, Yanzhen Zou, Bing Xie, 5 Aug 2025, Tool-integrated Reinforcement Learning for Repo Deep Search, https://arxiv.org/abs/2508.03012
- Peng Ding, 11 Jul 2025, ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs, https://arxiv.org/abs/2507.10593
- Si Chen, Izzy Molnar, Ting Hua, Peiyu Li, Le Huy Khiem, G. Alex Ambrose, Jim Lang, Ronald Metoyer, Nitesh V. Chawla, 6 Aug 2025, SimInstruct: A Responsible Tool for Collecting Scaffolding Dialogues Between Experts and LLM-Simulated Novices, https://arxiv.org/abs/2508.04428
- Manuela Schuler, 6 Aug 2025, A Visual Tool for Interactive Model Explanation using Sensitivity Analysis, https://arxiv.org/abs/2508.04269
- Zhejun Zhao, Yuehu Dong, Alley Liu, Lixue Zheng, Pingsheng Liu, Dongdong Shen, Long Xia, Jiashu Zhao, Dawei Yin, 6 Aug 2025, TURA: Tool-Augmented Unified Retrieval Agent for AI Search, https://arxiv.org/abs/2508.04604
- Natalia Echeverry and Arun Lekshmi Narayanan, 6 Aug 2025, How are CS students using resources and AI tools for coding tasks?, https://arxiv.org/abs/2508.04667
- Rafael Salinas-Buestan, Otto Parra, Nelly Condori-Fernandez, Maria Fernanda Granda, 22 Jul 2025, Evaluating Generative AI Tools for Personalized Offline Recommendations: A Comparative Study, https://arxiv.org/abs/2508.03710
- Linfeng Gao, Yaoxiang Wang, Minlong Peng, Jialong Tang, Yuzhe Shang, Mingming Sun, Jinsong Su, 7 Aug 2025, Tool Graph Retriever: Exploring Dependency Graph-based Tool Retrieval for Large Language Models, https://arxiv.org/abs/2508.05152
- Hannah-Beth Clark, Laura Benton, Emma Searle, Margaux Dowland, Matthew Gregory, Will Gayne and John Roberts, 7 Aug 2025, Building Effective Safety Guardrails in AI Education Tools, https://arxiv.org/abs/2508.05360
- Sahil Bansal, Sai Shruthi Sistla, Aarti Arikatala, Sebastian Schreiber, 7 Aug 2025, Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning, https://arxiv.org/abs/2508.05888
- Chandler Campbell, Bernie Boscoe, Tuan Do, 25 Jul 2025, AquiLLM: a RAG Tool for Capturing Tacit Knowledge in Research Groups, https://arxiv.org/abs/2508.05648
- Jiaxuan Liang, Shide Zhou, and Kailong Wang, 26 Jul 2025, OmniBench-RAG: A Multi-Domain Evaluation Platform for Retrieval-Augmented Generation Tools, https://arxiv.org/abs/2508.05650
- Xianghe Pang, Shuo Tang, Rui Ye, Yuwen Du, Yaxin Du, Siheng Chen, 12 Aug 2025, BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair, https://arxiv.org/abs/2508.09129
- Junjie Ye, Changhao Jiang, Zhengyin Du, Yufei Xu, Xuesong Yao, Zhiheng Xi, Xiaoran Fan, Qi Zhang, Xuanjing Huang, Jiecao Chen, 12 Aug 2025, Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments, https://arxiv.org/abs/2508.08791
- Jiawei Zhou, Amy Z. Chen, Darshi Shah, Laura M. Schwab Reese, and Munmun De Choudhury, 11 Aug 2025, A Risk Taxonomy and Reflection Tool for Large Language Model Adoption in Public Health, https://arxiv.org/abs/2411.02594
- Yanming Liu, Xinyue Peng, Jiannan Cao, Yuwei Zhang, Xuhong Zhang, Sheng Cheng, Xun Wang, Jianwei Yin, Tianyu Du, 15 Aug 2025, Tool-Planner: Task Planning with Clusters across Multiple Tools, https://arxiv.org/abs/2406.03807
- Wenjie Chen, Wenbin Li, Di Yao, Xuying Meng, Chang Gong, Jingping Bi, 18 Aug 2025, GTool: Graph Enhanced Tool Planning with Large Language Model, https://arxiv.org/abs/2508.12725
- Guangfu Hao, Haojie Wen, Liangxuan Guo, Yang Chen, Yanchao Bi, Shan Yu, 18 Aug 2025, Flexible Tool Selection through Low-dimensional Attribute Alignment of Vision and Language, https://arxiv.org/abs/2505.22146
- Chao Tang, Anxing Xiao, Yuhong Deng, Tianrun Hu, Wenlong Dong, Hanbo Zhang, David Hsu, Hong Zhang, 19 Aug 2025, MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence, https://arxiv.org/abs/2508.13534
- Wenxin Jiang, Mingyu Kim, Chingwo Cheung, Heesoo Kim, George K. Thiruvathukal, James C. Davis, 18 Aug 2025, "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency, https://arxiv.org/abs/2310.01642
- Zhongzhou Chen, 20 Aug 2025, Reliable generation of isomorphic physics problems using ChatGPT with prompt-chaining and tool use, https://arxiv.org/abs/2508.14755
- Lixiang Yan, 20 Aug 2025, From Passive Tool to Socio-cognitive Teammate: A Conceptual Framework for Agentic AI in Human-AI Collaborative Learning, https://arxiv.org/abs/2508.14825
- Hengyu An, Jinghuai Zhang, Tianyu Du, Chunyi Zhou, Qingming Li, Tao Lin, Shouling Ji, 21 Aug 2025, IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents, https://arxiv.org/abs/2508.15310
- Yufeng Zhao, Junnan Liu, Hongwei Liu, Dongsheng Zhu, Yuan Shen, Songyang Zhang, Kai Chen, 21 Aug 2025, Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis, https://arxiv.org/abs/2508.15754
- Zhiqiang Wang, Yichao Gao, Yanting Wang, Suyuan Liu, Haifeng Sun, Haoran Cheng, Guanquan Shi, Haohua Du, Xiangyang Li, 19 Aug 2025, MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers, https://arxiv.org/abs/2508.14925
- Vishnou Vinayagame, Gregory Senay, and Luis Martí, 20 Aug 2025, MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications, https://arxiv.org/abs/2411.18915
- Fei Lei, Yibo Yang, Wenxiu Sun, Dahua Lin, 22 Aug 2025, MCPVerse: An Expansive, Real-World Benchmark for Agentic Tool Use, https://arxiv.org/abs/2508.16260
- Eduardo de Conto, Blaise Genest, Arvind Easwaran, Nicholas Ng, Shweta Menon, 25 Aug 2025, DesCartes Builder: A Tool to Develop Machine-Learning Based Digital Twins, https://arxiv.org/abs/2508.17988
- Thao Le, Tim Miller, Ruihan Zhang, Liz Sonenberg, Ronal Singh, 25 Aug 2025, Visual Evaluative AI: A Hypothesis-Driven Tool with Concept-Based Explanations and Weight of Evidence, https://arxiv.org/abs/2407.04710
- Bingguang Hao, Maolin Wang, Zengzhuang Xu, Yicheng Chen, Cunyin Peng, Jinjie GU, Chenyi Zhuang, 7 Aug 2025, Exploring Superior Function Calls via Reinforcement Learning, https://arxiv.org/abs/2508.05118
AI Books from Aussie AI
- The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory. Get your copy from Amazon: The Sweetest Lesson
- RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures. Get your copy from Amazon: RAG Optimization
- Generative AI Applications book. Get your copy from Amazon: Generative AI Applications
- Generative AI programming book. Get your copy from Amazon: Generative AI in C++
- CUDA C++ Optimization book. Get your copy from Amazon: CUDA C++ Optimization
- CUDA C++ Debugging book. Get your copy from Amazon: CUDA C++ Debugging
More AI Research
Read more about: