Aussie AI

Function Calling

Last Updated 10 March, 2026

by David Spuler, Ph.D.

Function calling is where an LLM accesses an external module via a function call to retrieve extra data, perform calculations, or trigger an action. It is also known as "tool usage" by the LLM, which is distinct from the reverse sense of AI developers using LLM-based tools.

Types of function calling include external integrations for features such as:

Data retrieval (e.g., searching the internet, searching a real estate listings database, etc.)
Computation tools (e.g., clocks, calculators, arithmetic)
Action tools (e.g., the LLM calling out to "agents" that can send an email, make a booking, etc.)
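
The three tool types above can be sketched as a small dispatcher. This is a minimal illustration, not any vendor's API: it assumes the model emits a JSON object with a `name` field and an `arguments` field (a common convention, similar to the JSON-based approach described in one of the articles listed below), and every tool name, signature, and data record here is hypothetical.

```python
import json
from datetime import datetime, timezone

# Made-up sample data standing in for an external listings database.
_LISTINGS = [
    {"suburb": "Carlton", "price": 650000},
    {"suburb": "Carlton", "price": 900000},
    {"suburb": "Fitzroy", "price": 720000},
]

# Stand-in for a real mail service: messages are recorded, never sent.
OUTBOX = []


def search_listings(suburb: str, max_price: int) -> list:
    """Data retrieval tool (hypothetical): filter the sample listings."""
    return [x for x in _LISTINGS
            if x["suburb"] == suburb and x["price"] <= max_price]


def calculator(a: float, b: float, op: str) -> float:
    """Computation tool (hypothetical): basic arithmetic."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
        "div": lambda x, y: x / y,
    }
    return ops[op](a, b)


def current_time() -> str:
    """Computation tool (hypothetical): a clock returning UTC time."""
    return datetime.now(timezone.utc).isoformat()


def send_email(to: str, subject: str) -> str:
    """Action tool (hypothetical): queue an email in the in-memory outbox."""
    OUTBOX.append({"to": to, "subject": subject})
    return "queued"


# Registry mapping the tool names the model may request to Python callables.
TOOLS = {
    "search_listings": search_listings,
    "calculator": calculator,
    "current_time": current_time,
    "send_email": send_email,
}


def dispatch(tool_call_json: str) -> str:
    """Parse a tool request, run the named tool, and return a JSON result.

    The returned string is what would be fed back to the model as the
    tool's output. Failures are reported as JSON rather than raised.
    """
    try:
        call = json.loads(tool_call_json)
        func = TOOLS[call["name"]]
        result = func(**call.get("arguments", {}))
        return json.dumps({"name": call["name"], "result": result})
    except (json.JSONDecodeError, KeyError, TypeError, ZeroDivisionError) as err:
        return json.dumps({"error": f"{type(err).__name__}: {err}"})


if __name__ == "__main__":
    request = '{"name": "calculator", "arguments": {"a": 6, "b": 7, "op": "mul"}}'
    print(dispatch(request))
```

Returning errors as JSON instead of raising lets the surrounding application pass the failure back to the model, which can then retry or rephrase the request. A production system would also validate the arguments against a schema before calling the tool.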


Research on Function Calling

Yechen Xu, Xinhao Kong, Tingjun Chen, Danyang Zhuo, 4 Jun 2024 (v2), Conveyor: Efficient Tool-aware LLM Serving with Tool Partial Execution, https://arxiv.org/abs/2406.00059 Code: https://github.com/conveyor-sys/conveyor (Speeding up inference by partially running tools in parallel with the LLM query processing, rather than sequentially after the LLM request, by detecting tool requests deep inside the decoding algorithm and starting them immediately, before the LLM has finished generating the fully decoded output.)
Pan Lu, 2024, Advancing Mathematical Reasoning with Language Models: A Multimodal and Knowledge-Intensive Perspective, Ph.D. Thesis, Computer Science, University of California, Los Angeles, https://escholarship.org/content/qt678864d8/qt678864d8.pdf
Junzhi Chen, Juhao Liang, Benyou Wang, 9 May 2024, Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning, https://arxiv.org/abs/2405.05955
Jonas Wallat, Adam Jatowt, Avishek Anand, March 2024, Temporal Blind Spots in Large Language Models, WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Pages 683–692, https://arxiv.org/abs/2401.12078, https://doi.org/10.1145/3616855.3635818, https://dl.acm.org/doi/abs/10.1145/3616855.3635818
Nate Kushman, Yoav Artzi, Luke Zettlemoyer, Regina Barzilay, June 2014, Learning to Automatically Solve Algebra Word Problems, P14-1026 Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), https://aclanthology.org/P14-1026/ PDF: https://aclanthology.org/P14-1026.pdf
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning, https://proceedings.neurips.cc/paper_files/paper/2023/file/4a47dd69242d5af908cdd5d51c971cbf-Paper-Datasets_and_Benchmarks.pdf
Subhro Roy, Dan Roth, 20 Aug 2016 (v2), Solving General Arithmetic Word Problems, https://arxiv.org/abs/1608.01413
Subhro Roy, Shyam Upadhyay, Dan Roth, 28 Sep 2016, Equation Parsing: Mapping Sentences to Grounded Equations, https://arxiv.org/abs/1609.08824
Yan Wang, Xiaojiang Liu, Shuming Shi, September 2017, Deep Neural Solver for Math Word Problems D17-1088, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing Copenhagen, Denmark, https://aclanthology.org/D17-1088/ PDF: https://aclanthology.org/D17-1088.pdf
reiinakano, November 12, 2019, Teaching a neural network to use a calculator, https://reiinakano.com/2019/11/12/solving-probability.html (Integrate SymPy calculator into the results of a neural network, by looking for the '=' sign.)
Guoxin Chen, Minpeng Liao, Chengxi Li, Kai Fan, 6 May 2024, AlphaMath Almost Zero: Process Supervision without Process, https://arxiv.org/abs/2405.03553 https://github.com/MARIO-Math-Reasoning/Super_MARIO
Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Wenyi Wang, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zongze Xu, Chenglin Wu, 12 Mar 2024 (v3), Data Interpreter: An LLM Agent For Data Science, https://arxiv.org/abs/2402.18679 Code: https://github.com/geekan/MetaGPT
Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang, 4 Feb 2024 (v2), Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents, https://arxiv.org/abs/2402.00798 Code: https://github.com/agiresearch/Formal-LLM
Qiusi Zhan, Zhixiang Liang, Zifan Ying, Daniel Kang, 25 Mar 2024 (v2), InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents, https://arxiv.org/abs/2403.02691
Wenhu Chen, Xueguang Ma, Xinyi Wang, and William W Cohen. Program of thoughts prompting: Disentangling computation from reasoning for numerical reasoning tasks. arXiv preprint arXiv:2211.12588, 2022. https://arxiv.org/abs/2211.12588 (Integrate a Python interpreter to execute the code generated by the LLM to answer the query.)
Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, and Graham Neubig. Pal: Program-aided language models. In International Conference on Machine Learning, pages 10764–10799. PMLR, 2023. https://arxiv.org/abs/2211.10435 Code: http://reasonwithpal.com/ (Python interpreter integrated as a tool for LLMs.)
Intel, April 2024, Intel® Compiler First to Achieve SYCL* 2020 Conformance, https://www.intel.com/content/www/us/en/developer/articles/technical/compiler-first-full-sycl2020-conformance.html
Long Hei Matthew Lam, Ehsan Shareghi, 1 Jun 2024, A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters, https://arxiv.org/abs/2406.00284 (Using symbolic solvers with LLMs.)
M Keber, I Grubišic, A Barešic, A Jovic, 2024, A Review on Neuro-symbolic AI Improvements to Natural Language Processing, https://www.researchgate.net/profile/Alan-Jovic/publication/380911364_A_Review_on_Neuro-symbolic_AI_Improvements_to_Natural_Language_Processing/links/6655c0ec22a7f16b4f51fb2f/A-Review-on-Neuro-symbolic-AI-Improvements-to-Natural-Language-Processing.pdf
Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen, 21 Feb 2024 (v2), SciAgent: Tool-augmented Language Models for Scientific Reasoning, https://arxiv.org/abs/2402.11451
Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu, 2023, ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track, https://proceedings.neurips.cc/paper_files/paper/2023/hash/8fd1a81c882cd45f64958da6284f4a3f-Abstract-Conference.html
Mengkang Hu, Yao Mu, Xinmiao Yu, Mingyu Ding, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, and Ping Luo. 2023a. Tree-planner: Efficient close-loop task planning with large language models. arXiv preprint arXiv:2310.08582. https://arxiv.org/abs/2310.08582
Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik R Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems. https://arxiv.org/abs/2303.11366
Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. 2023b. ToolLLM: Facilitating large language models to master 16000+ real-world apis. arXiv preprint arXiv:2307.16789. https://arxiv.org/abs/2307.16789
Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. 2023c. A survey on large language model based autonomous agents. arXiv preprint arXiv:2308.11432. https://arxiv.org/abs/2308.11432
Aaron Parisi, Yao Zhao, and Noah Fiedel. Talm: Tool augmented language models. arXiv preprint arXiv:2205.12255, 2022. https://arxiv.org/abs/2205.12255
Joy He-Yueya, Gabriel Poesia, Rose E. Wang, and Noah D. Goodman. Solving math word problems by combining language models with symbolic solvers. ArXiv, abs/2304.09102, 2023. https://arxiv.org/abs/2304.09102
Shima Imani, Liang Du, and H. Shrivastava. Mathprompter: Mathematical reasoning using large language models. ArXiv, abs/2303.05398, 2023. https://arxiv.org/abs/2303.05398
Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, Dacheng Tao, 5 Feb 2024. A Survey on Transformer Compression. https://arxiv.org/abs/2402.05964 (Model compression survey paper with focus on pruning, quantization, knowledge distillation, and efficient architecture design.)
Simranjit Singh, Andreas Karatzas, Michael Fore, Iraklis Anagnostopoulos, Dimitrios Stamoulis, 7 May 2024, An LLM-Tool Compiler for Fused Parallel Function Calling, https://arxiv.org/abs/2405.17438
Julian Yip, Apr 2, 2024, Build Autonomous AI Agents with Function Calling: Transform your chatbot into an agent that can interact with external APIs, https://towardsdatascience.com/build-autonomous-ai-agents-with-function-calling-0bb483753975 (Implement agents via models that output a JSON object that describes the API to call and the parameters to send.)
Adva Nakash Peleg, May 30, 2024, An LLM Journey: From POC to Production, https://medium.com/cyberark-engineering/an-llm-journey-from-poc-to-production-6c5ec6a172fb
Yu Gu, Yiheng Shu, Hao Yu, Xiao Liu, Yuxiao Dong, Jie Tang, Jayanth Srinivasa, Hugo Latapie, Yu Su, 22 Feb 2024, Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments, https://arxiv.org/abs/2402.14672
Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan, March 2023, TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs, https://arxiv.org/pdf/2303.16434.pdf
kipply's blog, 2023-03-30, Transformer Taxonomy (the last lit review), https://kipp.ly/transformer-taxonomy/ (Papers for all the Transformer architectures and milestone papers for the major optimization improvements on them.)
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman, 1 Jun 2022 (v3), WebGPT: Browser-assisted question-answering with human feedback, https://arxiv.org/abs/2112.09332
Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, Percy Liang, 2017, World of Bits: An Open-Domain Platform for Web-Based Agents, Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3135-3144, https://proceedings.mlr.press/v70/shi17a.html
Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap, 11 Nov 2022 (v2), A data-driven approach for learning to control computers, https://arxiv.org/abs/2202.08137
Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom, 9 Feb 2023, Toolformer: Language Models Can Teach Themselves to Use Tools, https://arxiv.org/abs/2302.04761
OpenAI, 2024, Function calling, https://platform.openai.com/docs/guides/function-calling
Cobus Greyling, June 16, 2023, Practical Examples of OpenAI Function Calling, https://cobusgreyling.medium.com/practical-examples-of-openai-function-calling-a6419dc38775
University of California, Berkeley, 2024, Berkeley Function-Calling Leaderboard, https://gorilla.cs.berkeley.edu/leaderboard.html https://huggingface.co/datasets/gorilla-llm/Berkeley-Function-Calling-Leaderboard
Wes Brewer, Ana Gainaru, Frédéric Suter, Feiyi Wang, Murali Emani, Shantenu Jha, 20 Jun 2024, AI-coupled HPC Workflow Applications, Middleware and Performance, (Examines integrations of various workflows into LLMs.) https://arxiv.org/abs/2406.14315
Aarushi Kansal, Chapter 3: Chains, Tools and Agents Building Generative AI-Powered Apps: A Hands-on Guide for Developers, Apress, https://www.amazon.com/Building-Generative-AI-Powered-Apps-Hands-ebook/dp/B0CTXXP1S4/
Vishal Rajput, Apr 11, 2024, What’s next for AI: AI agentic workflows? https://medium.com/aiguys/next-for-llms-and-rag-ai-agentic-workflows-1869ba0a6796
Shishir Patil, May 10, 2024, Teaching Large Language Models to Use Tools at Scale, Ph.D. Thesis, Electrical Engineering and Computer Sciences, University of California, Berkeley, Technical Report No. UCB/EECS-2024-85, http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-85.html https://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-85.pdf
Xi Wang, Procheta Sen, Ruizhe Li, Emine Yilmaz, 31 Jul 2024, Adaptive Retrieval-Augmented Generation for Conversational Systems, https://arxiv.org/abs/2407.21712 (Deciding whether or not to include a RAG external data request in the inference of a chatbot in a multi-turn conversation.)
Michael Nuñez, July 18, 2024, Groq’s open-source Llama AI model tops leaderboard, outperforming GPT-4o and Claude in function calling, https://venturebeat.com/ai/groq-open-source-llama-ai-model-tops-leaderboard-outperforming-gpt-4o-and-claude-in-function-calling/
Thomas Reid, Jul 31, 2024, Ollama’s Latest Update: Tool Use: Everything you need to know about function calling in Ollama https://ai.gopubby.com/ollamas-latest-update-tool-use-7b809e15be5c
Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang, 8 Aug 2024, ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities, https://arxiv.org/abs/2408.04682 Code: https://github.com/apple/ToolSandbox
Reyna Abhyankar, Zijian He, Vikranth Srivatsa, Hao Zhang, Yiying Zhang, July 2024, InferCept: Efficient Intercept Support for Augmented Large Language Model Inference, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:81-95, 2024, https://proceedings.mlr.press/v235/abhyankar24a.html PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/abhyankar24a/abhyankar24a.pdf
Yu Du, Fangyun Wei, Hongyang Zhang, July 2024, AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls, Proceedings of the 41st International Conference on Machine Learning, PMLR 235:11812-11829, 2024, https://proceedings.mlr.press/v235/du24h.html PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/du24h/du24h.pdf
MemGPT, Aug 2024, Adding custom tools to MemGPT, https://memgpt.readme.io/docs/adding-custom-tools-to-memgpt
Asim Biswal, Liana Patel, Siddarth Jha, Amog Kamsetty, Shu Liu, Joseph E. Gonzalez, Carlos Guestrin, Matei Zaharia, 27 Aug 2024, Text2SQL is Not Enough: Unifying AI and Databases with TAG, https://arxiv.org/abs/2408.14717 https://github.com/TAG-Research/TAG-Bench
Yaroslav Zharov, Yury Khudyakov, Evgeniia Fedotova, Evgeny Grigorenko, Egor Bogomolov, 18 Feb 2024, Tool-Augmented LLMs as a Universal Interface for IDEs, https://arxiv.org/abs/2402.11635
Lutfi Eren Erdogan, Nicholas Lee, Siddharth Jha, Sehoon Kim, Ryan Tabrizi, Suhong Moon, Coleman Hooper, Gopala Anumanchipalli, Kurt Keutzer, Amir Gholami, 1 Sep 2024, TinyAgent: Function Calling at the Edge, https://arxiv.org/abs/2409.00608 https://github.com/SqueezeAILab/TinyAgent
Suhong Moon, Siddharth Jha, Lutfi Eren Erdogan, Sehoon Kim, Woosang Lim, Kurt Keutzer, Amir Gholami, 2 Sep 2024, Efficient and Scalable Estimation of Tool Representations in Vector Space, https://arxiv.org/abs/2409.02141 https://github.com/SqueezeAILab/Tool2Vec (Using synthetic data to train tool usage decision models.)
Xiaoxia Liu, Jingyi Wang, Jun Sun, Xiaohan Yuan, Guoliang Dong, Peng Di, Wenhai Wang, Dongxia Wang, 21 Nov 2023, Prompting Frameworks for Large Language Models: A Survey, https://arxiv.org/abs/2311.12785
Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, Aman Chadha, 5 Feb 2024, A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications, https://arxiv.org/abs/2402.07927
Yupu Hao, Pengfei Cao, Zhuoran Jin, Huanxuan Liao, Yubo Chen, Kang Liu, Jun Zhao, 23 Sep 2024 (v2), CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance, https://arxiv.org/abs/2409.13202
Carl Franzen, September 27, Cohere updates APIs to make it easier for devs to switch from other models, https://venturebeat.com/ai/cohere-updates-apis-to-make-it-easier-for-devs-to-switch-from-other-models/
Renxi Wang, Xudong Han, Lei Ji, Shu Wang, Timothy Baldwin, Haonan Li, 8 Oct 2024 (v2), ToolGen: Unified Tool Retrieval and Calling via Generation, https://arxiv.org/abs/2410.03439
Ke Wang, Jiahui Zhu, Minjie Ren, Zeming Liu, Shiwei Li, Zongye Zhang, Chenkai Zhang, Xiaoyu Wu, Qiqi Zhan, Qingjie Liu, Yunhong Wang, 16 Oct 2024, A Survey on Data Synthesis and Augmentation for Large Language Models, https://arxiv.org/abs/2410.12896
Yakun Zhu, Shaohang Wei, Xu Wang, Kui Xue, Xiaofan Zhang, Shaoting Zhang, 17 Oct 2024, MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling, https://arxiv.org/abs/2410.13610
Elias Lumer, Vamse Kumar Subbiah, James A. Burke, Pradeep Honaganahalli Basavaraju, Austin Huber, 22 Oct 2024 (v2), Toolshed: Scale Tool-Equipped Agents with Advanced RAG-Tool Fusion and Tool Knowledge Bases, https://arxiv.org/abs/2410.14594
A. Singh, A. Ehtesham, S. Kumar and T. T. Khoei, "Enhancing AI Systems with Agentic Workflows Patterns in Large Language Model," 2024 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 2024, pp. 527-532, doi: 10.1109/AIIoT61789.2024.10578990. https://ieeexplore.ieee.org/abstract/document/10578990
Chawla, Chhavi; Chatterjee, Siddharth; Gadadinni, Sanketh Siddanna; Verma, Pulkit; Banerjee, Sourav, 2024, Agentic AI: The building blocks of sophisticated AI business applications, Journal of AI, Robotics & Workplace Automation, Volume 3 / Number 3 / Summer 2024, pp. 1-15(15), Henry Stewart Publications, DOI: https://doi.org/10.69554/XEHZ1946 https://www.ingentaconnect.com/content/hsp/airwa/2024/00000003/00000003/art00001
Dawei Gao, Zitao Li, Xuchen Pan, Weirui Kuang, Zhijian Ma, Bingchen Qian, Fei Wei, Wenhao Zhang, Yuexiang Xie, Daoyuan Chen, Liuyi Yao, Hongyi Peng, Zeyu Zhang, Lin Zhu, Chen Cheng, Hongzhu Shi, Yaliang Li, Bolin Ding, Jingren Zhou, 20 May 2024 (v2), AgentScope: A Flexible yet Robust Multi-Agent Platform, https://arxiv.org/abs/2402.14034 https://github.com/modelscope/agentscope
Michael Nuñez, November 4, 2024, UC San Diego, Tsinghua University researchers just made AI way better at knowing when to ask for help, https://venturebeat.com/ai/uc-san-diego-tsinghua-university-researchers-just-made-ai-way-better-at-knowing-when-to-ask-for-help/
Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Sarath Chandar, 14 Apr 2024, Towards Practical Tool Usage for Continually Learning LLMs, https://arxiv.org/abs/2404.09339
Amy Marks, Jun 11, 2024, Clarifying Function Calling / Tool Use in LLMs, https://medium.com/@aevalone/clarifying-function-calling-tool-use-in-llms-6511af510f99
Bohan Lyu, Yadi Cao, Duncan Watson-Parris, Leon Bergen, Taylor Berg-Kirkpatrick, Rose Yu, 1 Nov 2024, Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation, https://arxiv.org/abs/2411.00412
Anthropic, 26 Nov 2024, Introducing the Model Context Protocol, https://www.anthropic.com/news/model-context-protocol
Varatheepan Paramanayakam, Andreas Karatzas, Iraklis Anagnostopoulos, Dimitrios Stamoulis, 23 Nov 2024, Less is More: Optimizing Function Calling for LLM Execution on Edge Devices, https://arxiv.org/abs/2411.15399
Soh, J., Singh, P. (2024). Semantic Kernel, Plugins, and Function Calling. In: Data Science Solutions on Azure. Apress, Berkeley, CA. https://doi.org/10.1007/979-8-8688-0914-9_7 https://link.springer.com/chapter/10.1007/979-8-8688-0914-9_7
Chris Sypherd, Vaishak Belle, 5 Dec 2024, Practical Considerations for Agentic LLM Systems, https://arxiv.org/abs/2412.04093
Zhi-Yuan Chen, Shiqi Shen, Guangyao Shen, Gong Zhi, Xu Chen, and Yankai Lin. 2024. Towards Tool Use Alignment of Large Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1382–1400, Miami, Florida, USA. Association for Computational Linguistics. https://aclanthology.org/2024.emnlp-main.82/ https://aclanthology.org/2024.emnlp-main.82.pdf
Damien de Mijolla, Wen Yang, Philippa Duckett, Christopher Frye, Mark Worrall, 8 Dec 2024, Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt, https://arxiv.org/abs/2412.05967
In Gim, Seung-seob Lee, Lin Zhong, 9 Dec 2024, Asynchronous LLM Function Calling, https://arxiv.org/abs/2412.07017 (Overlap LLM computations and tool execution.)
Outlore, Dec 14, 2024, Reflections on building with Model Context Protocol (MCP), https://outlore.dev/blog/model-context-protocol/
Andrew Zuo, Dec 13, 2024, AI Assistants Are Going To Get Really Good, https://andrewzuo.com/ai-assistants-are-going-to-get-really-good-d6e6a026e588
Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen, 18 Dec 2024, Deploying Foundation Model Powered Agent Services: A Survey, https://arxiv.org/abs/2412.13437 (A survey of not just deployment, but many inference optimization techniques.)
Qwen: An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tingyu Xia, Xingzhang Ren, Xuancheng Ren, Yang Fan, Yang Su, Yichang Zhang, Yu Wan, Yuqiong Liu, Zeyu Cui, Zhenru Zhang, Zihan Qiu (additional authors not shown), 19 Dec 2024, Qwen2.5 Technical Report, https://arxiv.org/abs/2412.15115
Dian Yu, Yuheng Zhang, Jiahao Xu, Tian Liang, Linfeng Song, Zhaopeng Tu, Haitao Mi, Dong Yu, 22 Dec 2024, Teaching LLMs to Refine with Tools, https://arxiv.org/abs/2412.16871
Xiangjue Dong, Maria Teleki, James Caverlee, 18 Dec 2024, A Survey on LLM Inference-Time Self-Improvement, https://arxiv.org/abs/2412.14352 https://github.com/dongxiangjue/Awesome-LLM-Self-Improvement (Broad survey of reasoning improvement methods from multi-step inference to RALM to decoding algorithms.)
Florian Dietz, Dietrich Klakow, 1 Jan 2025, IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently, https://arxiv.org/abs/2501.00684
Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn, 8 Jan 2025, Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought, https://arxiv.org/abs/2501.04682
Julia Wiesinger, Patrick Marlow and Vladimir Vuskovic, Sep 2024, Agents, Google Whitepaper, https://www.kaggle.com/whitepaper-agents
S. Song et al., 2025, "How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2025.3527978. https://ieeexplore.ieee.org/abstract/document/10841938/
Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen, 21 Feb 2024 (v4), ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, https://arxiv.org/abs/2309.17452
Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Yujia Qin, Yining Ye, Yaxi Lu, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun, 28 Dec 2023, GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension, https://arxiv.org/abs/2312.17294
Tong Xiao, Jingbo Zhu, 16 Jan 2025, Foundations of Large Language Models, https://arxiv.org/abs/2501.09223 (Huge 230 page paper on many topics such as training, prompting, alignment, and long context.)
Xinzhe Li, Jan 2025, A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning, Proceedings of the 31st International Conference on Computational Linguistics, pages 9760–9779, January 19–24, 2025. ©2025 Association for Computational Linguistics, https://aclanthology.org/2025.coling-main.652.pdf https://github.com/xinzhel/LLM-Agent-Survey
Connor Shorten, Charles Pierse, Thomas Benjamin Smith, Karel D'Oosterlinck, Tuana Celik, Erika Cardenas, Leonie Monigatti, Mohd Shukri Hasan, Edward Schmuhl, Daniel Williams, Aravind Kesiraju, Bob van Luijt, 23 Jan 2025, Querying Databases with Function Calling, https://arxiv.org/abs/2502.00032
Jiali Cheng, Hadi Amiri, 3 Feb 2025. Tool Unlearning for Tool-Augmented LLMs, https://arxiv.org/abs/2502.01083 (Unlearning theory applied to tool usage.)
Wenjun Li, Dexun Li, Kuicai Dong, Cong Zhang, Hao Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Liu, 18 Feb 2025, Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger, https://arxiv.org/abs/2502.12961 (Examining the decision whether or not to launch a tool, and the inefficiency of non-needed tool calls.)
C Winston, R Just, Feb 2025, A Taxonomy of Failures in Tool-Augmented LLMs, https://homes.cs.washington.edu/~rjust/publ/tallm_testing_ast_2025.pdf
Xuan Zhang, Yongliang Shen, Zhe Zheng, Linjuan Wu, Wenqi Zhang, Yuchen Yan, Qiuying Peng, Jun Wang, Weiming Lu, 3 Mar 2025, AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification, https://arxiv.org/abs/2503.01940
Hongshen Xu, Zihan Wang, Zichen Zhu, Lei Pan, Xingyu Chen, Lu Chen, Kai Yu, 9 Mar 2025, Alignment for Efficient Tool Calling of Large Language Models, https://arxiv.org/abs/2503.06708
Anthropic, 14 Mar 2025, Token-saving updates on the Anthropic API, https://www.anthropic.com/news/token-saving-updates (Prompt caching, excluding cached responses from rate limits, and token-efficient tool calling.)
Mengsong Wu, Tong Zhu, Han Han, Xiang Zhang, Wenbiao Shao, Wenliang Chen, 21 Mar 2025, Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models, https://arxiv.org/abs/2503.16779 https://github.com/fairyshine/Chain-of-Tools
Ali Forootani, 22 Mar 2025, A Survey on Mathematical Reasoning and Optimization with Large Language Models, https://arxiv.org/abs/2503.17726
Aiyao He, Sijia Cui, Shuai Xu, Yanna Wang, Bo Xu, 13 May 2025, TUMS: Enhancing Tool-use Abilities of LLMs with Multi-structure Handlers, https://arxiv.org/abs/2505.08402
Xu Huang, Yuefeng Huang, Weiwen Liu, Xingshan Zeng, Yasheng Wang, Ruiming Tang, Hong Xie, Defu Lian, 7 May 2025, Advancing and Benchmarking Personalized Tool Invocation for LLMs, https://arxiv.org/abs/2505.04072 https://github.com/hyfshadow/PTBench
Wang et al., 2025, Function Calling in Large Language Models: Industrial Practices, Challenges, and Future Directions, https://openreview.net/pdf?id=LNxVGPedFW
Cameron R. Wolfe, Ph.D., Jun 09, 2025, AI Agents from First Principles: Understanding AI agents by building upon the most basic concepts of LLMs, https://cameronrwolfe.substack.com/p/ai-agents
Beong-woo Kwak, Minju Kim, Dongha Lim, Hyungjoo Chae, Dongjin Kang, Sunghwan Kim, Dongil Yang, Jinyoung Yeo, 29 May 2025, ToolHaystack: Stress-Testing Tool-Augmented Language Models in Realistic Long-Term Interactions, https://arxiv.org/abs/2505.23662 https://github.com/bwookwak/ToolHaystack
Dex Horthy, June 2025 (accessed), 12-Factor Agents - Principles for building reliable LLM applications, https://github.com/humanlayer/12-factor-agents?tab=readme-ov-file
Bohan Yao, Vikas Yadav, 25 Jul 2025, A Toolbox, Not a Hammer -- Multi-TAG: Scaling Math Reasoning with Multi-Tool Aggregation, https://arxiv.org/abs/2507.18973 (Launch multiple tools and aggregate the results.)
Bin Wu, Edgar Meij, Emine Yilmaz, Aug 2025, A Joint Optimization Framework for Enhancing Efficiency of Tool Utilization in LLM Agents, Findings of the Association for Computational Linguistics: ACL 2025, pages 22361–22373 July 27- August 1, 2025, https://aclanthology.org/2025.findings-acl.1149.pdf
Zijing Zhang, Zhanpeng Chen, He Zhu, Ziyang Chen, Nan Du, Xiaolong Li, Aug 2025, ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks, Findings of the Association for Computational Linguistics: ACL 2025, pages 15706–15722 July 27- August 1, 2025, https://aclanthology.org/2025.findings-acl.811.pdf
Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, Shenghua Liu, 21 Jul 2025 (v2), A Survey of Context Engineering for Large Language Models, https://arxiv.org/abs/2507.13334
Xu, W., Huang, C., Gao, S. et al. LLM-Based Agents for Tool Learning: A Survey. Data Sci. Eng. (2025). https://doi.org/10.1007/s41019-025-00296-9 https://link.springer.com/article/10.1007/s41019-025-00296-9
Yan Jiang, Hao Zhou, LiZhong GU, Ai Han, TianLong Li, 24 Jun 2025, NaviAgent: Bilevel Planning on Tool Dependency Graphs for Function Calling, https://arxiv.org/abs/2506.19500
Xiaoyu Tan, Bin Li, Xihe Qiu, Chao Qu, Wei Chu, Yinghui Xu, and Yuan Qi. 2025. Meta-Agent-Workflow: Streamlining Tool Usage in LLMs through Workflow Construction, Retrieval, and Refinement. In Companion Proceedings of the ACM on Web Conference 2025 (WWW '25). Association for Computing Machinery, New York, NY, USA, 458–467. https://doi.org/10.1145/3701716.3715247 https://dl.acm.org/doi/abs/10.1145/3701716.3715247
Sebastian Nicolas Müller, May 23, 2025, Infinite tool use, https://snimu.github.io/2025/05/23/infinite-tool-use.html
J Vigel, R Cai, ECA Neema, A Liao, K Zhu, S O'Brien, 2025, Self Knowledge-Tracing for Tool Use (SKT-Tool): Helping LLM Agents Understand Their Capabilities in Tool Use, NAACL2025, The 5th Workshop on Insights from Negative Results in NLP, https://aclanthology.org/anthology-files/anthology-files/pdf/insights/2025.insights-1.pdf#page=155
Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji, 16 Apr 2025, ToolRL: Reward is All Tool Learning Needs, https://arxiv.org/abs/2504.13958
Geoffrey Huntley, AGENT.md: The Universal Agent Configuration File, July 2025, Request for Comments, https://ampcode.com/AGENT.md
Ningning Wang, Xavier Hu, Pai Liu, He Zhu, Yue Hou, Heyuan Huang, Shengyu Zhang, Jian Yang, Jiaheng Liu, Ge Zhang, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou, 24 Jul 2025, Efficient Agents: Building Effective Agents While Reducing Cost, https://arxiv.org/pdf/2508.02694 https://github.com/OPPO-PersonalAI/OAgents
Peter Wildeford, Aug 08, 2025, GPT-5: a small step for intelligence, a giant leap for normal people: GPT-5 focuses on where the money is - everyday users, not AI elites, https://peterwildeford.substack.com/p/gpt-5-a-small-step-for-intelligence
Anoop Kotha, Julian Lee, Eric Zakariasson, OpenAI, Aug 2025, GPT-5 prompting guide, https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide
Will Fein, Ryan J. Horwitz, John E. Brown III, Amit Misra, Felipe Oviedo, Kevin White, Juan M. Lavista Ferres, Samuel K. Wasser, 13 Aug 2025, AI-Driven Detection and Analysis of Handwriting on Seized Ivory: A Tool to Uncover Criminal Networks in the Illicit Wildlife Trade, https://arxiv.org/abs/2508.10219
Muhammad Ahmad, Fida Ullah, Muhammad Usman, Ildar Batyrshin, Grigori Sidorov, 12 Aug 2025, SABIA: An AI-Powered Tool for Detecting Opioid-Related Behaviors on Social Media, https://arxiv.org/abs/2508.10046
Xingshan Zeng, Weiwen Liu, Xu Huang, Zezhong Wang, Lingzhi Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Ruiming Tang, Qun Liu, 14 Aug 2025, ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning, https://arxiv.org/abs/2504.01400
Athanasios Davvetas, Xenia Ziouvelou, Ypatia Dami, Alexis Kaponis, Konstantina Giouvanopoulou, Michael Papademas, 23 Jul 2025, TAI Scan Tool: A RAG-Based Tool With Minimalistic Input for Trustworthy AI Self-Assessment, https://arxiv.org/abs/2507.17514
Zhao Song, Song Yue, Jiahao Zhang, 23 Jul 2025, Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations, https://arxiv.org/abs/2507.17699
Arduin Findeis, Floris Weers, Guoli Yin, Ke Ye, Ruoming Pang, Tom Gunter, 22 Jul 2025, Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge?, https://arxiv.org/abs/2507.17015
Po-Yen Wu, Cheng-Yu Kuo, Yuki Kadokawa, and Takamitsu Matsubara, 23 Jul 2025, Prolonging Tool Life: Learning Skillful Use of General-purpose Tools through Lifespan-guided Reinforcement Learning, https://arxiv.org/abs/2507.17275
Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo, Jiahao Li, Bin Li, Houqiang Li, Yan Lu, 23 Jul 2025, Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding, https://arxiv.org/abs/2505.18079
Timothy Tin Long Yu, Mahdi Mostajabdaveh, Jabo Serge Byusa, Rindra Ramamonjison, Giuseppe Carenini, Kun Mao, Zirui Zhou, Yong Zhang, 23 Jul 2025, SMARTAPS: Tool-augmented LLMs for Operations Management, https://arxiv.org/abs/2507.17927
Alex Liu, Lief Esbenshade, Shawon Sarkar, Victor Tian, Zachary Zhang, Kevin He, Min Sun, 23 Jul 2025, Decoding Instructional Dialogue: Human-AI Collaborative Analysis of Teacher Use of AI Tool at Scale, https://arxiv.org/abs/2507.17985
Haozhe Wang, Long Li, Chao Qu, Fengming Zhu, Weidi Xu, Wei Chu, Fangzhen Lin, 18 Jul 2025, To Code or not to Code? Adaptive Tool Integration for Math Language Models via Expectation-Maximization, https://arxiv.org/abs/2502.00691
Jiale Liu, Huan Wang, Yue Zhang, Xiaoyu Luo, Jiaxiang Hu, Zhiliang Liu, Min Xie, 20 Jul 2025, InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis, https://arxiv.org/abs/2507.14899
Richard M. Charles, James H. Curry and Richard B. Charles, 15 Jul 2025, Mitigating Trojanized Prompt Chains in Educational LLM Use Cases: Experimental Findings and Detection Tool Design, https://arxiv.org/abs/2507.14207
Qian Xiong and Yuekai Huang and Ziyou Jiang and Zhiyuan Chang and Yujia Zheng and Tianhao Li and Mingyang Li, 21 Jul 2025, Butterfly Effects in Toolchains: A Comprehensive Analysis of Failed Parameter Filling in LLM Tool-Agent Systems, https://arxiv.org/abs/2507.15296
Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey, Aniket Abhishek Soni, 19 Jul 2025, Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation, https://arxiv.org/abs/2506.11092
Shiqing Fan, Xichen Ding, Liang Zhang, Linjian Mo, 11 Aug 2025, MCPToolBench++: A Large Scale AI Agent Model Context Protocol MCP Tool Use Benchmark, https://arxiv.org/abs/2508.07575
Wenpeng Xing, Zhipeng Chen, Changting Lin, Meng Han, 11 Aug 2025, HGMF: A Hierarchical Gaussian Mixture Framework for Scalable Tool Invocation within the Model Context Protocol, https://arxiv.org/abs/2508.07602
Luyao Zhuang, Qinggang Zhang, Huachi Zhou, Juhua Liu, Qing Li, Xiao Huang, 11 Aug 2025, LoSemB: Logic-Guided Semantic Bridging for Inductive Tool Retrieval, https://arxiv.org/abs/2508.07690
Keyan Ding, Jing Yu, Junjie Huang, Yuchen Yang, Qiang Zhang, Huajun Chen, 27 Jul 2025, SciToolAgent: A Knowledge Graph-Driven Scientific Agent for Multi-Tool Integration, https://arxiv.org/abs/2507.20280
Vicente Ramos (1), Sundous Hussein (1), Mohamed Abdel-Hafiz (1), Arunangshu Sarkar (2), Weixuan Liu (2), Katerina J. Kechris (2), Russell P. Bowler (3), Leslie Lange (4), Farnoush Banaei-Kashani (1) ((1) Department of Computer Science and Engineering, University of Colorado Denver, Denver, USA, (2) Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, USA, (3) Genomic Medicine Institute, Cleveland Clinic, Cleveland, USA, (4) Division of Biomedical Informatics and Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, USA), 27 Jul 2025, BioNeuralNet: A Graph Neural Network based Multi-Omics Network Data Analysis Tool, https://arxiv.org/abs/2507.20440
Nicola Croce, Tobin South, 26 Jul 2025, Trivial Trojans: How Minimal MCP Servers Enable Cross-Tool Exfiltration of Sensitive Data, https://arxiv.org/abs/2507.19880
Harsh Purohit, Tomoya Nishida, Kota Dohi, Takashi Endo, and Yohei Kawaguchi, 28 Jul 2025, MIMII-Agent: Leveraging LLMs with Function Calling for Relative Evaluation of Anomalous Sound Detection, https://arxiv.org/abs/2507.20666
Nicholas Botti (Federal Reserve Board), Flora Haberkorn (Federal Reserve Board), Charlotte Hoopes (Federal Reserve Board), Shaun Khan (Federal Reserve Board), 28 Jul 2025, Efficacy of AI RAG Tools for Complex Information Extraction and Data Annotation Tasks: A Case Study Using Banks Public Disclosures, https://arxiv.org/abs/2507.21360
Sergio Rojas-Galeano, 26 Jun 2025, Tool or Trouble? Exploring Student Attitudes Toward AI Coding Assistants, https://arxiv.org/abs/2507.22900
Yiya Diao, Changhe Li, Sanyou Zeng, Xinye Cai, Wenjian Luo, Shengxiang Yang, and Carlos A. Coello Coello, 30 Jul 2025, Nearest-Better Network for Visualizing and Analyzing Combinatorial Optimization Problems: A Unified Tool, https://arxiv.org/abs/2507.22440
Hongjin Qian, Zheng Liu, 1 Aug 2025, MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning, https://arxiv.org/abs/2508.00271
Amur Ghose, Andrew B. Kahng, Sayak Kundu, and Zhiang Wang, 1 Aug 2025, ORFS-agent: Tool-Using Agents for Chip Design Optimization, https://arxiv.org/abs/2506.08332
Eric Hirsch and Christian Friedrich, 31 Jul 2025, Data-driven tool wear prediction in milling, based on a process-integrated single-sensor approach, https://arxiv.org/abs/2412.19950
Michelle S. Lam, Fred Hohman, Dominik Moritz, Jeffrey P. Bigham, Kenneth Holstein, Mary Beth Kery, 1 Aug 2025, Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors, https://arxiv.org/abs/2409.18203
Guozhao Mo, Wenliang Zhong, Jiawei Chen, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun, 3 Aug 2025, LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?, https://arxiv.org/abs/2508.01780
Kanghua Mo, Li Hu, Yucheng Long, Zhihao Li, 4 Aug 2025, Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools, https://arxiv.org/abs/2508.02110
Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz, 4 Aug 2025, Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application, https://arxiv.org/abs/2508.02560
Ashutosh Hathidara, Julien Yu, Sebastian Schreiber, 4 Aug 2025, Disambiguation-Centric Finetuning Makes Enterprise Tool-Calling LLMs More Realistic and Less Risky, https://arxiv.org/abs/2507.03336
Sunil Kumar, Bowen Zhao, Leo Dirac, Paulina Varshavskaya, 2 Aug 2025, Reinforcing VLMs to Use Tools for Detailed Visual Reasoning Under Resource Constraints, https://arxiv.org/abs/2506.14821
Peng Ding, Rick Stevens, 5 Aug 2025, Unified Tool Integration for LLMs: A Protocol-Agnostic Approach to Function Calling, https://arxiv.org/abs/2508.02979
Zikun Cui, Tianyi Huang, Chia-En Chiang, Cuiqianhe Du, 5 Aug 2025, Toward Verifiable Misinformation Detection: A Multi-Tool LLM Agent Framework, https://arxiv.org/abs/2508.03092
Shaofeng Yin, Ting Lei, Yang Liu, 5 Aug 2025, ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools, https://arxiv.org/abs/2508.03284
Khaled Bachir Delassi (1), Lakhdar Zeggane (1), Hadda Cherroun (1), Abdelhamid Haouhat (1), Kaoutar Bouzouad (2) ((1) LIM Lab, Amar Telidji University, Laghouat, Algeria, (2) Computer Science Dept., USTHB, Algiers, Algeria), 5 Aug 2025, VQA support to Arabic Language Learning Educational Tool, https://arxiv.org/abs/2508.03488
Zexiong Ma, Chao Peng, Qunhong Zeng, Pengfei Gao, Yanzhen Zou, Bing Xie, 5 Aug 2025, Tool-integrated Reinforcement Learning for Repo Deep Search, https://arxiv.org/abs/2508.03012
Peng Ding, 11 Jul 2025, ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs, https://arxiv.org/abs/2507.10593
Si Chen, Izzy Molnar, Ting Hua, Peiyu Li, Le Huy Khiem, G. Alex Ambrose, Jim Lang, Ronald Metoyer, Nitesh V. Chawla, 6 Aug 2025, SimInstruct: A Responsible Tool for Collecting Scaffolding Dialogues Between Experts and LLM-Simulated Novices, https://arxiv.org/abs/2508.04428
Manuela Schuler, 6 Aug 2025, A Visual Tool for Interactive Model Explanation using Sensitivity Analysis, https://arxiv.org/abs/2508.04269
Zhejun Zhao, Yuehu Dong, Alley Liu, Lixue Zheng, Pingsheng Liu, Dongdong Shen, Long Xia, Jiashu Zhao, Dawei Yin, 6 Aug 2025, TURA: Tool-Augmented Unified Retrieval Agent for AI Search, https://arxiv.org/abs/2508.04604
Natalia Echeverry and Arun Lekshmi Narayanan, 6 Aug 2025, How are CS students using resources and AI tools for coding tasks?, https://arxiv.org/abs/2508.04667
Rafael Salinas-Buestan, Otto Parra, Nelly Condori-Fernandez, Maria Fernanda Granda, 22 Jul 2025, Evaluating Generative AI Tools for Personalized Offline Recommendations: A Comparative Study, https://arxiv.org/abs/2508.03710
Linfeng Gao, Yaoxiang Wang, Minlong Peng, Jialong Tang, Yuzhe Shang, Mingming Sun, Jinsong Su, 7 Aug 2025, Tool Graph Retriever: Exploring Dependency Graph-based Tool Retrieval for Large Language Models, https://arxiv.org/abs/2508.05152
Hannah-Beth Clark, Laura Benton, Emma Searle, Margaux Dowland, Matthew Gregory, Will Gayne and John Roberts, 7 Aug 2025, Building Effective Safety Guardrails in AI Education Tools, https://arxiv.org/abs/2508.05360
Sahil Bansal, Sai Shruthi Sistla, Aarti Arikatala, Sebastian Schreiber, 7 Aug 2025, Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning, https://arxiv.org/abs/2508.05888
Chandler Campbell, Bernie Boscoe, Tuan Do, 25 Jul 2025, AquiLLM: a RAG Tool for Capturing Tacit Knowledge in Research Groups, https://arxiv.org/abs/2508.05648
Jiaxuan Liang, Shide Zhou, and Kailong Wang, 26 Jul 2025, OmniBench-RAG: A Multi-Domain Evaluation Platform for Retrieval-Augmented Generation Tools, https://arxiv.org/abs/2508.05650
Xianghe Pang, Shuo Tang, Rui Ye, Yuwen Du, Yaxin Du, Siheng Chen, 12 Aug 2025, BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair, https://arxiv.org/abs/2508.09129
Junjie Ye, Changhao Jiang, Zhengyin Du, Yufei Xu, Xuesong Yao, Zhiheng Xi, Xiaoran Fan, Qi Zhang, Xuanjing Huang, Jiecao Chen, 12 Aug 2025, Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments, https://arxiv.org/abs/2508.08791
Jiawei Zhou, Amy Z. Chen, Darshi Shah, Laura M. Schwab Reese, and Munmun De Choudhury, 11 Aug 2025, A Risk Taxonomy and Reflection Tool for Large Language Model Adoption in Public Health, https://arxiv.org/abs/2411.02594
Yanming Liu, Xinyue Peng, Jiannan Cao, Yuwei Zhang, Xuhong Zhang, Sheng Cheng, Xun Wang, Jianwei Yin, Tianyu Du, 15 Aug 2025, Tool-Planner: Task Planning with Clusters across Multiple Tools, https://arxiv.org/abs/2406.03807
Wenjie Chen, Wenbin Li, Di Yao, Xuying Meng, Chang Gong, Jingping Bi, 18 Aug 2025, GTool: Graph Enhanced Tool Planning with Large Language Model, https://arxiv.org/abs/2508.12725
Guangfu Hao, Haojie Wen, Liangxuan Guo, Yang Chen, Yanchao Bi, Shan Yu, 18 Aug 2025, Flexible Tool Selection through Low-dimensional Attribute Alignment of Vision and Language, https://arxiv.org/abs/2505.22146
Chao Tang, Anxing Xiao, Yuhong Deng, Tianrun Hu, Wenlong Dong, Hanbo Zhang, David Hsu, Hong Zhang, 19 Aug 2025, MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence, https://arxiv.org/abs/2508.13534
Wenxin Jiang, Mingyu Kim, Chingwo Cheung, Heesoo Kim, George K. Thiruvathukal, James C. Davis, 18 Aug 2025, "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventions and A Tool for Enhancing Naming Consistency, https://arxiv.org/abs/2310.01642
Zhongzhou Chen, 20 Aug 2025, Reliable generation of isomorphic physics problems using ChatGPT with prompt-chaining and tool use, https://arxiv.org/abs/2508.14755
Lixiang Yan, 20 Aug 2025, From Passive Tool to Socio-cognitive Teammate: A Conceptual Framework for Agentic AI in Human-AI Collaborative Learning, https://arxiv.org/abs/2508.14825
Hengyu An, Jinghuai Zhang, Tianyu Du, Chunyi Zhou, Qingming Li, Tao Lin, Shouling Ji, 21 Aug 2025, IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents, https://arxiv.org/abs/2508.15310
Yufeng Zhao, Junnan Liu, Hongwei Liu, Dongsheng Zhu, Yuan Shen, Songyang Zhang, Kai Chen, 21 Aug 2025, Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis, https://arxiv.org/abs/2508.15754
Zhiqiang Wang, Yichao Gao, Yanting Wang, Suyuan Liu, Haifeng Sun, Haoran Cheng, Guanquan Shi, Haohua Du, Xiangyang Li, 19 Aug 2025, MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers, https://arxiv.org/abs/2508.14925
Vishnou Vinayagame, Gregory Senay, and Luis Martí, 20 Aug 2025, MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications, https://arxiv.org/abs/2411.18915
Fei Lei, Yibo Yang, Wenxiu Sun, Dahua Lin, 22 Aug 2025, MCPVerse: An Expansive, Real-World Benchmark for Agentic Tool Use, https://arxiv.org/abs/2508.16260
Eduardo de Conto, Blaise Genest, Arvind Easwaran, Nicholas Ng, Shweta Menon, 25 Aug 2025, DesCartes Builder: A Tool to Develop Machine-Learning Based Digital Twins, https://arxiv.org/abs/2508.17988
Thao Le, Tim Miller, Ruihan Zhang, Liz Sonenberg, Ronal Singh, 25 Aug 2025, Visual Evaluative AI: A Hypothesis-Driven Tool with Concept-Based Explanations and Weight of Evidence, https://arxiv.org/abs/2407.04710
Bingguang Hao, Maolin Wang, Zengzhuang Xu, Yicheng Chen, Cunyin Peng, Jinjie Gu, Chenyi Zhuang, 7 Aug 2025, Exploring Superior Function Calls via Reinforcement Learning, https://arxiv.org/abs/2508.05118
Alejandro Álvarez Castro and Joaquín Ordieres-Meré, 25 Aug 2025, Multimodal Proposal for an AI-Based Tool to Increase Cross-Assessment of Messages, https://arxiv.org/abs/2509.03529
Safouane El Ghazouali, Umberto Michelucci, 4 Sep 2025, VisioFirm: Cross-Platform AI-assisted Annotation Tool for Computer Vision, https://arxiv.org/abs/2509.04180
Aditya Mittal, Taha Abdullah, Arjun Ashok, Brandon Zarate Estrada, Shubhada Martha, Billy Ouattara, Jonathan Tran, and Norman Matloff, 4 Sep 2025, dsld: A Socially Relevant Tool for Teaching Statistics, https://arxiv.org/abs/2411.04228
Justin Lin and Julia Fukuyama, 4 Sep 2025, An Interactive Tool for Analyzing High-Dimensional Clusterings, https://arxiv.org/abs/2509.04603
Weikang Zhao, Xili Wang, Chengdi Ma, Lingbin Kong, Zhaohua Yang, Mingxiang Tuo, Xiaowei Shi, Yitao Zhai, Xunliang Cai, 26 Aug 2025, MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use, https://arxiv.org/abs/2508.18669
Heng Lin and Zhongwen Xu, 26 Aug 2025, Understanding Tool-Integrated Reasoning, https://arxiv.org/abs/2508.19201
Dimitrios Mallis, Ahmet Serdar Karadeniz, Sebastian Cavada, Danila Rukhovich, Niki Foteinopoulou, Kseniya Cherenkova, Anis Kacem, Djamila Aouada, 26 Aug 2025, CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers, https://arxiv.org/abs/2412.13810
Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao Shi, Jianping Fan, Zhengyin Du, 26 Aug 2025, TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use, https://arxiv.org/abs/2412.15495
Serena Hughes, Timothy Hamilton, Tom Kolokotrones and Eric J. Deeds, 26 Aug 2025, DeepAtlas: a tool for effective manifold learning, https://arxiv.org/abs/2508.19479
Sam Houliston and Ambroise Odonnat and Charles Arnal and Vivien Cabannes, 28 Aug 2025, Provable Benefits of In-Tool Learning for Large Language Models, https://arxiv.org/abs/2508.20755
Lev Tankelevitch, Elena L. Glassman, Jessica He, Aniket Kittur, Mina Lee, Srishti Palani, Advait Sarkar, Gonzalo Ramos, Yvonne Rogers, Hari Subramonyam, 28 Aug 2025, Understanding, Protecting, and Augmenting Human Cognition with Generative AI: A Synthesis of the CHI 2025 Tools for Thought Workshop, https://arxiv.org/abs/2508.21036
Dayu Wang, Jiaye Yang, Weikang Li, Jiahui Liang, Yang Li, 28 Aug 2025, MSARL: Decoupling Reasoning and Tool Use with Multi-Small-Agent Reinforcement Learning, https://arxiv.org/abs/2508.08882
João Valente, Atabak Dehban, Rodrigo Ventura, 29 Aug 2025, CAD2DMD-SET: Synthetic Generation Tool of Digital Measurement Device CAD Model Datasets for fine-tuning Large Vision-Language Models, https://arxiv.org/abs/2508.21732
Maya Guhan (1), Meghan E. Hurley (1), Eric A. Storch (2), John Herrington (3), Casey Zampella (3), Julia Parish-Morris (3), Gabriel Lázaro-Muñoz (4), and Kristin Kostick-Quenet (1) ((1) Center for Ethics and Health Policy, Baylor College of Medicine, Houston, TX, USA, (2) Department of Psychiatry and Behavioral Sciences, Baylor College of Medicine, Houston, TX, USA, (3) Department of Child and Adolescent Psychiatry and Behavioral Sciences, Children's Hospital of Philadelphia, Philadelphia, PA, USA, (4) Center for Bioethics, Harvard Medical School, Boston, MA, USA), 29 Aug 2025, Developer Insights into Designing AI-Based Computer Perception Tools, https://arxiv.org/abs/2508.21733
Navid Aftabi, Abhishek Hanchate, Satish Bukkapatnam, and Dan Li, 29 Aug 2025, DynaMark: A Reinforcement Learning Framework for Dynamic Watermarking in Industrial Machine Tool Controllers, https://arxiv.org/abs/2508.21797
Jorge Saldivar, Anna Gatzioura, Carlos Castillo, 28 Aug 2025, Synthetic CVs To Build and Test Fairness-Aware Hiring Tools, https://arxiv.org/abs/2508.21179
Dongfu Jiang, Yi Lu, Zhuofeng Li, Zhiheng Lyu, Ping Nie, Haozhe Wang, Alex Su, Hui Chen, Kai Zou, Chao Du, Tianyu Pang, Wenhu Chen, 1 Sep 2025, VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use, https://arxiv.org/abs/2509.01055
Caterina Fuster-Barcelo, Gonzalo R. Rios-Munoz, and Arrate Munoz-Barrutia, 2 Sep 2025, Scaffolding Collaborative Learning in STEM: A Two-Year Evaluation of a Tool-Integrated Project-Based Methodology, https://arxiv.org/abs/2509.02355
Zhenghai Xue, Longtao Zheng, Qian Liu, Yingru Li, Xiaosen Zheng, Zejun Ma, Bo An, 2 Sep 2025, SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning, https://arxiv.org/abs/2509.02479
Seungkyu Lee, Nalim Kim, Yohan Jo, 1 Sep 2025, In-N-Out: A Parameter-Level API Graph Dataset for Tool Agents, https://arxiv.org/abs/2509.01560
Sheng Ye, Jiyu Li, Yifan Chai, Lin Liu, Murugesu Sivapalan, Qihua Ran, 2 Sep 2025, Using explainable artificial intelligence (XAI) as a diagnostic tool: An application for deducing hydrologic connectivity at watershed scale, https://arxiv.org/abs/2509.02127
Md Zahid Hasan, Guillermo Basulto-Elias, Jun Ha Chang, Sahuna Hallmark, Matthew Rizzo, Anuj Sharma, Soumik Sarkar, 31 Aug 2025, Driving as a Diagnostic Tool: Scenario-based Cognitive Assessment in Older Drivers from Driving Video, https://arxiv.org/abs/2507.05463
Md Hasebul Hasan, Mahir Labib Dihan, Mohammed Eunus Ali and Md Rizwan Parvez, 7 Sep 2025, MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration, https://arxiv.org/abs/2509.05933
Chuang Jiang (1), Mingyue Cheng (1), Xiaoyu Tao (1), Qingyang Mao (1), Jie Ouyang (1), Qi Liu (1) ((1) State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China), 8 Sep 2025, TableMind: An Autonomous Programmatic Agent for Tool-Augmented Table Reasoning, https://arxiv.org/abs/2509.06278
Isac Holm, 29 Aug 2025, VILOD: A Visual Interactive Labeling Tool for Object Detection, https://arxiv.org/abs/2509.05317
Abdollah Baghaei Daemei, 4 Sep 2025, Prototyping an AI-powered Tool for Energy Efficiency in New Zealand Homes, https://arxiv.org/abs/2509.05364
Andrej Orsula, Matthieu Geist, Miguel Olivares-Mendez, Carol Martinez, 5 Sep 2025, Learning Tool-Aware Adaptive Compliant Control for Autonomous Regolith Excavation, https://arxiv.org/abs/2509.05475
Yu Liu, Yuchong Xie, Mingyu Luo, Zesen Liu, Zhixiang Zhang, Kaikai Zhang, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She, 6 Sep 2025, Exploit Tool Invocation Prompt for Tool Behavior Hijacking in LLM-Based Agentic System, https://arxiv.org/abs/2509.05755
Muraam Abdel-Ghani, Mahmoud Ali, Mohamed Ali, Fatmaelzahraa Ahmed, Mohamed Arsalan, Abdulaziz Al-Ali, Shidin Balakrishnan, 7 Sep 2025, FASL-Seg: Anatomy and Tool Segmentation of Surgical Scenes, https://arxiv.org/abs/2509.06159
Ignacio de Gregorio Noblejas, September 14, 2025, Money-Printing AI Ideas: AI Businesses that Will Print Money, https://thewhitebox.beehiiv.com/p/money-printing-ai-ideas
Jiajun Chai, Guojun Yin, Zekun Xu, Chuhuai Yue, Yi Jia, Siyu Xia, Xiaohan Wang, Jiwen Jiang, Xiaoguang Li, Chengqi Dong, Hang He, Wei Lin, 31 Aug 2025, RLFactory: A Plug-and-Play Reinforcement Learning Post-Training Framework for LLM Multi-Turn Tool-Use, https://arxiv.org/abs/2509.06980
Zikang Guo, Benfeng Xu, Chiwei Zhu, Wentao Hong, Xiaorui Wang, Zhendong Mao, 10 Sep 2025, MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools, https://arxiv.org/abs/2509.09734
Jayachandu Bandlamudi, Ritwik Chaudhuri, Neelamadhav Gantayat, Sambit Ghosh, Kushal Mukherjee, Prerna Agarwal, Renuka Sindhgatta, Sameep Mehta, 12 Sep 2025, A Framework for Testing and Adapting REST APIs as LLM Tools, https://arxiv.org/abs/2504.15546
Natallia Kokash, Bernard de Bono and Tom Gillespie, 19 Sep 2025, Ontology Creation and Management Tools: the Case of Anatomical Connectivity, https://arxiv.org/abs/2509.15780
Yuchen Zhang, Mohammad Mohammadi Amiri, 19 Sep 2025, Toward Efficient Influence Function: Dropout as a Compression Tool, https://arxiv.org/abs/2509.15651
Shubham Kavane, Kajol Kulkarni, Harald Koestler, 17 Sep 2025, ChannelFlow-Tools: A Standardized Dataset Creation Pipeline for 3D Obstructed Channel Flows, https://arxiv.org/abs/2509.15236
Yabo Zhang, Yihan Zeng, Qingyun Li, Zhen Hu, Kavin Han, Wangmeng Zuo, 16 Sep 2025, Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use, https://arxiv.org/abs/2509.12867
Prerna Agarwal, Himanshu Gupta, Soujanya Soni, Rohith Vallam, Renuka Sindhgatta, Sameep Mehta, 15 Sep 2025, Automated Creation and Enrichment Framework for Improved Invocation of Enterprise APIs as Tools, https://arxiv.org/abs/2509.11626
Payam Latifi, 15 Sep 2025, Is 'Hope' a person or an idea? A pilot benchmark for NER: comparing traditional NLP tools and large language models on ambiguous entities, https://arxiv.org/abs/2509.12098
Haonan Chen, Cheng Zhu, Shuijing Liu, Yunzhu Li, Katherine Driggs-Campbell, 14 Sep 2025, Tool-as-Interface: Learning Robot Policies from Observing Human Tool Use, https://arxiv.org/abs/2504.04612
Zihao Feng, Xiaoxue Wang, Bowen Wu, Hailong Cao, Tiejun Zhao, Qun Yu, Baoxun Wang, 18 Sep 2025, ToolSample: Dual Dynamic Sampling Methods with Curriculum Learning for RL-based Tool Learning, https://arxiv.org/abs/2509.14718
Weiting Tan, Xinghua Qu, Ming Tu, Meng Ge, Andy T. Liu, Philipp Koehn, Lu Lu, 17 Sep 2025, Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents, https://arxiv.org/abs/2509.14480
Yating Lin, Zixuan Huang, Fan Yang, Dmitry Berenson, 18 Sep 2025, AnoF-Diff: One-Step Diffusion-Based Anomaly Detection for Forceful Tool Use, https://arxiv.org/abs/2509.15153
Sukhdeep Bal, Emma Colbourne, Jasmine Gan, Ludovica Griffanti, Taylor Hanayik, Nele Demeyere, Jim Davies, Sarah T Pendlebury, Mark Jenkinson, 8 Sep 2025, Validation of a CT-brain analysis tool for measuring global cortical atrophy in older patient cohorts, https://arxiv.org/abs/2509.08012
Francesco Blefari, Cristian Cosentino, Francesco Aurelio Pironti, Angelo Furfaro, Fabrizio Marozzo, 10 Sep 2025, CyberRAG: An Agentic RAG cyber attack classification and reporting tool, https://arxiv.org/abs/2507.02424
Anis Koubaa and Khaled Gabr, 14 Sep 2025, Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning, https://arxiv.org/abs/2509.13352
Harper Reed, Michael Sugimura, Angelo Zangari, 16 Sep 2025, AI Agents with Human-Like Collaborative Tools: Adaptive Strategies for Enhanced Problem-Solving, https://arxiv.org/abs/2509.13547
Qikai Chang, Zhenrong Zhang, Pengfei Hu, Jiefeng Ma, Yicheng Pan, Jianshu Zhang, Jun Du, Quan Liu, Jianqing Gao, 17 Sep 2025, THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning, https://arxiv.org/abs/2509.13761
Charlotte Beylier, Parvaneh Joharinad, Jürgen Jost, Nahid Torbati, 16 Sep 2025, Curvature as a tool for evaluating dimensionality reduction and estimating intrinsic dimension, https://arxiv.org/abs/2509.13385
Jianzhang Zhang, Jialong Zhou, Chuang Liu, 24 Sep 2025, OR-Toolformer: Modeling and Solving Operations Research Problems with Tool Augmented Large Language Models, https://arxiv.org/abs/2510.01253
Diptyaroop Maji, Kang Yang, Prashant Shenoy, Ramesh K Sitaraman, Mani Srivastava, 1 Oct 2025, CarbonX: An Open-Source Tool for Computational Decarbonization Using Time Series Foundation Models, https://arxiv.org/abs/2510.01521
Aadarsh Rajiv, Klaus Mueller, 27 Aug 2025, LegiScout: A Visual Tool for Understanding Complex Legislation, https://arxiv.org/abs/2510.01195
Tyler J Poore, Christopher J Pinard, Aleena Shabbir, Andrew Lagree, Andre Telfer, Kuan-Chuen Wu, 22 Sep 2025, Context Matters: Comparison of commercial large language tools in veterinary medicine, https://arxiv.org/abs/2510.01224
Yongchao Chen, Jiefeng Chen, Rui Meng, Ji Yin, Na Li, Chuchu Fan, Chi Wang, Tomas Pfister, Jinsung Yoon, 30 Sep 2025, TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture, https://arxiv.org/abs/2510.01279
Viraj Prabhu, Yutong Dai, Matthew Fernandez, Jing Gu, Krithika Ramakrishnan, Yanqi Luo, Silvio Savarese, Caiming Xiong, Junnan Li, Zeyuan Chen, Ran Xu, 1 Oct 2025, WALT: Web Agents that Learn Tools, https://arxiv.org/abs/2510.01524
Yaxin Du, Yuanshuo Zhang, Xiyuan Yang, Yifan Zhou, Cheng Wang, Gongyi Zou, Xianghe Pang, Wenhao Wang, Menglan Chen, Shuo Tang, Zhiyu Li, Siheng Chen, 2 Oct 2025, InfoMosaic-Bench: Evaluating Multi-Source Information Seeking in Tool-Augmented Agents, https://arxiv.org/abs/2510.02271
Hyunji Min, Sangwon Jung, Junyoung Sung, Dosung Lee, Leekyeung Han, Paul Hongsuck Seo, 14 Oct 2025, GOAT: A Training Framework for Goal-Oriented Agent with Tools, https://arxiv.org/abs/2510.12218
Ajith Anil Meera, Abian Torres and Pablo Lanillos, 14 Oct 2025, Designing Tools with Control Confidence, https://arxiv.org/abs/2510.12630
Xingang Guo, Utkarsh Tyagi, Advait Gosai, Paula Vergara, Ernesto Gabriel Hernández Montoya, Chen Bo Calvin Zhang, Bin Hu, Yunzhong He, Bing Liu, Rakshith Sharma Srinivasa, 14 Oct 2025, Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning, https://arxiv.org/abs/2510.12712
Quy Minh Le, Minh Sao Khue Luu, Khanh-Tung Tran, Duc-Hai Nguyen, Hoang-Quoc-Viet Pham, Quan Le, Hoang Thanh Lam, Hoang D. Nguyen, 24 Sep 2025, ToolBrain: A Flexible Reinforcement Learning Framework for Agentic Tools, https://arxiv.org/abs/2510.00023
Thierry Blankenstein, Jialin Yu, Zixuan Li, Vassilis Plachouras, Sunando Sengupta, Philip Torr, Yarin Gal, Alasdair Paren, Adel Bibi, 30 Sep 2025, BiasBusters: Uncovering and Mitigating Tool Selection Bias in Large Language Models, https://arxiv.org/abs/2510.00307
Cong Yu, Valter Uotila, Shilong Deng, Qingyuan Wu, Tuo Shi, Songlin Jiang, Lei You, Bo Zhao, 1 Oct 2025, QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL, https://arxiv.org/abs/2510.00967
Zhangchen Xu, Adriana Meza Soria, Shawn Tan, Anurag Roy, Ashish Sunil Agrawal, Radha Poovendran, Rameswar Panda, 1 Oct 2025, TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments, https://arxiv.org/abs/2510.01179
Daniele Bifolco, Guido Annicchiarico, Pierluigi Barbiero, Massimiliano Di Penta, Fiorella Zampetti, 1 Oct 2025, CodeGenLink: A Tool to Find the Likely Origin and License of Automatically Generated Code, https://arxiv.org/abs/2510.01077
Changhyun Jeon, Jinhee Park, Jungwoo Choi, Keonwoo Kim, Jisu Kim, Minji Hong, 19 Sep 2025, SLM-Based Agentic AI with P-C-G: Optimized for Korean Tool Use, https://arxiv.org/abs/2509.19369
Michael Sullivan, Mareike Hartmann, Alexander Koller, 24 Sep 2025, Procedural Environment Generation for Tool-Use Agents, https://arxiv.org/abs/2506.11045
Yuyang Liu, Xinyuan Shi, Xiaondan Liang, 24 Sep 2025, COLT: Enhancing Video Large Language Models with Continual Tool Usage, https://arxiv.org/abs/2509.18754
Wenhao Wang, Peizhi Niu, Zhao Xu, Zhaoyu Chen, Jian Du, Yaxin Du, Xianghe Pang, Keduan Huang, Yanfeng Wang, Qiang Yan, Siheng Chen, 28 Oct 2025, MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools, https://arxiv.org/abs/2510.24284
Zengzhuang Xu, Bingguang Hao, Zechuan Wang, Yuntao Wen, Maolin Wang, Yang Liu, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Chenyi Zhuang, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Shi Gu, 28 Oct 2025, FunReason-MT Technical Report: Overcoming the Complexity Barrier in Multi-Turn Function Calling, https://arxiv.org/abs/2510.24645
Yifu Lu, Shengjie Liu, Li Dong, 28 Oct 2025, OrchDAG: Complex Tool Orchestration in Multi-Turn Interactions with Plan DAGs, https://arxiv.org/abs/2510.24663
Shengjie Liu, Li Dong, Zhenyu Zhang, 28 Oct 2025, Bridging Tool Dependencies and Domain Knowledge: A Graph-Based Framework for In-Context Planning, https://arxiv.org/abs/2510.24690
Arpita Kundu, Joyita Chakraborty, Anindita Desarkar, Aritra Sen, Srushti Anil Patil and Vishwanathan Raman, 28 Oct 2025, V-SAT: Video Subtitle Annotation Tool, https://arxiv.org/abs/2510.24180
Roham Koohestani, Philippe de Bekker, Begüm Koç, Maliheh Izadi, 28 Oct 2025, Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Unified Approach for Elevating Benchmark Quality, https://arxiv.org/abs/2503.05860
ChangSu Choi, Hoyun Song, Dongyeon Kim, WooHyeon Jung, Minkyung Cho, Sunjin Park, NohHyeob Bae, Seona Yu, KyungTae Lim, 28 Oct 2025, MENTOR: A Reinforcement Learning Framework for Enabling Tool Use in Small Models via Teacher-Optimized Rewards, https://arxiv.org/abs/2510.18383
Mingliang Zhai, Hansheng Liang, Xiaomeng Fan, Zhi Gao, Chuanhao Li, Che Sun, Xu Bin, Yuwei Wu, Yunde Jia, 23 Oct 2025, Multi-Step Reasoning for Embodied Question Answering via Tool Augmentation, https://arxiv.org/abs/2510.20310
Neil Maiden, Konstantinos Zachos, James Lockerbie, Kostas Petrianakis, Amanda Brown, 23 Oct 2025, A computational model and tool for generating more novel opportunities in professional innovation processes, https://arxiv.org/abs/2510.20402
Fardin Ganjkhanloo, Emmett Springer, Erik H. Hoyer, Daniel L. Young, Kimia Ghobadi, 23 Oct 2025, Optimizing Clinical Fall Risk Prediction: A Data-Driven Integration of EHR Variables with the Johns Hopkins Fall Risk Assessment Tool, https://arxiv.org/abs/2510.20714
Chengpeng Li, Zhengyang Tang, Ziniu Li, Mingfeng Xue, Keqin Bao, Tian Ding, Ruoyu Sun, Benyou Wang, Xiang Wang, Junyang Lin and Dayiheng Liu, 23 Oct 2025, Teaching Language Models to Reason with Tools, https://arxiv.org/abs/2510.20342
Hassan Hamad, Yingru Xu, Liang Zhao, Wenbo Yan, Narendra Gyanchandani, 19 Oct 2025, ToolCritic: Detecting and Correcting Tool-Use Errors in Dialogue Systems, https://arxiv.org/abs/2510.17052
Kiran Kate, Yara Rizk, Poulami Ghosh, Ashu Gulati, Tathagata Chakraborti, Zidane Wright, Mayank Agarwal, 10 Oct 2025, How Good Are LLMs at Processing Tool Outputs?, https://arxiv.org/abs/2510.15955
Hy Dang, Tianyi Liu, Zhuofeng Wu, Jingfeng Yang, Haoming Jiang, Tao Yang, Pei Chen, Zhengyang Wang, Helen Wang, Huasheng Li, Bing Yin, Meng Jiang, 22 Sep 2025, Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates, https://arxiv.org/abs/2509.18076
Weihua Du, Hailei Gong, Zhan Ling, Kang Liu, Lingfeng Shen, Xuesong Yao, Yufei Xu, Dingyuan Shi, Yiming Yang, Jiecao Chen, 22 Sep 2025, Generalizable End-to-End Tool-Use RL with Synthetic CodeGym, https://arxiv.org/abs/2509.17325
Jeonghyun Lee, Jui-Tse Hung, Meryem Yilmaz Soylu, Diana Popescu, Christopher Zhang Cui, Gayane Grigoryan, David A Joyner, Stephen W Harmon, 18 Sep 2025, Socratic Mind: Impact of a Novel GenAI-Powered Assessment Tool on Student Learning and Higher-Order Thinking, https://arxiv.org/abs/2509.16262
Zhenlan Ji, Daoyuan Wu, Wenxuan Wang, Pingchuan Ma, Shuai Wang, Lei Ma, 18 Sep 2025, Digging Into the Internal: Causality-Based Analysis of LLM Function Calling, https://arxiv.org/abs/2509.16268
Kazem Faghih, Wenxiao Wang, Yize Cheng, Siddhant Bharti, Gaurang Sriramanan, Sriram Balasubramanian, Parsa Hosseini, Soheil Feizi, 21 Sep 2025, Tool Preferences in Agentic LLMs are Unreliable, https://arxiv.org/abs/2505.18135
Deuksin Kwon, Jiwon Hae, Emma Clift, Daniel Shamsoddini, Jonathan Gratch, Gale M. Lucas, 19 Sep 2025, ASTRA: A Negotiation Agent with Adaptive and Strategic Reasoning via Tool-integrated Action for Dynamic Offer Optimization, https://arxiv.org/abs/2503.07129
Vishvesh Bhat, Omkar Ghugarkar, Julian McAuley, 27 Oct 2025, On Generalization in Agentic Tool Calling: CoreThink Agentic Reasoner and MAVEN Dataset, https://arxiv.org/abs/2510.22898
Chenlong Yin, Zeyang Sha, Shiwen Cui, Changhua Meng, 27 Oct 2025, The Reasoning Trap: How Enhancing LLM Reasoning Amplifies Tool Hallucination, https://arxiv.org/abs/2510.22977
Ran Xu, Jingjing Chen, Jiayu Ye, Yu Wu, Jun Yan, Carl Yang, Hongkun Yu, 27 Oct 2025, Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning, https://arxiv.org/abs/2510.23038
Daoyu Wang, Mingyue Cheng, Qi Liu, Shuo Yu, Zirui Liu, Ze Guo, 27 Oct 2025, PaperArena: An Evaluation Benchmark for Tool-Augmented Agentic Reasoning on Scientific Literature, https://arxiv.org/abs/2510.10909
Qianben Chen, Jingyi Cao, Jiayu Zhang, Tianrui Qin, Xiaowan Li, King Zhu, Dingfeng Shi, He Zhu, Minghao Liu, Xiaobo Liang, Ge Zhang, Jian Yang, Yuchen Eleanor Jiang, Wangchunshu Zhou, 13 Oct 2025, A²FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning, https://arxiv.org/abs/2510.12838
Tajamul Ashraf, Umair Nawaz, Abdelrahman M. Shaker, Rao Anwer, Philip Torr, Fahad Shahbaz Khan, Salman Khan, 15 Oct 2025, MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning, https://arxiv.org/abs/2510.08567
Wentao Zhang, Liang Zeng, Yuzhen Xiao, Yongcong Li, Ce Cui, Yilei Zhao, Rui Hu, Yang Liu, Yahui Zhou, Bo An, 26 Sep 2025, AgentOrchestra: Orchestrating Hierarchical Multi-Agent Intelligence with the Tool-Environment-Agent(TEA) Protocol, https://arxiv.org/abs/2506.12508
Zhanke Zhou, Chentao Cao, Xiao Feng, Xuan Li, Zongze Li, Xiangyu Lu, Jiangchao Yao, Weikai Huang, Linrui Xu, Tian Cheng, Guanyu Jiang, Yiming Zheng, Brando Miranda, Tongliang Liu, Sanmi Koyejo, Masashi Sugiyama, Bo Han, 5 Oct 2025, AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning, https://arxiv.org/abs/2510.06261
Qi Guo, Jianing Wang, Jianfei Zhang, Deyang Kong, Xiangzhou Huang, Xiangyu Xi, Wei Wang, Jingang Wang, Xunliang Cai, Shikun Zhang, and Wei Ye, 8 Oct 2025, Autoformalizer with Tool Feedback, https://arxiv.org/abs/2510.06857
Wenxun Wu, Yuanyang Li, Guhan Chen, Linyue Wang, Hongyang Chen, 8 Oct 2025, Tool-Augmented Policy Optimization: Synergizing Reasoning and Adaptive Tool Use with Reinforcement Learning, https://arxiv.org/abs/2510.07038
Tian Qin, Felix Bai, Ting-Yao Hu, Raviteja Vemulapalli, Hema Swetha Koppula, Zhiyang Xu, Bowen Jin, Mert Cemri, Jiarui Lu, Zirui Wang, Meng Cao, 8 Oct 2025, COMPASS: A Multi-Turn Benchmark for Tool-Mediated Planning & Preference Optimization, https://arxiv.org/abs/2510.07043
Suchismita Naik, Austin L. Toombs, Amanda Snellinger, Scott Saponas, Amanda K. Hall, 10 Sep 2025, Exploring Human-AI Collaboration Using Mental Models of Early Adopters of Multi-Agent Generative AI Tools, https://arxiv.org/abs/2510.06224
Aleksi Huotala, Miikka Kuutila, Olli-Pekka Turtio, Mika Mäntylä, 8 Oct 2025, AISysRev - LLM-based Tool for Title-abstract Screening, https://arxiv.org/abs/2510.06708
Yi Han, Cheng Chi, Enshen Zhou, Shanyu Rong, Jingkun An, Pengwei Wang, Zhongyuan Wang, Lu Sheng, Shanghang Zhang, 8 Oct 2025, TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics, https://arxiv.org/abs/2510.07181
Wonjoong Kim, Sangwu Park, Yeonjun In, Sein Kim, Dongha Lee, Chanyoung Park, 3 Oct 2025, Beyond the Final Answer: Evaluating the Reasoning Trajectories of Tool-Augmented Agents, https://arxiv.org/abs/2510.02837
Jonathan Sneh, Ruomei Yan, Jialin Yu, Philip Torr, Yarin Gal, Sunando Sengupta, Eric Sommerlade, Alasdair Paren, Adel Bibi, 2 Oct 2025, ToolTweak: An Attack on Tool Selection in LLM-based Agents, https://arxiv.org/abs/2510.02554
Bo Ma, Hang Li, ZeHua Hu, XiaoFan Gui, LuYao Liu, Simon Liu, 3 Oct 2025, AgenticRAG: Tool-Augmented Foundation Models for Zero-Shot Explainable Recommender Systems, https://arxiv.org/abs/2510.02668
Zongze Wu, Yani Guo, Churong Liang, Runnan Li, 10 Oct 2025, GRETEL: A Goal-driven Retrieval and Execution-based Trial Framework for LLM Tool Selection Enhancing, https://arxiv.org/abs/2510.17843
Jason Tsay, Zidane Wright, Gaodan Fang, Kiran Kate, Saurabh Jha, Yara Rizk, 17 Oct 2025, Repairing Tool Calls Using Post-tool Execution Reflection and RAG, https://arxiv.org/abs/2510.17874
Monika Zamojska and Jarosław A. Chudziak, 19 Oct 2025, TACLA: An LLM-Based Multi-Agent Tool for Transactional Analysis Training in Education, https://arxiv.org/abs/2510.17913
Juhyeong Kim, Yejin Kim, Youngbin Lee and Hyunwoo Byun, 21 Oct 2025, FinAI Data Assistant: LLM-based Financial Database Query Processing with the OpenAI Function Calling API, https://arxiv.org/abs/2510.14162
Kaiwen He, Zhiwei Wang, Chenyi Zhuang, Jinjie Gu, 25 Sep 2025, Recon-Act: A Self-Evolving Multi-Agent Browser-Use System via Web Reconnaissance, Tool Generation, and Task Execution, https://arxiv.org/abs/2509.21072
Nishant Gaurav, Adit Akarsh, Ankit Ranjan, Manoj Bajaj, 22 Sep 2025, Dynamic ReAct: Scalable Tool Selection for Large-Scale MCP Environments, https://arxiv.org/abs/2509.20386
Caleb DeLeeuw, Gaurav Chawla, Aniket Sharma, Vanessa Dietze, 23 Sep 2025, The Secret Agenda: LLMs Strategically Lie and Our Current Safety Tools Are Blind, https://arxiv.org/abs/2509.20393
Ping He, Changjiang Li, Binbin Zhao, Tianyu Du, Shouling Ji, 25 Sep 2025, Automatic Red Teaming LLM-based Agents with Model Context Protocol Tools, https://arxiv.org/abs/2509.21011
Junhao Su, Yuanliang Wan, Junwei Yang, Hengyu Shi, Tianyang Han, Junfeng Luo, Yurui Qiu, 25 Sep 2025, Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions, https://arxiv.org/abs/2509.18847
Yifei Chen, Guanting Dong, Zhicheng Dou, 27 Sep 2025, Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning, https://arxiv.org/abs/2509.23285
Ningning Xu, Yuxuan Jiang, Shubhashis Roy Dipta, 27 Sep 2025, Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning, https://arxiv.org/abs/2509.23292
Gyubok Lee, Woosog Chay, Heeyoung Kwak, Yeong Hwa Kim, Haanju Yoo, Oksoon Jeong, Meong Hi Son, Edward Choi, 27 Sep 2025, From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents, https://arxiv.org/abs/2509.23415
Pavan C Shekar, Ashwanth Krishnan, 17 Oct 2025, Adaptive Minds: Empowering Agents with LoRA-as-Tools, https://arxiv.org/abs/2510.15416
Pengfei He, Zhenwei Dai, Bing He, Hui Liu, Xianfeng Tang, Hanqing Lu, Juanhui Li, Jiayuan Ding, Subhabrata Mukherjee, Suhang Wang, Yue Xing, Jiliang Tang, Benoit Dumoulin, 6 Oct 2025, TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use, https://arxiv.org/abs/2510.04550
C. Coelho, M. Hohmann, D. Fernández, L. Penter, S. Ihlenfeldt, O. Niggemann, 26 Sep 2025, Data-Driven Temperature Modelling of Machine Tools by Neural Networks: A Benchmark, https://arxiv.org/abs/2510.03261
Jehyeok Yeon, Isha Chaudhary, Gagandeep Singh, 5 Oct 2025, Quantifying Distributional Robustness of Agentic Tool-Selection, https://arxiv.org/abs/2510.03992
Ankit Vadehra, Bill Johnson, Gene Saunders, Pascal Poupart, 5 Oct 2025, Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation, https://arxiv.org/abs/2510.04394
Ha Min Son, Huan Ren, Xin Liu, Zhe Zhao, 9 Oct 2025, Automating Android Build Repair: Bridging the Reasoning-Execution Gap in LLM Agents with Domain-Specific Tools, https://arxiv.org/abs/2510.08640
Daphne Theodorakopoulos, Elisabeth Eberling, Miriam Bodenheimer, Sabine Loos and Frederic Stahl, 24 Oct 2025, FITS: Towards an AI-Driven Fashion Information Tool for Sustainability, https://arxiv.org/abs/2509.26017
Amartya Chakraborty, Paresh Dashore, Nadia Bathaee, Anmol Jain, Anirban Das, Shi-Xiong Zhang, Sambit Sahu, Milind Naphade, Genta Indra Winata, 23 Oct 2025, T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning, https://arxiv.org/abs/2505.16986
Brij Bidhin Desai, Yukta Arvind Rajapur, Aswathi Mundayatt and Jaya Sreevalsan-Nair, 24 Oct 2025, CityAQVis: Integrated ML-Visualization Sandbox Tool for Pollutant Estimation in Urban Regions Using Multi-Source Data (Software Article), https://arxiv.org/abs/2510.18878
Shengjie Ma, Chenlong Deng, Jiaxin Mao, Jiadeng Huang, Teng Wang, Junjie Wu, Changwang Zhang, Jun Wang, 13 Oct 2025, PoU: Proof-of-Use to Counter Tool-Call Hacking in DeepResearch Agents, https://arxiv.org/abs/2510.10931
Kartikeya Aneja, Nagender Aneja, Murat Kantarcioglu, 11 Oct 2025, Learning Joint Embeddings of Function and Process Call Graphs for Malware Detection, https://arxiv.org/abs/2510.09984
Zhengyu Chen, Jinluan Yang, Teng Xiao, Ruochen Zhou, Luan Zhang, Xiangyu Xi, Xiaowei Shi, Wei Wang, Jinggang Wang, 13 Oct 2025, Can Tool-Integrated Reinforcement Learning Generalize Across Diverse Domains?, https://arxiv.org/abs/2510.11184
Kuan-Yi Lee, Tsung-En Lin, Hung-Yi Lee, 13 Oct 2025, Audio-Maestro: Enhancing Large Audio-Language Models with Tool-Augmented Reasoning, https://arxiv.org/abs/2510.11454
Manaal Basha, Aimeê M. Ribeiro, Jeena Javahar, Cleidson R. B. de Souza, Gema Rodríguez-Pérez, 13 Oct 2025, CodeWatcher: IDE Telemetry Data Extraction Tool for Understanding Coding Interactions with LLMs, https://arxiv.org/abs/2510.11536
Xuanqi Gao, Siyi Xie, Juan Zhai, Shiqing Ma, Chao Shen, 12 Oct 2025, MCP-RADAR: A Multi-Dimensional Benchmark for Evaluating Tool Use Capabilities in Large Language Models, https://arxiv.org/abs/2505.16700
Murong Yue, Zhiwei Liu, Liangwei Yang, Jianguo Zhang, Zuxin Liu, Haolin Chen, Ziyu Yao, Silvio Savarese, Caiming Xiong, Shelby Heinecke, Huan Wang, 9 Oct 2025, ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning, https://arxiv.org/abs/2510.07768
Fu Chen, Peng Wang, Xiyin Li, Wen Li, Shichi Lei, Dongdong Xiang, 9 Oct 2025, ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs, https://arxiv.org/abs/2510.07737
Nikolai Skripko, 22 Sep 2025, Instruction-Following Evaluation in Function Calling for Large Language Models, https://arxiv.org/abs/2509.18420
Jia-Kai Dong, I-Wei Huang, Chun-Tin Wu, Yi-Tien Tsai, 22 Oct 2025, MSC-Bench: A Rigorous Benchmark for Multi-Server Tool Orchestration, https://arxiv.org/abs/2510.19423
Irene Testini, José Hernández-Orallo, Lorenzo Pacchiardi, 22 Oct 2025, Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents, https://arxiv.org/abs/2506.08800
Junhyeong Lee, Joon-Young Kim, Heekyu Kim, Inhyo Lee and Seunghwa Ryu, 22 Oct 2025, IM-Chat: A Multi-agent LLM Framework Integrating Tool-Calling and Diffusion Modeling for Knowledge Transfer in Injection Molding Industry, https://arxiv.org/abs/2507.15268
Sri Vatsa Vuddanti, Aarav Shah, Satwik Kumar Chittiprolu, Tony Song, Sunishchal Dev, Kevin Zhu, Maheep Chaudhary, 25 Sep 2025, PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases, https://arxiv.org/abs/2509.25238
Jing-Jing Li, Jianfeng He, Chao Shang, Devang Kulshreshtha, Xun Xian, Yi Zhang, Hang Su, Sandesh Swamy, Yanjun Qi, 30 Sep 2025, STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents, https://arxiv.org/abs/2509.25624
Assem Omar, Youssef Omar, Marwa Solayman, Hesham Mansour, 30 Sep 2025, Comparative Analysis of Ant Colony Optimization and Google OR-Tools for Solving the Open Capacitated Vehicle Routing Problem in Logistics, https://arxiv.org/abs/2509.26216
Xiao Zhang, Qi Wang, Mingyi Li, Yuan Yuan, Mengbai Xiao, Fuzhen Zhuang, and Dongxiao Yu, 30 Sep 2025, TAMO: Fine-Grained Root Cause Analysis via Tool-Assisted LLM Agent with Multi-Modality Observation Data in Cloud-Native Systems, https://arxiv.org/abs/2504.20462
Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, and Pan Lu, 7 Oct 2025, In-the-Flow Agentic System Optimization for Effective Planning and Tool Use, https://arxiv.org/abs/2510.05592
Jiaru Zou, Soumya Roy, Vinay Kumar Verma, Ziyi Wang, David Wipf, Pan Lu, Sumit Negi, James Zou, Jingrui He, 7 Oct 2025, TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning, https://arxiv.org/abs/2510.06217
Jianghao Lin, Yuanyuan Shi, Xin Peng, Renjie Ding, Hairui Wang, Yuxuan Peng, Bizhe Bai, Weixi Song, Fengshuo Bai, Huacan Chai, Weinan Zhang, Fei Huang, Ying Wen, 16 Oct 2025, ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling, https://arxiv.org/abs/2510.14703
Eran Malach, Omid Saremi, Sinead Williamson, Arwen Bradley, Aryo Lotfi, Emmanuel Abbe, Josh Susskind, Etai Littwin, 16 Oct 2025, To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models, https://arxiv.org/abs/2510.14826
Jiawei Xu, Chia Xin Liang, Ziqian Bi, Xiaoming Li, Danyang Zhang, Zhenyu Yu, Dec 2025, A Comprehensive Survey on Large Language Models: From Pre-training to Autonomous Agents, https://www.researchgate.net/profile/Ziqian_Bi/publication/399059225_A_Comprehensive_Survey_on_Large_Language_Models_From_Pre-training_to_Autonomous_Agents/links/694c94a07e61d05b5312836f/A-Comprehensive-Survey-on-Large-Language-Models-From-Pre-training-to-Autonomous-Agents.pdf
AI News, Mar 05, 2026, Is Harness Engineering real? https://www.latent.space/p/ainews-is-harness-engineering-real