Aussie AI

Large Reasoning Models

  • Last Updated 17 November, 2025
  • by David Spuler, Ph.D.

Research on Large Reasoning Models

Research papers include:

  • Ignacio de Gregorio, Dec 2024, Uncovering OpenAI’s Frontier AI Strategy, https://medium.com/@ignacio.de.gregorio.noblejas/uncovering-openais-frontier-ai-strategy-a02e0aa5320e
  • Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, Zhicheng Dou, 9 Jan 2025, Search-o1: Agentic Search-Enhanced Large Reasoning Models, https://arxiv.org/abs/2501.05366 https://github.com/sunnynexus/Search-o1 (RAG retrieval and agentic methods applied to Large Reasoning Models.)
  • Fengli Xu, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi Wang, Xiaochong Lan, Jiahui Gong, Tianjian Ouyang, Fanjin Meng, Chenyang Shao, Yuwei Yan, Qinglong Yang, Yiwen Song, Sijian Ren, Xinyuan Hu, Yu Li, Jie Feng, Chen Gao, Yong Li, 17 Jan 2025 (v2), Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, https://arxiv.org/abs/2501.09686
  • OpenAI, September 12, 2024, Learning to reason with LLMs, https://openai.com/index/learning-to-reason-with-llms/ (Introduces OpenAI o1, a large language model trained with reinforcement learning to perform complex reasoning; o1 thinks before it answers, producing a long internal chain of thought before responding to the user.)
  • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back, 16 Jul 2024, Reasoning with Large Language Models, a Survey, https://arxiv.org/abs/2407.11511
  • Jie Huang and Kevin Chen-Chuan Chang. July 2023. Towards Reasoning in Large Language Models: A Survey. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1049–1065, Toronto, Canada. Association for Computational Linguistics. https://aclanthology.org/2023.findings-acl.67/
  • Seungpil Lee, Woochang Sim, Donghyeon Shin, Wongyu Seo, Jiwon Park, Seokki Lee, Sanha Hwang, Sejin Kim, and Sundong Kim. Jan 2025. Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus. ACM Trans. Intell. Syst. Technol. https://doi.org/10.1145/3712701 https://dl.acm.org/doi/10.1145/3712701 https://dl.acm.org/doi/pdf/10.1145/3712701
  • Demis Hassabis, Jan 2025, X post: Announcing Gemini 2.0 Flash https://x.com/demishassabis/status/1881844417746632910 (Gemini 2.0 Flash from Google is a Large Reasoning Model with a 1M ultra-long context.)
  • Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nyczyk, Patrick Iff, Yueling Li, Sam Houliston, Tomasz Sternal, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Łukasz Flis, Hannes Eberhard, Hubert Niewiadomski, Torsten Hoefler, 23 Jan 2025 (v3), Reasoning Language Models: A Blueprint, https://arxiv.org/abs/2501.11223 (Survey and blueprint for how to build a Large Reasoning Model.)
  • Alberto Romero, Jan 2025, DeepSeek, a little-known Chinese startup, released R1 yesterday, https://substack.com/@thealgorithmicbridge/note/c-87664591-
  • DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z.F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, Hanwei Xu, Haocheng Wang, Honghui Ding, Huajian Xin, Huazuo Gao, Hui Qu, Hui Li, Jianzhong Guo, et al. (100+ additional authors not shown), 22 Jan 2025, DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning, https://arxiv.org/abs/2501.12948 (The DeepSeek R1 large reasoning model.)
  • G Wang, S Zhang, T Zhan, Z Shen, J Li, X Hu, X Sun, Jan 2025, Unlocking the Mysteries of OpenAI o1: A Survey of the Reasoning Abilities of Large Language Models, https://openreview.net/pdf?id=J0ADLa2rNp
  • Ben Dickson, January 31, 2025, Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks, https://venturebeat.com/ai/beyond-benchmarks-how-deepseek-r1-and-o1-perform-on-real-world-tasks/
  • Deqian Kong, Minglu Zhao, Dehong Xu, Bo Pang, Shu Wang, Edouardo Honig, Zhangzhang Si, Chuan Li, Jianwen Xie, Sirui Xie, Ying Nian Wu, 3 Feb 2025, Scalable Language Models with Posterior Inference of Latent Thought Vectors, https://arxiv.org/abs/2502.01567
  • Ahmed El-Kishky, Alexander Wei, Andre Saraiva, Borys Minaev, Daniel Selsam, David Dohan, Francis Song, Hunter Lightman, Ignasi Clavera, Jakub Pachocki, Jerry Tworek, Lorenz Kuhn, Lukasz Kaiser, Mark Chen, Max Schwarzer, Mostafa Rohaninejad, Nat McAleese, o3 contributors, Oleg Mürk, Rhythm Garg, Rui Shu, Szymon Sidor, Vineet Kosaraju, Wenda Zhou, 3 Feb 2025, Competitive Programming with Large Reasoning Models, https://arxiv.org/abs/2502.06807 (OpenAI's paper on o3 that has similar conclusions to what DeepSeek showed about Reinforcement Learning for reasoning models, namely that "scaling general-purpose reinforcement learning" still works.)
  • DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng, 5 Feb 2025. Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning, https://arxiv.org/abs/2502.03275
  • Cameron R. Wolfe, Feb 18, 2025, Demystifying Reasoning Models: Understanding reasoning models and their relation to standard LLMs... https://cameronrwolfe.substack.com/p/demystifying-reasoning-models
  • Jeremy Kahn, February 28, 2025, OpenAI launches long-awaited GPT-4.5 — but ‘Orion’s’ capabilities already lag competitors, https://fortune.com/2025/02/27/openai-gpt-4-5-orion-launch-sam-altman-benchmarks/
  • Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Phillip H.S. Torr, Salman Khan, Fahad Shahbaz Khan, 28 Feb 2025, LLM Post-Training: A Deep Dive into Reasoning Large Language Models, https://arxiv.org/abs/2502.21321 https://github.com/mbzuai-oryx/Awesome-LLM-Post-training
  • Asif Razzaq, March 5, 2025, Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Tasks, https://www.marktechpost.com/2025/03/05/qwen-releases-qwq-32b-a-32b-reasoning-model-that-achieves-significantly-enhanced-performance-in-downstream-task/ (Features 32B parameters, 32K context length, 64 layers, RoPE, SwiGLU, RMSNorm, and attention enhancements.)
  • Parshin Shojaee, Maxwell Horton, Iman Mirzadeh, Samy Bengio, Keivan Alizadeh, June 2025, The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity, Apple, https://machinelearning.apple.com/research/illusion-of-thinking https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
  • Dr. Ashish Bamania, June 2025, Apple’s New Research Shows That LLM Reasoning Is Completely Broken: A deep dive into Apple research that exposes the flawed thinking process in state-of-the-art Reasoning LLMs, https://ai.gopubby.com/apples-new-research-shows-that-llm-reasoning-is-completely-broken-47b5be71a06a
  • Ryan Browne, Jun 10 2025, Microsoft-backed AI lab Mistral is launching its first reasoning model in challenge to OpenAI, https://www.cnbc.com/2025/06/10/microsoft-backed-ai-lab-mistral-debuts-reasoning-model-to-rival-openai.html (Mistral's new LRM has multilingual reasoning.)
  • Bowen Ding, Yuhan Chen, Futing Wang, Lingfeng Ming, Tao Lin, 30 Jun 2025, Do Thinking Tokens Help or Trap? Towards More Efficient Large Reasoning Model, https://arxiv.org/abs/2506.23840
  • Bin Hong, Jiayu Liu, Zhenya Huang, Kai Zhang, Mengdi Zhang, 13 Aug 2025, Pruning Long Chain-of-Thought of Large Reasoning Models via Small-Scale Preference Optimization, https://arxiv.org/abs/2508.10164
  • Zhipeng Chen, Xiaobo Qin, Youbin Wu, Yue Ling, Qinghao Ye, Wayne Xin Zhao, Guang Shi, 14 Aug 2025, Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models, https://arxiv.org/abs/2508.10751
  • Datta Nimmaturi, Vaishnavi Bhargava, Rajat Ghosh, Johnu George, Debojyoti Dutta, 24 Jul 2025, Predictive Scaling Laws for Efficient GRPO Training of Large Reasoning Models, https://arxiv.org/abs/2507.18014
  • Kaiwen Chen, Xin Tan, Minchen Yu, Hong Xu, 29 Jul 2025, MemShare: Memory Efficient Inference for Large Reasoning Models through KV Cache Reuse, https://arxiv.org/abs/2507.21433
  • Tao He, Rongchuan Mu, Lizi Liao, Yixin Cao, Ming Liu, and Bing Qin, 31 Jul 2025, Good Learners Think Their Thinking: Generative PRM Makes Large Reasoning Model More Efficient Math Learner, https://arxiv.org/abs/2507.23317
  • Dadi Guo, Jiayu Liu, Zhiyuan Fan, Zhitao He, Haoran Li, Yumeng Wang, Yi R. Fung, 31 Jul 2025, Mathematical Proof as a Litmus Test: Revealing Failure Modes of Advanced Large Reasoning Models, https://arxiv.org/abs/2506.17114
  • Linan Yue, Yichao Du, Yizhi Wang, Weibo Gao, Fangzhou Yao, Li Wang, Ye Liu, Ziyu Xu, Qi Liu, Shimin Di, Min-Ling Zhang, 4 Aug 2025, Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models, https://arxiv.org/abs/2508.02120
  • Yule Liu, Jingyi Zheng, Zhen Sun, Zifan Peng, Wenhan Dong, Zeyang Sha, Shiwen Cui, Weiqiang Wang, Xinlei He, 4 Aug 2025, Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models, https://arxiv.org/abs/2504.13626
  • Junhong Wu, Jinliang Lu, Zixuan Ren, Ganqiang Hu, Zhi Wu, Dai Dai, Hua Wu, 5 Aug 2025, LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models, https://arxiv.org/abs/2508.03440
  • Yuan Xun, Xiaojun Jia, Xinwei Liu, Hua Zhang, 6 Aug 2025, The Emotional Baby Is Truly Deadly: Does your Multimodal Large Reasoning Model Have Emotional Flattery towards Humans?, https://arxiv.org/abs/2508.03986
  • Rui Ha, Chaozhuo Li, Rui Pu, Sen Su, 6 Aug 2025, From "Aha Moments" to Controllable Thinking: Toward Meta-Cognitive Reasoning in Large Reasoning Models via Decoupled Reasoning and Control, https://arxiv.org/abs/2508.04460
  • Thilo Hagendorff, Erik Derner, Nuria Oliver, 4 Aug 2025, Large Reasoning Models Are Autonomous Jailbreak Agents, https://arxiv.org/abs/2508.04039
  • Yuquan Wang, Mi Zhang, Yining Wang, Geng Hong, Xiaoyu You, Min Yang, 6 Aug 2025, ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments, https://arxiv.org/abs/2508.04204
  • Yongjiang Liu, Haoxi Li, Xiaosong Ma, Jie Zhang, Song Guo, 6 Aug 2025, Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models, https://arxiv.org/abs/2507.02663
  • Youcheng Huang, Bowen Qin, Chen Huang, Duanyu Feng, Xi Yang, Wenqiang Lei, 15 Aug 2025, Beyond Solving Math Quiz: Evaluating the Ability of Large Reasoning Models to Ask for Information, https://arxiv.org/abs/2508.11252
  • Nuo Chen, Zhiyuan Hu, Qingyun Zou, Jiaying Wu, Qian Wang, Bryan Hooi, Bingsheng He, 20 Aug 2025, JudgeLRM: Large Reasoning Models as a Judge, https://arxiv.org/abs/2504.00050
  • Haonan Dong, Haoran Ye, Wenhao Zhu, Kehan Jiang, Guojie Song, 24 Aug 2025, Meta-R1: Empowering Large Reasoning Models with Metacognition, https://arxiv.org/abs/2508.17291
  • Yi Liu, Xiangyu Liu, Zequn Sun, Wei Hu, 26 Aug 2025, Answering the Unanswerable Is to Err Knowingly: Analyzing and Mitigating Abstention Failures in Large Reasoning Models, https://arxiv.org/abs/2508.18760
  • Microsoft, 17 Sep, 2025, GPT-5 vs GPT-4.1: choosing the right model for your use case https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/how-to/model-choice-guide
  • Zhengxiang Cheng, Dongping Chen, Mingyang Fu, Tianyi Zhou, 11 Sep 2025, Optimizing Length Compression in Large Reasoning Models, https://arxiv.org/abs/2506.14755
  • Kaiyan Zhang, Yuxin Zuo, Bingxiang He, Youbang Sun, Runze Liu, Che Jiang, Yuchen Fan, Kai Tian, Guoli Jia, Pengfei Li, Yu Fu, Xingtai Lv, Yuchen Zhang, Sihang Zeng, Shang Qu, Haozhan Li, Shijie Wang, Yuru Wang, Xinwei Long, Fangfu Liu, Xiang Xu, Jiaze Ma, Xuekai Zhu, Ermo Hua, Yihao Liu, Zonglin Li, Huayu Chen, Xiaoye Qu, Yafu Li, Weize Chen, Zhenzhao Yuan, Junqi Gao, Dong Li, Zhiyuan Ma, Ganqu Cui, Zhiyuan Liu, Biqing Qi, Ning Ding, Bowen Zhou, 18 Sep 2025, A Survey of Reinforcement Learning for Large Reasoning Models, https://arxiv.org/abs/2509.08827
  • Yanlong Wang, Jian Xu, Fei Ma, Hongkang Zhang, Hang Yu, Tiantian Gao, Yu Wang, Haochen You, Shao-Lun Huang, Danny Dongning Sun, Xiao-Ping Zhang, 10 Sep 2025, FinZero: Launching Multi-modal Financial Time Series Forecast with Large Reasoning Model, https://arxiv.org/abs/2509.08742
  • Nan Zhang, Eugene Kwek, Yusen Zhang, Ngoc-Hieu Nguyen, Prasenjit Mitra, Rui Zhang, 2 Oct 2025, When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models, https://arxiv.org/abs/2504.02010
  • Jingcong Liang, Shijun Wan, Xuehai Wu, Siyuan Wang, Yitong Li, Qianglong Chen, Duyu Tang, Zhongyu Wei, 14 Oct 2025, HardcoreLogic: Challenging Large Reasoning Models with Long-tail Logic Puzzle Games, https://arxiv.org/abs/2510.12563
  • Yujian Zhang, Keyu Chen, Zhifeng Shen, Ruizhi Qiao, Xing Sun, 14 Oct 2025, Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning, https://arxiv.org/abs/2510.10207
  • Bowen Qin, Chen Yue, Fang Yin, Hui Wang, JG Yao, Jiakang Liu, Jing-Shu Zheng, Miguel Hu Chen, Richeng Xuan, Shibei Meng, Shiqi Zhou, Teng Dai, Tong-Shuai Ren, Wei Cui, Xi Yang, Xialin Du, Xiaojing Xu, Xue Sun, Xuejing Li, Yaming Liu, Yesheng Liu, Ying Liu, Yonghua Lin, Yu Zhao, Yunduo Zhang, Yuwen Luo, Zheqi He, Zhiyuan He, Zhongyuan Wang, 14 Oct 2025, FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions, https://arxiv.org/abs/2509.17177
  • Tsung-Han Wu, Mihran Miroyan, David M. Chan, Trevor Darrell, Narges Norouzi, Joseph E. Gonzalez, 14 Oct 2025, Are Large Reasoning Models Interruptible?, https://arxiv.org/abs/2510.11713
  • ShengYun Peng, Eric Smith, Ivan Evtimov, Song Jiang, Pin-Yu Chen, Hongyuan Zhan, Haozhu Wang, Duen Horng Chau, Mahesh Pasupuleti, Jianfeng Chi, 1 Oct 2025, Large Reasoning Models Learn Better Alignment from Flawed Thinking, https://arxiv.org/abs/2510.00938
  • Gouki Minegishi, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo, 1 Oct 2025, Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties, https://arxiv.org/abs/2506.05744
  • Hans Peter Lynsgøe Raaschou-Jensen, Constanza Fierro, Anders Søgaard, 1 Oct 2025, Towards a Progress Bar for Reasoning: Progress Prediction in Large Reasoning Models, https://arxiv.org/abs/2506.23274
  • Tommaso Green, Martin Gubri, Haritz Puerto, Sangdoo Yun, Seong Joon Oh, 1 Oct 2025, Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers, https://arxiv.org/abs/2506.15674
  • Runzhe Zhan, Zhihong Huang, Xinyi Yang, Lidia S. Chao, Min Yang, Derek F. Wong, 23 Oct 2025, Are Large Reasoning Models Good Translation Evaluators? Analysis and Performance Boost, https://arxiv.org/abs/2510.20780
  • Zhehao Zhang, Weijie Xu, Shixian Cui, Chandan K. Reddy, 17 Oct 2025, Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense, https://arxiv.org/abs/2510.16259
  • Mohan Zhang, Yihua Zhang, Jinghan Jia, Zhangyang Wang, Sijia Liu, Tianlong Chen, 12 Oct 2025, One Token Embedding Is Enough to Deadlock Your Large Reasoning Model, https://arxiv.org/abs/2510.15965
  • Giacomo Camposampiero, Michael Hersche, Roger Wattenhofer, Abu Sebastian, Abbas Rahimi, 20 Oct 2025, I-RAVEN-X: Benchmarking Generalization and Robustness of Analogical and Mathematical Reasoning in Large Language and Reasoning Models, https://arxiv.org/abs/2510.17496
  • G M Shahariar, Ali Nazari, Erfan Shayegani, Nael Abu-Ghazaleh, 25 Oct 2025, Modeling Hierarchical Thinking in Large Reasoning Models, https://arxiv.org/abs/2510.22437
  • Hoang Phan, Xianjun Yang, Kevin Yao, Jingyu Zhang, Shengjie Bi, Xiaocheng Tang, Madian Khabsa, Lijuan Liu, Deren Lei, 24 Oct 2025, Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models, https://arxiv.org/abs/2510.21978
  • Changyi Li, Jiayi Wang, Xudong Pan, Geng Hong, Min Yang, 15 Oct 2025, ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models, https://arxiv.org/abs/2505.17244
  • Rubing Yang, Huajun Bai, Song Liu, Guanghua Yu, Runzhi Fan, Yanbin Dang, Jiejing Zhang, Kai Liu, Jianchen Zhu, Peng Chen, 21 Oct 2025, SpecExit: Accelerating Large Reasoning Model via Speculative Exit, https://arxiv.org/abs/2509.24248
  • Yi Lu, Jianing Wang, Linsen Guo, Wei He, Hongyin Tang, Tao Gui, Xuanjing Huang, Xuezhi Cao, Wei Wang, Xunliang Cai, 21 Oct 2025, R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?, https://arxiv.org/abs/2510.08189
  • Junjie Zhang, Guozheng Ma, Shunyu Liu, Haoyu Wang, Jiaxing Huang, Ting-En Lin, Fei Huang, Yongbin Li, Dacheng Tao, 25 Sep 2025, A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models, https://arxiv.org/abs/2506.18485
  • Jinyi Han, Ying Huang, Ying Liao, Zishang Jiang, Xikun Lu, Haiquan Zhao, Xinyi Wang, Guanghao Zhou, Sihang Jiang, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao, 27 Sep 2025, Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking, https://arxiv.org/abs/2509.23392
  • Guanxu Chen, Yafu Li, Yuxian Jiang, Chen Qian, Qihan Ren, Jingyi Yang, Yu Cheng, Dongrui Liu, Jing Shao, 28 Sep 2025, Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models, https://arxiv.org/abs/2509.23962
  • Yuhui Wang, Changjiang Li, Guangke Chen, Jiacheng Liang, Ting Wang, 29 Sep 2025, Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models, https://arxiv.org/abs/2509.24156
  • Zihao Zhu, Xinyu Wu, Gehan Hu, Siwei Lyu, Ke Xu, Baoyuan Wu, 29 Sep 2025, AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models, https://arxiv.org/abs/2509.24269
  • Yichi Zhang, Yue Ding, Jingwen Yang, Tianwei Luo, Dongbai Li, Ranjie Duan, Qiang Liu, Hang Su, Yinpeng Dong, Jun Zhu, 29 Sep 2025, Towards Safe Reasoning in Large Reasoning Models via Corrective Intervention, https://arxiv.org/abs/2509.24393
  • Qingjie Zhang, Yujia Fu, Yang Wang, Liu Yan, Tao Wei, Ke Xu, Minlie Huang, Han Qiu, 29 Sep 2025, On the Self-awareness of Large Reasoning Models' Capability Boundaries, https://arxiv.org/abs/2509.24711
  • Yuyang Sha, Hongxin Pan, Gang Luo, Caijuan Shi, Jing Wang, Kefeng Li, 29 Sep 2025, MDD-Thinker: Towards Large Reasoning Models for Major Depressive Disorder Diagnosis, https://arxiv.org/abs/2509.24217
  • Yuxian Jiang, Yafu Li, Guanxu Chen, Dongrui Liu, Yu Cheng, Jing Shao, 29 Sep 2025, Rethinking Entropy Regularization in Large Reasoning Models, https://arxiv.org/abs/2509.25133
  • Yongchan Kwon, Shang Zhu, Federico Bianchi, Kaitlyn Zhou, James Zou, 17 Oct 2025, ReasonIF: Large Reasoning Models Fail to Follow Instructions During Reasoning, https://arxiv.org/abs/2510.15211
  • Mingkang Zhu, Xi Chen, Bei Yu, Hengshuang Zhao, Jiaya Jia, 6 Oct 2025, From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models, https://arxiv.org/abs/2510.05095
  • Yingzhi Mao, Chunkang Zhang, Junxiang Wang, Xinyan Guan, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun (Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences), 24 Oct 2025, When Models Outthink Their Safety: Mitigating Self-Jailbreak in Large Reasoning Models with Chain-of-Guardrails, https://arxiv.org/abs/2510.21285
  • Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yongkang Wu, Ji-Rong Wen, Yutao Zhu, Zhicheng Dou, 13 Oct 2025, WebThinker: Empowering Large Reasoning Models with Deep Research Capability, https://arxiv.org/abs/2504.21776
  • Sheikh Shafayat, Fahim Tajwar, Ruslan Salakhutdinov, Jeff Schneider, Andrea Zanette, 8 Oct 2025, Can Large Reasoning Models Self-Train?, https://arxiv.org/abs/2505.21444
  • Adarsha Balaji, Le Chen, Rajeev Thakur, Franck Cappello, Sandeep Madireddy, 22 Sep 2025, Evaluating the Safety and Skill Reasoning of Large Reasoning Models Under Compute Constraints, https://arxiv.org/abs/2509.18382
  • Frédéric Berdoz, Luca A. Lanzendörfer, Nick Tuninga, Roger Wattenhofer, 30 Sep 2025, Text-to-Scene with Large Reasoning Models, https://arxiv.org/abs/2509.26091
  • Jiacheng Liang, Tanqiu Jiang, Yuhui Wang, Rongyi Zhu, Fenglong Ma, Ting Wang, 29 Sep 2025, AutoRAN: Automated Hijacking of Safety Reasoning in Large Reasoning Models, https://arxiv.org/abs/2505.10846
  • Gang Li, Ming Lin, Tomer Galanti, Zhengzhong Tu, Tianbao Yang, 30 Sep 2025, DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization, https://arxiv.org/abs/2505.12366
  • Zhangyue Yin, Qiushi Sun, Zhiyuan Zeng, Zhiyuan Yu, Qipeng Guo, Xuanjing Huang, Xipeng Qiu, 7 Oct 2025, ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models, https://arxiv.org/abs/2510.06014
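Many of the papers above (DeepSeek-R1, the overthinking and length-compression work, the reasoning-trace safety detectors) deal with models that emit an explicit chain of thought before the final answer. As a minimal illustrative sketch, not taken from any specific paper, here is how downstream code commonly separates the reasoning trace from the answer, assuming the R1-style `<think>...</think>` delimiters (the exact tag format is model-specific):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning trace, final answer).

    Assumes the model wraps its chain of thought in <think>...</think>
    tags, as DeepSeek-R1 and several open reasoning models do; other
    models use different delimiters or separate API fields.
    """
    match = re.search(r"<think>(.*?)</think>", output, re.DOTALL)
    if match:
        reasoning = match.group(1).strip()
        answer = output[match.end():].strip()
    else:
        # No thinking block found: treat the whole output as the answer.
        reasoning, answer = "", output.strip()
    return reasoning, answer

raw = "<think>2 apples plus 3 apples is 5 apples.</think>The answer is 5."
thought, answer = split_reasoning(raw)
print(answer)  # -> The answer is 5.
```

Splitting at this boundary is what makes techniques like reasoning-trace pruning, length budgeting, and trace-level safety filtering possible, since the hidden thought and the user-facing answer can then be measured and processed separately.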

AI Books from Aussie AI



The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson



RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures:
  • Smarter RAG
  • Faster RAG
  • Cheaper RAG
  • Agentic RAG
  • RAG reasoning

Get your copy from Amazon: RAG Optimization



Generative AI Applications book:
  • Deciding on your AI project
  • Planning for success and safety
  • Designs and LLM architectures
  • Expediting development
  • Implementation and deployment

Get your copy from Amazon: Generative AI Applications



Generative AI in C++: Generative AI programming book:
  • Generative AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++



CUDA C++ Optimization book:
  • Faster CUDA C++ kernels
  • Optimization tools & techniques
  • Compute optimization
  • Memory optimization

Get your copy from Amazon: CUDA C++ Optimization



CUDA C++ Debugging book:
  • Debugging CUDA C++ kernels
  • Tools & techniques
  • Self-testing & reliability
  • Common GPU kernel bugs

Get your copy from Amazon: CUDA C++ Debugging

More AI Research Topics

Read more about: