Aussie AI
Small Reasoning Models
Last Updated 23 May, 2025
by David Spuler, Ph.D.
What are Small Reasoning Models?
Small reasoning models combine reasoning techniques with small language models. Large reasoning models are expensive to run, and the goal is to reduce that cost by using a smaller model, accepting some loss of accuracy. Small models can be used with two types of reasoning methods: single-step reasoning or multi-step inference-based reasoning.
There are two basic approaches to creating a Small Reasoning Model (SRM):
- Start with a Large Reasoning Model (LRM) and reduce its size, or
- Start with a small model and increase its reasoning capabilities.
Cutting down a Large Reasoning Model to a smaller one may involve:
- Model compression (e.g., quantization); see the sketch below.
- Distillation focused on reasoning knowledge.
In the case of open-source Large Reasoning Models (e.g., DeepSeek R1), smaller versions have already been released, especially quantized ones.
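As an illustration of the model compression path, below is a minimal sketch of symmetric 8-bit post-training quantization of a single weight matrix. It assumes only NumPy; the quantize_weights and dequantize_weights helper names are hypothetical, and a real small reasoning model would normally be produced with an established quantization toolkit using per-channel or per-group scales rather than one scale per tensor.

    # Minimal sketch: symmetric int8 post-training quantization of one weight matrix.
    # Hypothetical helper functions for illustration; real SRM pipelines use
    # dedicated quantization toolkits with per-channel or per-group scaling.
    import numpy as np

    def quantize_weights(w: np.ndarray) -> tuple[np.ndarray, float]:
        """Map float weights to int8 using a single symmetric scale factor."""
        scale = np.max(np.abs(w)) / 127.0  # largest magnitude maps to +/-127
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_weights(q: np.ndarray, scale: float) -> np.ndarray:
        """Recover approximate float weights for use at inference time."""
        return q.astype(np.float32) * scale

    # Example: quantize one random "layer" and measure the reconstruction error.
    w = np.random.randn(512, 512).astype(np.float32)
    q, scale = quantize_weights(w)
    w_hat = dequantize_weights(q, scale)
    print("mean absolute error:", np.mean(np.abs(w - w_hat)))

The accuracy loss from a single per-tensor scale like this is larger than with finer-grained scaling, which is one reason released quantized reasoning models typically use per-channel or per-group schemes.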
Adding reasoning capabilities to a small model is particularly interesting to the open-source model community. There are many very capable small models of different sizes, but few are specifically focused on reasoning. Some ways to go about it include:
- Multi-step CoT algorithms wrapped around a smaller base model (see the sketch after this list).
- Improved training and fine-tuning of single-step reasoning techniques to enhance a small model.
- A combination of both approaches.
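One common way to wrap a multi-step CoT algorithm around a smaller base model is self-consistency: sample several chains of thought and take a majority vote over the final answers. The sketch below shows only the wrapper logic; generate() is a hypothetical stand-in for a call to whatever small open-source model is being used, not a specific library API.

    # Sketch: self-consistency voting wrapped around a small base model.
    # generate() is a hypothetical stand-in for a small-model inference call
    # (e.g., a request to a local inference server); replace it with a real one.
    from collections import Counter

    def generate(prompt: str, temperature: float = 0.8) -> str:
        raise NotImplementedError("plug in a small LLM here")

    def extract_answer(completion: str) -> str:
        """Naive final-answer extraction: take the text after the last 'Answer:'."""
        return completion.rsplit("Answer:", 1)[-1].strip()

    def self_consistency(question: str, num_samples: int = 5) -> str:
        """Sample several chains of thought and majority-vote the final answers."""
        prompt = ("Solve the problem step by step, then give the final result "
                  "on a new line starting with 'Answer:'.\n\nProblem: " + question)
        answers = [extract_answer(generate(prompt, temperature=0.8))
                   for _ in range(num_samples)]
        return Counter(answers).most_common(1)[0][0]

Sampling multiple chains multiplies inference cost, which is why the CoT efficiency optimizations listed further below matter for keeping small reasoning models cheap to run.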
Research on Small Reasoning Models
Research papers include:
- Matthias Bastian, Oct 6, 2024, Study reveals major reasoning flaws in smaller AI language models, https://the-decoder.com/study-reveals-major-reasoning-flaws-in-smaller-ai-language-models/
- Shuyang Jiang, Yusheng Liao, Zhe Chen, Ya Zhang, Yanfeng Wang, Yu Wang, 21 Jan 2025, MedS3: Towards Medical Small Language Models with Self-Evolved Slow Thinking, https://arxiv.org/abs/2501.12051 https://github.com/pixas/medsss
- Maxwell Zeff, February 5, 2025, Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50, https://techcrunch.com/2025/02/05/researchers-created-an-open-rival-to-openais-o1-reasoning-model-for-under-50/
- Kyle Wiggers, January 11, 2025, Researchers open source Sky-T1, a ‘reasoning’ AI model that can be trained for less than $450, https://techcrunch.com/2025/01/11/researchers-open-source-sky-t1-a-reasoning-ai-model-that-can-be-trained-for-less-than-450/
- Ben Dickson, February 20, 2025, How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs), https://venturebeat.com/ai/how-test-time-scaling-unlocks-hidden-reasoning-abilities-in-small-language-models-and-allows-them-to-outperform-llms/
- Asif Razzaq, March 5, 2025, Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task, https://www.marktechpost.com/2025/03/05/qwen-releases-qwq-32b-a-32b-reasoning-model-that-achieves-significantly-enhanced-performance-in-downstream-task/ (Features 32B parameters, 32K context length, 64 layers, RoPE, SwiGLU, RMSNorm, and attention enhancements.)
- Carl Franzen, March 5, 2025, New open-source math model Light-R1-32B surpasses equivalent DeepSeek performance with only $1000 in training costs, https://venturebeat.com/ai/new-open-source-math-model-light-r1-32b-surpasses-equivalent-deepseek-performance-with-only-1000-in-training-costs/
- X Zhang, F Zhang, C Du, C Du, T Pang, W Gao, M Lin, Mar 2025, LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation, https://openreview.net/pdf?id=DfgfGTfObm
- Xuechen Zhang, Zijian Huang, Chenshun Ni, Ziyang Xiong, Jiasi Chen, Samet Oymak, 14 May 2025 (v2), Making Small Language Models Efficient Reasoners: Intervention, Supervision, Reinforcement, https://arxiv.org/abs/2505.07961
- Xiaomi LLM-Core Team: Bingquan Xia, Bowen Shen, Cici, Dawei Zhu, Di Zhang, Gang Wang, Hailin Zhang, Huaqiu Liu, Jiebao Xiao, Jinhao Dong, Liang Zhao, Peidian Li, Peng Wang, Shihua Yu, Shimao Chen, Weikun Wang, Wenhan Ma, Xiangwei Deng, Yi Huang, Yifan Song, Zihan Jiang, Bowen Ye, Can Cai, Chenhong He, Dong Zhang, Duo Zhang, Guoan Wang, Hao Tian, Haochen Zhao, Heng Qu, Hongshen Xu, Jun Shi, Kainan Bao, QingKai Fang, Kang Zhou, Kangyang Zhou, Lei Li, Menghang Zhu, Nuo Chen, Qiantong Wang, Shaohui Liu, Shicheng Li, Shuhao Gu, Shuhuai Ren, Shuo Liu, Sirui Deng, Weiji Zhuang, Weiwei Lv, Wenyu Yang, Xin Zhang, Xing Yong, Xing Zhang, Xingchen Song, Xinzhe Xu, Xu Wang, Yihan Yan, Yu Tu, Yuanyuan Tian, Yudong Wang, Yue Yu, Zhenru Lin, Zhichao Song, Zihao Yue, Xiaomi, 12 May 2025, MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining, https://arxiv.org/abs/2505.07608
- Haoran Xu, Baolin Peng, Hany Awadalla, Dongdong Chen, Yen-Chun Chen, Mei Gao, Young Jin Kim, Yunsheng Li, Liliang Ren, Yelong Shen, Shuohang Wang, Weijian Xu, Jianfeng Gao, Weizhu Chen, 30 Apr 2025, Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math, https://arxiv.org/abs/2504.21233
Reasoning and CoT Efficiency Topics
Blog articles and more research information on general efficiency optimization techniques for reasoning models:
- Reasoning inference optimization (RIO)
- Chain-of-Thought (CoT) optimization
- Small Reasoning Models (SRMs)
- Adaptive Inference Time Compute
- Hybrid Reasoning Models
- Reasoning Tokens
Efficiency optimizations to Chain-of-Thought include:
- Hidden Token Chain-of-Thought (HCoT)
- Continuous Chain-of-Thought (Coconut)
- Chain of Draft (CoD)
- CoT Reasoning Decoding
- Concise Chain-of-Thought
- CoT Token Reduction
- CoT Step Skipping
- CoT Early Stopping
- CoT Path Reduction
- Constrained Chain-of-Thought
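As one concrete example of these optimizations, concise Chain-of-Thought and Chain of Draft both reduce reasoning tokens by instructing the model to keep each reasoning step very short. The prompt wording below is an illustrative sketch, not the exact prompts from the original papers, and generate() is again a hypothetical stand-in for a small-model inference call.

    # Sketch: concise-CoT / Chain-of-Draft style prompting to cut reasoning tokens.
    # Prompt wording is illustrative only; generate() is a hypothetical stand-in
    # for a small-model inference call.

    def generate(prompt: str, temperature: float = 0.0) -> str:
        raise NotImplementedError("plug in a small LLM here")

    VERBOSE_COT = ("Think step by step and explain your reasoning in full, "
                   "then state the final answer.\nProblem: {q}")

    CONCISE_COT = ("Think step by step, but keep each step to at most five words, "
                   "then state only the final answer.\nProblem: {q}")

    def solve_concise(question: str) -> str:
        """Shorter reasoning steps mean fewer output tokens, hence lower cost and latency."""
        return generate(CONCISE_COT.format(q=question))

The saving comes entirely from the shorter output, since decoding cost scales with the number of generated tokens.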
More AI Research
Read more about: