Aussie AI
Open Source Models
Last Updated 29 August, 2025
by David Spuler, Ph.D.
There are many AI models that have been open-sourced. In many cases, both the code for the inference algorithm and the model's weights are available. Some licenses have only minimal restrictions (e.g., the MIT License or Apache License 2.0), whereas other model licenses restrict usage to research or non-commercial purposes.
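When evaluating a list of open models like the one below, a first practical step is to bucket each model's license by how permissive it is. The sketch below illustrates this triage; the license identifiers are illustrative examples only, not a complete or authoritative list, and the actual license text always governs.

```python
# Minimal sketch: rough triage of model license identifiers by permissiveness.
# The identifier sets here are illustrative examples, not a complete list.

PERMISSIVE = {"mit", "apache-2.0", "bsd-3-clause"}
RESTRICTED = {"cc-by-nc-4.0", "llama2", "research-only"}  # non-commercial or custom terms

def license_category(license_id: str) -> str:
    """Bucket a license identifier into a coarse usage category."""
    key = license_id.strip().lower()
    if key in PERMISSIVE:
        return "permissive"       # commercial use generally allowed
    if key in RESTRICTED:
        return "restricted"       # research-only, non-commercial, or custom terms
    return "review required"      # unrecognized: read the actual license text

print(license_category("Apache-2.0"))   # permissive
print(license_category("llama2"))       # restricted
```

Custom model-specific licenses (such as Meta's Llama licenses) deliberately fall outside both buckets in practice: they permit commercial use but with bespoke conditions, so they always warrant a manual read.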
Research Papers on Open Source Models
- Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample, Meta AI, Feb 2023, LLaMA: Open and Efficient Foundation Language Models, https://arxiv.org/abs/2302.13971 (Meta's Llama version 1, research-licensed, not fully open-sourced.)
- Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom, Meta AI, July 2023, Llama 2: Open Foundation and Fine-Tuned Chat Models, https://arxiv.org/abs/2307.09288 (Llama version 2, open-sourced including commercial use, with a non-standard model-specific license.)
- MosaicML NLP Team, "Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs", May 2023, Mosaic ML Blog, https://www.mosaicml.com/blog/mpt-7b
- Georgi Gerganov, Jun 2023, Llama.cpp project, https://github.com/ggerganov/llama.cpp/
- Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Merouane Debbah, Etienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Badreddine Noune, Baptiste Pannier, Guilherme Penedo, "Falcon-40B: an open large language model with state-of-the-art performance", 2023, Hugging Face repository, https://huggingface.co/tiiuae/falcon-40b
- Guilherme Penedo, Quentin Malartic, Daniel Hesslow, Ruxandra Cojocaru, Alessandro Cappelli, Hamza Alobeidli, Baptiste Pannier, Ebtesam Almazrouei, Julien Launay, "The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only", June 2023, arXiv article, https://arxiv.org/abs/2306.01116
- Tasmia Ansari, UC Berkeley Releases Open LLaMA, an Open-Source Alternative to Meta’s LLaMA, May 2023, Analytics India Magazine https://analyticsindiamag.com/uc-berkeley-release-an-open-source-alternative-to-metas-llama/
- Together Computer, "OpenChatKit: An Open Toolkit and Base Model for Dialogue-style Applications", March 2023, GitHub repository https://github.com/togethercomputer/OpenChatKit
- BigScience, "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model", June 2023, arXiv paper 2211.05100 https://arxiv.org/pdf/2211.05100.pdf
- Nolan Dey, Gurpreet Gosal, Zhiming (Charles) Chen, Hemant Khachane, William Marshall, Ribhu Pathria, Marvin Tom, Joel Hestness, "Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster", April 2023, arXiv 2304.03208 https://arxiv.org/abs/2304.03208
- Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, Ion Stoica, "Judging LLM-as-a-judge with MT-Bench and Chatbot Arena", 2023, arXiv paper 2306.05685, https://arxiv.org/abs/2306.05685
- Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Yu, Joey Gonzalez, Hao Zhang, Ion Stoica, June 2023, vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention, https://arxiv.org/pdf/2309.06180.pdf
- Jeon, Byungsoo, May 2024, Automated and Portable Machine Learning Systems, Ph.D. Thesis, Carnegie Mellon University, https://doi.org/10.1184/R1/25746708.v1 https://kilthub.cmu.edu/articles/thesis/Automated_and_Portable_Machine_Learning_Systems/25746708/1 PDF: https://kilthub.cmu.edu/ndownloader/files/46074087 Code: https://github.com/cmu-catalyst/collage (Portability layer to integrate the various kernels and low-level backends more easily. Also covers pipeline parallelism in graph models, and KV cache parallelism similar to FlashDecode.)
- Maria Korolov, 15 May 2024, 10 things to watch out for with open source gen AI, CIO, https://www.cio.com/article/2104280/10-things-to-watch-out-for-with-open-source-gen-ai.html
- JH Jones, May 2024, A Quantitative Comparison of Pre-Trained Model Registries to Traditional Software Package Registries, Masters Thesis, Electrical and Computer Engineering, Purdue University, https://hammer.purdue.edu/articles/thesis/A_Quantitative_Comparison_of_Pre-Trained_Model_Registries_to_Traditional_Software_Package_Registries/25686447/1 PDF: https://hammer.purdue.edu/ndownloader/files/46096152
- Tomasz Tunguz, Apr 24, 2024, A Shift in LLM Marketing : The Rise of the B2B Model, https://tomtunguz.com/snowflake-arctic-model/
- Nathan Lambert, APR 18, 2024, Llama 3: Scaling open LLMs to AGI, https://www.interconnects.ai/p/llama-3-and-scaling-open-llms
- John Loeffler, April 19, 2024, Meta rolls out new Meta AI website, and it might just bury Microsoft and Google's AI dreams, Tech Radar, https://www.techradar.com/computing/meta-rolls-out-new-meta-ai-website-and-it-might-just-bury-microsoft-and-googles-ai-dreams
- Robert Wolfe, Isaac Slaughter, Bin Han, Bingbing Wen, Yiwei Yang, Lucas Rosenblatt, Bernease Herman, Eva Brown, Zening Qu, Nic Weber, and Bill Howe. 2024. Laboratory-Scale AI: Open-Weight Models are Competitive with ChatGPT Even in Low-Resource Settings. In ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT '24), June 3-6, 2024, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3630106.3658966 https://arxiv.org/pdf/2405.16820
- Michael Nuñez, February 6, 2024, Meet ‘Smaug-72B’: The new king of open-source AI, Venture Beat, https://venturebeat.com/ai/meet-smaug-72b-the-new-king-of-open-source-ai/
- Sharon Machlis, March 28, 2024, 5 easy ways to run an LLM locally, InfoWorld, https://www.infoworld.com/article/3705035/5-easy-ways-to-run-an-llm-locally.html
- Ebtesam Almazrouei, Hamza Alobeidli, Abdulaziz Alshamsi, Alessandro Cappelli, Ruxandra Cojocaru, Mérouane Debbah, Étienne Goffinet, Daniel Hesslow, Julien Launay, Quentin Malartic, Daniele Mazzotta, Badreddine Noune, Baptiste Pannier, Guilherme Penedo, 29 Nov 2023, The Falcon Series of Open Language Models, https://arxiv.org/abs/2311.16867
- Ankit Patel, June 14, 2024, NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models, https://blogs.nvidia.com/blog/nemotron-4-synthetic-data-generation-llm-training/
- David Spuler, March 2024, Chapter 5. Design Choices & Architectures, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
- Intel, Apr 25, 2024, Deployment of Llama3 on Your AI PC with OpenVINO™, https://medium.com/openvino-toolkit/deployment-of-llama3-on-your-ai-pc-with-openvino-b58e961501d6
- Bin Xiao, Burak Kantarci, Jiawen Kang, Dusit Niyato, Mohsen Guizani, 18 Jun 2024 (v2), Efficient Prompting for LLM-based Generative Internet of Things, https://arxiv.org/abs/2406.10382
- Elizabeth Gibney, 19 June 2024, Not all ‘open source’ AI models are actually open: here’s a ranking, Nature, https://www.nature.com/articles/d41586-024-02012-5
- Liesenfeld, A., Dingemanse, M., 2024, Rethinking open source generative AI: open washing and the EU AI Act, In FAccT '24: Proc. 2024 ACM Conf. on Fairness, Accountability, and Transparency 1774–1787 (ACM, 2024). https://dl.acm.org/doi/10.1145/3630106.3659005
- William Gallagher, Jun 19, 2024, Apple researchers add 20 more open-source models to improve text and image AI, https://appleinsider.com/articles/24/06/19/apple-researchers-add-20-more-open-source-models-to-improve-text-and-image-ai
- Piotr Skalski, June 20, 2024, Florence-2: Open Source Vision Foundation Model by Microsoft, https://blog.roboflow.com/florence-2/
- Waleed Kadous, August 23, 2023, Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper, https://www.anyscale.com/blog/llama-2-is-about-as-factually-accurate-as-gpt-4-for-summaries-and-is-30x-cheaper Code: https://github.com/anyscale/factuality-eval
- Ben Wodecki, November 16, 2023, Generative AI Projects More Than Triple on GitHub in 2023, https://aibusiness.com/nlp/gen-ai-projects-soar-more-than-triple-on-github
- Valentina Alto, 2024, Chapter 3: Choosing an LLM for Your Application, Building LLM-Powered Applications: Create intelligence apps and agents with large language models, Packt Publishing, https://www.amazon.com/Building-LLM-Apps-Intelligent-Language/dp/1835462316/
- Clement Farabet, Tris Warkentin, Jun 27, 2024, Gemma 2 is now available to researchers and developers, https://blog.google/technology/developers/google-gemma-2/
- Meta, July 23, 2024, Introducing Llama 3.1: Our most capable models to date, https://ai.meta.com/blog/meta-llama-3-1/
- Mark Zuckerberg, July 23, 2024, Open Source AI Is the Path Forward, https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/
- Vince Lam, Mar 12, 2024, 50+ Open-Source Options for Running LLMs Locally, https://medium.com/thedeephub/50-open-source-options-for-running-llms-locally-db1ec6f5a54f
- Michael Nuñez, July 18, 2024, Groq’s open-source Llama AI model tops leaderboard, outperforming GPT-4o and Claude in function calling, https://venturebeat.com/ai/groq-open-source-llama-ai-model-tops-leaderboard-outperforming-gpt-4o-and-claude-in-function-calling/
- Washington Post, 2024, Meta releases open-source AI model it says rivals OpenAI, Google tech, https://www.washingtonpost.com/technology/2024/07/23/meta-new-ai-llama-open/
- AIM, 2024, Mistral AI Unveils Mistral Large 2, Beats Llama 3.1 on Code and Math, https://analyticsindiamag.com/ai-news-updates/mistral-ai-unveils-mistral-large-2-beats-llama-3-1-on-code-and-math/
- David Linthicum, Aug 02, 2024, Small language models and open source are transforming AI, https://www.infoworld.com/article/3480593/small-language-models-and-open-source-are-transforming-ai.html
- Level Up Coding, Aug 2024, Google open-sources the most powerful small model on the edge: 2B parameters surpass GPT-3.5-Turbo, and the Apple iPhone 15 Pro runs it fast, https://levelup.gitconnected.com/google-open-sources-the-most-powerful-small-model-on-the-edge-2b-parameters-surpass-gpt-3-5-turbo-c0b13f96997c
- Michael Nuñez, August 26, 2024, Aleph Alpha unveils EU-compliant AI: A new era for transparent machine learning, https://venturebeat.com/ai/aleph-alpha-unveils-eu-compliant-ai-a-new-era-for-transparent-machine-learning/
- Shubham Sharma, August 29, 2024, Meta leads open-source AI boom, Llama downloads surge 10x year-over-year, https://venturebeat.com/ai/meta-leads-open-source-ai-boom-llama-downloads-surge-10x-year-over-year/
- Chandra Irugalbandara, Ashish Mahendra, Roland Daynauth, Tharuka Kasthuri Arachchige, Jayanaka Dantanarayana, Krisztian Flautner, Lingjia Tang, Yiping Kang, Jason Mars, 16 Apr 2024 (v3), Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production, https://arxiv.org/abs/2312.14972
- Shrestha, Y.R., von Krogh, G. & Feuerriegel, S., 2023, Building open-source AI. Nat Comput Sci 3, 908–911 (2023). https://doi.org/10.1038/s43588-023-00540-0 https://www.nature.com/articles/s43588-023-00540-0
- Abhinand, Aug 20, 2024, Self-Hosting LLaMA 3.1 70B (or any ~70B LLM) Affordably, https://abhinand05.medium.com/self-hosting-llama-3-1-70b-or-any-70b-llm-affordably-2bd323d72f8d
- David Spuler, March 2024, Open Source Models, in Generative AI in C++, https://www.aussieai.com/book/ch5-open-source-models
- Carl Franzen, September 5, 2024, Meet the new, most powerful open source AI model in the world: HyperWrite’s Reflection 70B, https://venturebeat.com/ai/meet-the-new-most-powerful-open-source-ai-model-in-the-world-hyperwrites-reflection-70b/
- Asif Razzaq, September 5, 2024, Yi-Coder Released by 01.AI: A Powerful Small-Scale Code LLM Series, Delivering Exceptional Performance in Code Generation, Editing, and Long-Context Comprehension, https://www.marktechpost.com/2024/09/05/yi-coder-released-by-01-ai-a-powerful-small-scale-code-llm-series-delivering-exceptional-performance-in-code-generation-editing-and-long-context-comprehension/
- Michael Nuñez, September 16, 2024, SambaNova challenges OpenAI’s o1 model with Llama 3.1-powered demo on HuggingFace, https://venturebeat.com/ai/sambanova-challenges-openais-o1-model-with-llama-3-1-powered-demo-on-huggingface/
- Meta, August 29, 2024, With 10x growth since 2023, Llama is the leading engine of AI innovation https://ai.meta.com/blog/llama-usage-doubled-may-through-july-2024/
- Michael Nuñez, October 1, 2024, Nvidia just dropped a bombshell: Its new AI model is open, massive, and ready to rival GPT-4, https://venturebeat.com/ai/nvidia-just-dropped-a-bombshell-its-new-ai-model-is-open-massive-and-ready-to-rival-gpt-4/
- Wenliang Dai, Nayeon Lee, Boxin Wang, Zhuoling Yang, Zihan Liu, Jon Barker, Tuomas Rintamaki, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping, 17 Sep 2024, NVLM: Open Frontier-Class Multimodal LLMs, NVIDIA, https://arxiv.org/abs/2409.11402 https://huggingface.co/nvidia/NVLM-D-72B https://nvlm-project.github.io/
- Sean Michael Kerner, October 20, 2024, IBM debuts open source Granite 3.0 LLMs for enterprise AI, https://venturebeat.com/ai/ibm-debuts-open-source-granite-3-0-llms-for-enterprise-ai/
- Meta, October 18, 2024, Sharing new research, models, and datasets from Meta FAIR, https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua/
- Matt Marshall, October 24, 2024, The enterprise verdict on AI models: Why open source will win, https://venturebeat.com/ai/the-enterprise-verdict-on-ai-models-why-open-source-will-win/
- Meta, October 24, 2024, Introducing quantized Llama models with increased speed and a reduced memory footprint, https://ai.meta.com/blog/meta-llama-quantized-lightweight-models/
- Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, Jonny Han, Xiaobo Shu, Jiahao Bu, (and many more authors), 4 Nov 2024, Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent, https://arxiv.org/abs/2411.02265 https://github.com/Tencent/Hunyuan-Large https://huggingface.co/tencent/Tencent-Hunyuan-Large
- Robert Corwin Nov 2024, Running Large Language Models Privately: A comparison of frameworks, models, and costs, https://towardsdatascience.com/running-large-language-models-privately-a-comparison-of-frameworks-models-and-costs-ac33cfe3a462
- Carl Franzen, October 31, 2024, Meta makes its MobileLLM open for researchers, posting full weights, https://venturebeat.com/ai/meta-makes-its-mobilellm-open-for-researchers-posting-full-weights/
- Jason Perlow, Nov. 6, 2024, The best open-source AI models: All your free-to-use options explained: Here are the best open-source and free-to-use AI models for text, images, and audio, organized by type, application, and licensing considerations. https://www.zdnet.com/article/the-best-open-source-ai-models-all-your-free-to-use-options-explained/
- Chris Wellons, November 10, 2024, Everything I've learned so far about running local LLMs, https://nullprogram.com/blog/2024/11/10/
- Tegan Jones, 6 November, 2024, Open source AI: What it is and why it matters for business. We now have a definition for ‘open source AI’ and that’s important for business owners, especially when big tech doesn’t adhere to it. https://www.smartcompany.com.au/artificial-intelligence/open-source-ai-what-it-is-and-why-it-matters-for-business/
- Qwen Team, November 28, 2024, QwQ: Reflect Deeply on the Boundaries of the Unknown, https://qwenlm.github.io/blog/qwq-32b-preview/
- Ai2, November 26, 2024, OLMo 2: The best fully open language model to date, https://allenai.org/blog/olmo2
- Kyle Wiggers, December 6, 2024, Meta unveils a new, more efficient Llama model, https://techcrunch.com/2024/12/06/meta-unveils-a-new-more-efficient-llama-model/
- Tiernan Ray, Dec. 10, 2024, How Cerebras boosted Meta's Llama to 'frontier model' performance The company also demonstrates initial training of a one-trillion-parameter AI model on a single machine using conventional DDR5 memory chips. https://www.zdnet.com/article/how-cerebras-boosted-metas-llama-to-frontier-model-performance/
- Ben Dickson, December 10, 2024, OpenAI’s o1 model doesn’t show its thinking, giving open source an advantage, https://venturebeat.com/ai/heres-how-openai-o1-might-lose-ground-to-open-source-models/
- Inkit Padhi, Manish Nagireddy, Giandomenico Cornacchia, Subhajit Chaudhury, Tejaswini Pedapati, Pierre Dognin, Keerthiram Murugesan, Erik Miehling, Martín Santillán Cooper, Kieran Fraser, Giulio Zizzo, Muhammad Zaid Hameed, Mark Purcell, Michael Desmond, Qian Pan, Inge Vejsbjerg, Elizabeth M. Daly, Michael Hind, Werner Geyer, Ambrish Rawat, Kush R. Varshney, Prasanna Sattigeri, 10 Dec 2024, Granite Guardian, https://arxiv.org/abs/2412.07724 https://github.com/ibm-granite/granite-guardian (Open-sourcing of safety models with many capabilities.)
- Team OLMo, Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V. Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A. Smith, Hannaneh Hajishirzi, 31 Dec 2024, 2 OLMo 2 Furious, https://arxiv.org/abs/2501.00656
- NovaSky, Jan 2025, Sky-T1: Train your own O1 preview model within $450, https://novasky-ai.github.io/posts/sky-t1/
- Edward Beeching, Lewis Tunstall, Sasha Rush Dec 16, 2024, Scaling Test Time Compute with Open Source Models, https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute
- Charles Rollet, January 29, 2025, Zuck shrugs off DeepSeek, vows to spend hundreds of billions on AI, https://techcrunch.com/2025/01/29/zuck-shrugs-off-deepseek-vows-to-spend-hundreds-of-billions-on-ai/
- Ryan Browne, Feb 4 2025, DeepSeek’s breakthrough emboldens open-source AI models like Meta’s Llama, https://www.cnbc.com/2025/02/04/deepseek-breakthrough-emboldens-open-source-ai-models-like-meta-llama.html
- Maxwell Zeff, February 5, 2025, Researchers created an open rival to OpenAI’s o1 ‘reasoning’ model for under $50, https://techcrunch.com/2025/02/05/researchers-created-an-open-rival-to-openais-o1-reasoning-model-for-under-50/
- Kyle Wiggers, January 11, 2025, Researchers open source Sky-T1, a 'reasoning' AI model that can be trained for less than $450, https://techcrunch.com/2025/01/11/researchers-open-source-sky-t1-a-reasoning-ai-model-that-can-be-trained-for-less-than-450/
- R Szilágyi, 2024, OpenSource alternatives of Generative Artifical Intelligence for SME's, Journal of Agricultural Informatics, Vol. 15 No. 2 (2024), https://doi.org/10.17700/jai.2024.15.2.733 https://journal.magisz.org/index.php/jai/article/view/733 https://journal.magisz.org/index.php/jai/article/view/733/412
- Kyle Wiggers, February 21, 2025, DeepSeek to open source parts of online services code, https://techcrunch.com/2025/02/21/deepseek-to-open-source-parts-of-online-services-code/
- kinfey, Feb 27, 2025, Welcome to the new Phi-4 models - Microsoft Phi-4-mini & Phi-4-multimodal, https://techcommunity.microsoft.com/blog/educatordeveloperblog/welcome-to-the-new-phi-4-models---microsoft-phi-4-mini--phi-4-multimodal/4386037
- Asif Razzaq, March 5, 2025, Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task, https://www.marktechpost.com/2025/03/05/qwen-releases-qwq-32b-a-32b-reasoning-model-that-achieves-significantly-enhanced-performance-in-downstream-task/ (Features 32B parameters, 32K context length, 64 layers, RoPE, SwiGLU, RMSNorm, and attention enhancements.)
- Nathan Lambert, Mar 14, 2025, Gemma 3, OLMo 2 32B, and the growing potential of open-source AI: Leading open-weight models and the first open-source model to clearly surpass GPT 3.5 (the very last version), https://www.interconnects.ai/p/gemma-3-olmo-2-32b-and-the-growing
- Annika Kim Constantino, Apr 5 2025, Meta debuts new Llama 4 models, but most powerful AI model is still to come https://www.cnbc.com/2025/04/05/meta-debuts-new-llama-4-models-but-most-powerful-ai-model-is-still-to-come.html
- Devansh, Jun 1, 2025, The Costly Open-Source LLM Lie: Open Source LLMs are not Free, https://machine-learning-made-simple.medium.com/the-costly-open-source-llm-lie-f83fdc5d5701
- Nathan Lambert, Jul 04, 2025, The American DeepSeek Project: What I think the next goal for the open-source AI community is, https://www.interconnects.ai/p/the-american-deepseek-project
- Jim Clyde Monge, Mar 18, 2024, xAI Releases Grok-1 — The Biggest Open-Source LLM, https://generativeai.pub/xai-releases-grok-1-the-biggest-open-source-llm-28fe8ab84575
- Shubham Sharma, December 17, 2024, UAE’s Falcon 3 challenges open-source leaders amid surging demand for small AI models, https://venturebeat.com/ai/uaes-falcon-3-challenges-open-source-leaders-amid-surging-demand-for-small-ai-models/
- Gemma Team, Google DeepMind, 12 March 2025, Gemma 3 Technical Report, https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
- Mistral AI, Mar 17, 2025, Mistral Small 3.1: SOTA. Multimodal. Multilingual. Apache 2.0, https://mistral.ai/news/mistral-small-3-1
- Michael Nuñez, March 17, 2025, Mistral AI drops new open-source model that outperforms GPT-4o Mini with fraction of parameters, https://venturebeat.com/ai/mistral-ai-drops-new-open-source-model-that-outperforms-gpt-4o-mini-with-fraction-of-parameters/
- Carl Franzen, March 17, 2025, Baidu delivers new LLMs ERNIE 4.5 and ERNIE X1 undercutting DeepSeek, OpenAI on cost — but they’re not open source (yet), https://venturebeat.com/ai/baidu-delivers-new-llms-ernie-4-5-and-ernie-x1-undercutting-deepseek-openai-on-cost-but-theyre-not-open-source-yet/
- MiniMax: Aili Chen, Aonian Li, Bangwei Gong, Binyang Jiang, Bo Fei, Bo Yang, Boji Shan, Changqing Yu, Chao Wang, Cheng Zhu, Chengjun Xiao, Chengyu Du, Chi Zhang, Chu Qiao, Chunhao Zhang, Chunhui Du, Congchao Guo, Da Chen, Deming Ding, Dianjun Sun, Dong Li, Enwei Jiao, (and many more authors), 16 Jun 2025, MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention, https://arxiv.org/abs/2506.13585 https://github.com/MiniMax-AI/MiniMax-M1 (A 456B MoE reasoning model trained with RL and has various optimizations in training efficiency and attention kernel.)
- Michael Nuñez, July 11, 2025, Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free, https://venturebeat.com/ai/moonshot-ais-kimi-k2-outperforms-gpt-4-in-key-benchmarks-and-its-free/ (One trillion parameters with 32B experts activated each time. Examines new training optimizer MuonClip as more efficient and more stable than variants of AdamW for training.)
- Michael Nuñez, August 14, 2025, That ‘cheap’ open-source AI model is actually burning through your compute budget, https://venturebeat.com/ai/that-cheap-open-source-ai-model-is-actually-burning-through-your-compute-budget/ (Open-source models use more tokens.)
- Tim, Nous Research, Aug 14, 2025, Measuring Thinking Efficiency in Reasoning Models: The Missing Benchmark, https://nousresearch.com/measuring-thinking-efficiency-in-reasoning-models-the-missing-benchmark/
- Kaitao Chen, Mianxin Liu, Daoming Zong, Chaoyue Ding, Shaohao Rui, Yankai Jiang, Mu Zhou, Xiaosong Wang, 8 Aug 2025, Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making, https://arxiv.org/abs/2508.05996
- Mithun Saha, Maxwell A. Xu, Wanting Mao, Sameer Neupane, James M. Rehg, Santosh Kumar, 23 Jul 2025, Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings, https://arxiv.org/abs/2502.01108
- Eleftherios Tzanis and Michail E. Klontzas, 11 Aug 2025, mAIstro: an open-source multi-agentic system for automated end-to-end development of radiomics and deep learning models for medical imaging, https://arxiv.org/abs/2505.03785
- Zihao Chen, Ji Zhuang, Jinyi Shen, Xiaoyue Ke, Xinyi Yang, Mingjie Zhou, Zhuoyao Du, Xu Yan, Zhouyang Wu, Zhenyu Xu, Jiangli Huang, Li Shang, Xuan Zeng, Fan Yang, 14 Aug 2025, AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design, https://arxiv.org/abs/2508.10409
- Vamsi Krishna Mulukutla, Sai Supriya Pavarala, Srinivasa Raju Rudraraju, Sridevi Bonthu, 19 Aug 2025, Evaluating Open-Source Vision Language Models for Facial Emotion Recognition against Traditional Deep Learning Models, https://arxiv.org/abs/2508.13524
- Anindya Bijoy Das, Shibbir Ahmed and Shahnewaz Karim Sakib, 19 Aug 2025, Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models, https://arxiv.org/abs/2504.19061
- Ephraiem Sarabamoun, 12 Aug 2025, Special-Character Adversarial Attacks on Open-Source Language Model, https://arxiv.org/abs/2508.14070
AI Books from Aussie AI
- The Sweetest Lesson: Your Brain Versus AI — new book on AI intelligence theory. Get your copy from Amazon: The Sweetest Lesson
- RAG Optimization: Accurate and Efficient LLM Applications — new book on RAG architectures. Get your copy from Amazon: RAG Optimization
- Generative AI Applications book. Get your copy from Amazon: Generative AI Applications
- Generative AI programming book. Get your copy from Amazon: Generative AI in C++
- CUDA C++ Optimization book. Get your copy from Amazon: CUDA C++ Optimization
- CUDA C++ Debugging book. Get your copy from Amazon: CUDA C++ Debugging