Aussie AI
Next-Generation AI Architectures
Last Updated 26 August, 2025
by David Spuler, Ph.D.
Research on Next-Generation AI Architectures
Research papers include:
- Badri Narayana Patro, Vijay Srinivas Agneeswaran, 24 Apr 2024, Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges, https://arxiv.org/abs/2404.16112
- Sathya Krishnan Suresh, Shunmugapriya P, 24 Apr 2024 (v2), Towards smaller, faster decoder-only transformers: Architectural variants and their implications, https://arxiv.org/abs/2404.14462 Code: https://github.com/SkAndMl/gpt-variations (Focuses on three new variants of decoder-only Transformer architectures: ParallelGPT (p-gpt), LinearlyCompressedGPT (lc-gpt), and ConvCompressedGPT (cc-gpt).)
- Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang, 22 Apr 2024, A Survey on Efficient Inference for Large Language Models, https://arxiv.org/abs/2404.14294
- Rob Toews, Sep 3, 2023, Transformers Revolutionized AI. What Will Replace Them? Forbes, https://www.forbes.com/sites/robtoews/2023/09/03/transformers-revolutionized-ai-what-will-replace-them/
- David Spuler, March 2024, Chapter 43. Overview of AI Research, Generative AI in C++: Coding Transformers and LLMs, https://www.amazon.com/dp/B0CXJKCWX9
- Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang, 18 Apr 2024 (v2), The Efficiency Spectrum of Large Language Models: An Algorithmic Survey, https://arxiv.org/abs/2312.00678
- Johannes Schneider, 1 Aug 2024, What comes after transformers? -- A selective survey connecting ideas in deep learning, https://arxiv.org/abs/2408.00386
- Rohan Baskar Prabhakar, Hengrui Zhang, David Wentzlaff, 14 Aug 2024, Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference, https://arxiv.org/abs/2408.07802 (Modified Transformer architecture with parallelized sub-layers of attention and FFN; a simplified sketch of the parallel-block idea appears after this list.)
- Cem Dilmegani, Jan 10, 2024, The Future of Large Language Models in 2024, https://research.aimultiple.com/future-of-large-language-models/
- Bobby He, Thomas Hofmann, 31 May 2024 (v2), Simplifying Transformer Blocks, https://arxiv.org/abs/2311.01906 (Examines the removal of various Transformer sublayer components, including skip connections, projection/value parameters, and normalization; see the block-variant sketch after this list.)
- Roy Lo, June 13, 2024, Defining AI 2.0: Beyond Generative AI, https://www.linkedin.com/pulse/defining-ai-20-beyond-generative-roy-lo-tbvie/
- Ryan McNeal, Aug 27, 2024, ChatGPT and GPT-4 could get a sweet upgrade this fall with 'strawberry', https://www.androidauthority.com/openai-strawberry-ai-3475682/
- Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou, 26 May 2024, Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers, https://arxiv.org/abs/2405.16411 (Higher-order attention using tensors to generalize QKV matrices.)
- Joanne Chen, July 23, 2024, What’s Next After Transformers, https://foundationcapital.com/whats-next-after-transformers/
- Martin_Casado, Aug 31, 2024, Tweet (State of LLMs) https://threadreaderapp.com/thread/1829905130512400775.html
- Anil Ananthaswamy, August 30, 2024, A new way to build neural networks could make AI more understandable, https://www.technologyreview.com/2024/08/30/1103385/a-new-way-to-build-neural-networks-could-make-ai-more-understandable/?tpcc=NL_Marketing (About Kolmogorov-Arnold Networks, or KANs; a minimal KAN sketch appears after this list.)
- Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela, 17 Apr 2024 (v2), Generative Representational Instruction Tuning, https://arxiv.org/abs/2402.09906
- Anson Ho, Tamay Besiroglu, Ege Erdil, David Owen, Robi Rahman, Zifan Carl Guo, David Atkinson, Neil Thompson, Jaime Sevilla, 9 Mar 2024, Algorithmic progress in language models, https://arxiv.org/abs/2403.05812
- Chunting Zhou, Lili Yu, Arun Babu, Kushal Tirumala, Michihiro Yasunaga, Leonid Shamis, Jacob Kahn, Xuezhe Ma, Luke Zettlemoyer, Omer Levy, 20 Aug 2024, Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model, https://www.arxiv.org/abs/2408.11039 (Merging Transformer architectures with diffusion in training multimodal models.)
- Cobus Greyling, Sep 2024, An AI Agent Architecture & Framework Is Emerging, https://cobusgreyling.medium.com/an-ai-agent-architecture-framework-is-emerging-addae3804f23
- Douglas C. Youvan, September 27, 2024, Building and Running Large-Scale Language Models: The Infrastructure and Techniques Behind GPT-4 , https://www.researchgate.net/profile/Douglas-Youvan/publication/384398902_Building_and_Running_Large-Scale_Language_Models_The_Infrastructure_and_Techniques_Behind_GPT-4/links/66f6f4d3906bca2ac3d20e68/Building-and-Running-Large-Scale-Language-Models-The-Infrastructure-and-Techniques-Behind-GPT-4.pdf
- Wenliang Dai, Nayeon Lee, Boxin Wang, Zhuoling Yang, Zihan Liu, Jon Barker, Tuomas Rintamaki, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping, 17 Sep 2024, NVLM: Open Frontier-Class Multimodal LLMs, NVIDIA, https://arxiv.org/abs/2409.11402 https://huggingface.co/nvidia/NVLM-D-72B https://nvlm-project.github.io/
- Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, Ping Luo, 17 Oct 2024, Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation, https://arxiv.org/abs/2410.13848 https://github.com/deepseek-ai/Janus?tab=readme-ov-file
- Carl Franzen, October 23, 2024, OpenAI researchers develop new model that speeds up media generation by 50X, https://venturebeat.com/ai/openai-researchers-develop-new-model-that-speeds-up-media-generation-by-50x/
- Dr. Ashish Bamania, Nov 2024, XNets Are Here To Outcompete MLPs & KANs: A deep dive into XNets, a new neural network architecture that outperforms MLPs, KANs, and PINNs across various benchmarks, along with a guide to building one from scratch, https://levelup.gitconnected.com/xnets-are-here-to-outcompete-mlps-kans-3ff569819165
- Xin Li, Zhihong Xia, Hongkun Zhang, 28 Sep 2024, Cauchy activation function and XNet, https://arxiv.org/abs/2409.19221
- Felix Petersen, Hilde Kuehne, Christian Borgelt, Julian Welzel, Stefano Ermon, 7 Nov 2024, Convolutional Differentiable Logic Gate Networks, 38th Conference on Neural Information Processing Systems (NeurIPS 2024), https://arxiv.org/abs/2411.04732
- H Xu, Z Bi, H Tseng, X Song, P Feng, From Transformers to the Future: An In-Depth Exploration of Modern Language Model Architectures, https://osf.io/n8r5j/download
- Xin Dong, Yonggan Fu, Shizhe Diao, Wonmin Byeon, Zijia Chen, Ameya Sunil Mahabaleshwarkar, Shih-Yang Liu, Matthijs Van Keirsbilck, Min-Hung Chen, Yoshi Suhara, Yingyan Lin, Jan Kautz, Pavlo Molchanov, 20 Nov 2024, Hymba: A Hybrid-head Architecture for Small Language Models, https://arxiv.org/abs/2411.13676
- Gil Dibner, Sep 25, 2024, Am I thinking about AI the right way? Angular Ventures, https://medium.com/angularventures/am-i-thinking-about-ai-the-right-way-4513760cd83e
- Vincent-Pierre Berges, Barlas Oguz, December 12, 2024, Memory Layers at Scale, Meta, https://ai.meta.com/research/publications/memory-layers-at-scale/ https://github.com/facebookresearch/memory (Augmentation of an LLM with an additional key-value associative memory, by replacing some FFNs with a "memory layer"; a simplified memory-layer sketch appears after this list.)
- Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem, Yongqin Xian, Jan Eric Lenssen, Liwei Wang, Federico Tombari, Bernt Schiele, 30 Oct 2024, TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters, https://haiyang-w.github.io/tokenformer.github.io/ (Unique novel token-based attention mechanism.)
- Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, Jun Xie, Arthur Szlam, 23 Dec 2024, Deliberation in Latent Space via Differentiable Cache Augmentation, https://arxiv.org/abs/2412.17747 (Doing additional processing of the KV cache data to improve accuracy.)
- Paul Sawers, January 23, 2025, Meta’s Yann LeCun predicts a ‘new AI architectures paradigm’ within 5 years and ‘decade of robotics’, https://techcrunch.com/2025/01/23/metas-yann-lecun-predicts-a-new-ai-architectures-paradigm-within-5-years-and-decade-of-robotics/
- Akash Bajwa, Feb 03, 2025, Forward Deployed Engineers: A Means To An End For AI Startups: Capturing Business Logic And Expert Reasoning, https://akashbajwa.substack.com/p/forward-deployed-engineers-a-means ("AI truly is a new way of computing, and that means the better analogies are to computing itself. Transformers are the transistor, and mainframes are today’s models. The GUI is, arguably, still TBD.")
- Marina Temkin, February 26, 2025, Inception emerges from stealth with a new type of AI model, https://techcrunch.com/2025/02/26/inception-emerges-from-stealth-with-a-new-type-of-ai-model/ (This is a "Diffusion Language Model" or DLM.)
- Jacinta Bowler, Wed 5 Mar 2025, Melbourne start-up launches 'biological computer' made of human brain cells, ABC Science, https://www.abc.net.au/news/science/2025-03-05/cortical-labs-neuron-brain-chip/104996484 (LOL, human brains strike back!)
- Dr. Ashish Bamania, March 3rd, 2025, ‘FANformer’ Is The New Game-Changing Architecture For LLMs: A deep dive into how FANFormer architecture works and what makes it so powerful compared to Transformers, https://levelup.gitconnected.com/fanformer-is-the-new-game-changing-architecture-for-llms-d56999fab7f2
- Yihong Dong, Ge Li, Xue Jiang, Yongding Tao, Kechi Zhang, Hao Zhu, Huanyu Liu, Jiazheng Ding, Jia Li, Jinliang Deng, Hong Mei, 28 Feb 2025, FANformer: Improving Large Language Models Through Effective Periodicity Modeling, https://www.arxiv.org/abs/2502.21309 (A rough sketch of the periodic-feature idea appears after this list.)
- lucalp, 24 Jun 2025, The Bitter Lesson is coming for Tokenization: a world of LLMs without tokenization is desirable and increasingly possible, https://lucalp.dev/bitter-lesson-tokenization-and-blt/
- Dr. Ashish Bamania, Aug 2025, Hierarchical Reasoning Model: An AI Architecture That Beats OpenAI’s ‘o3-mini-high’ Is Here: A deep dive into the Hierarchical Reasoning Model (HRM) to understand its internals that help it outperform powerful reasoning models available to us today, https://ai.gopubby.com/hierarchical-reasoning-model-an-ai-architecture-that-beats-openais-o3-mini-high-is-here-2c3128ba1727
- Kenneth Wolters, Aug 12, 2025, No AGI in Sight: What This Means for LLMs, https://kennethwolters.com/posts/no-agi/
- Beining Wu, Jun Huang, Shui Yu, 25 Jul 2025, "X of Information" Continuum: A Survey on AI-Driven Multi-dimensional Metrics for Next-Generation Networked Systems, https://arxiv.org/abs/2507.19657
- Ayan Biswas, Terece L. Turton, Nishath Rajiv Ranasinghe, Shawn Jones, Bradley Love, William Jones, Aric Hagberg, Han-Wei Shen, Nathan DeBardeleben and Earl Lawrence, 18 Jul 2025, VizGenie: Toward Self-Refining, Domain-Aware Workflows for Next-Generation Scientific Visualization, https://arxiv.org/abs/2507.21124
- Nadja R. Ging-Jehli, Russell K. Childers, Joshua Lu, Robert Gemma, Rachel Zhu, 11 Jul 2025, Gearshift Fellowship: A Next-Generation Neurocomputational Game Platform to Model and Train Human-AI Adaptability, https://arxiv.org/abs/2508.00850
- Liangbo Ning, Ziran Liang, Zhuohang Jiang, Haohao Qu, Yujuan Ding, Wenqi Fan, Xiao-yong Wei, Shanru Lin, Hui Liu, Philip S. Yu, Qing Li, 5 Aug 2025, A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models, https://arxiv.org/abs/2503.23350
- Fardis Nadimi, Payam Abdisarabshali, Kasra Borazjani, Jacob Chakareski, Seyyedali Hosseinalipour, 5 Aug 2025, Multi-Modal Multi-Task Federated Foundation Models for Next-Generation Extended Reality Systems: Towards Privacy-Preserving Distributed Intelligence in AR/VR/MR, https://arxiv.org/abs/2506.05683
- Huan Zhang, Daokun Zhang, Kexin Meng, and Geoffrey I. Webb, 15 Aug 2025, Towards the Next-generation Bayesian Network Classifiers, https://arxiv.org/abs/2508.11145
- Suman Saha, Fatemeh Rahbari, Farhan Sadique, Sri Krishna Chaitanya Velamakanni, Mahfuza Farooque, William J. Rothwell, 13 Aug 2025, Next-Gen Education: Enhancing AI for Microlearning, https://arxiv.org/abs/2508.11704
- Jesmin Jahan Tithi, Hanjiang Wu, Avishaii Abuhatzera, Fabrizio Petrini, 19 Aug 2025, Scaling Intelligence: Designing Data Centers for Next-Gen Language Models, https://arxiv.org/abs/2506.15006
- Pengsong Zhang, Xiang Hu, Guowei Huang, Yang Qi, Heng Zhang, Xiuxu Li, Jiaxing Song, Jiabin Luo, Yijiang Li, Shuo Yin, Chengxiao Dai, Eric Hanchen Jiang, Xiaoyan Zhou, Zhenfei Yin, Boqin Yuan, Jing Dong, Guinan Su, Guanren Qiao, Haiming Tang, Anghong Du, Lili Pan, Zhenzhong Lan, Xinyu Liu, 20 Aug 2025, aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists, https://arxiv.org/abs/2508.15126
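A few simplified code sketches of ideas from the papers above follow. They are illustrative sketches under stated assumptions, not the papers' implementations. The first covers two Transformer-block variants: a parallel sub-layer block in the spirit of Kraken, where attention and the FFN read the same input so the two branches can run concurrently, and a simplified block in the spirit of He & Hofmann's Simplifying Transformer Blocks, where the value and output projections are dropped so the attention weights act directly on the block input. All dimensions and initializations here are made-up assumptions.

```python
# Illustrative sketches (not the papers' exact code) of two Transformer-block variants.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(x, Wq, Wk):
    q, k = x @ Wq, x @ Wk
    return softmax(q @ k.T / np.sqrt(q.shape[-1]))      # (seq, seq) mixing weights

def ffn(x, W1, W2):
    return np.maximum(0.0, x @ W1) @ W2                  # plain ReLU MLP

def parallel_block(x, Wq, Wk, Wv, Wo, W1, W2):
    # Attention and FFN branches both read the same input x; their outputs are summed,
    # so neither branch waits for the other (the parallel-sublayer idea).
    attn_out = attention_weights(x, Wq, Wk) @ (x @ Wv) @ Wo
    return x + attn_out + ffn(x, W1, W2)

def simplified_block(x, Wq, Wk, W1, W2):
    # No value or output projection: the attention weights mix the raw inputs directly.
    return ffn(attention_weights(x, Wq, Wk) @ x, W1, W2)

rng = np.random.default_rng(0)
seq, d = 4, 16
x = rng.standard_normal((seq, d))
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
W1, W2 = rng.standard_normal((d, 4 * d)) * 0.1, rng.standard_normal((4 * d, d)) * 0.1
print(parallel_block(x, Wq, Wk, Wv, Wo, W1, W2).shape)   # (4, 16)
print(simplified_block(x, Wq, Wk, W1, W2).shape)          # (4, 16)
```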
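Next, a minimal sketch of the Kolmogorov-Arnold Network (KAN) idea: every edge carries its own learnable one-dimensional function and each output node simply sums its incoming edges. The original work parameterizes edge functions with B-splines plus a base activation; here each edge function is a learnable mix of a few fixed basis functions, and the basis choice and sizes are assumptions for illustration only.

```python
# Minimal KAN-style layer: learnable univariate functions on edges, sums at nodes.
import numpy as np

def edge_bases(x, centers):
    """Evaluate fixed 1-D basis functions at x: a SiLU term plus Gaussian bumps."""
    silu = x / (1.0 + np.exp(-x))
    bumps = np.exp(-(x[..., None] - centers) ** 2)      # (..., n_centers)
    return np.concatenate([silu[..., None], bumps], axis=-1)

def kan_layer(x, coeffs, centers):
    """x: (n_in,), coeffs: (n_in, n_out, n_basis). Returns (n_out,)."""
    phi = edge_bases(x, centers)                         # (n_in, n_basis)
    # Edge (i, j) applies its own function phi_i . coeffs[i, j]; node j sums over i.
    return np.einsum("ib,iob->o", phi, coeffs)

rng = np.random.default_rng(0)
n_in, n_out = 3, 2
centers = np.linspace(-2.0, 2.0, 5)                     # 5 bump centers (assumed)
coeffs = rng.standard_normal((n_in, n_out, 1 + len(centers))) * 0.1
print(kan_layer(rng.standard_normal(n_in), coeffs, centers))   # 2 output values
```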
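Third, a rough sketch of a trainable key-value memory layer of the kind described in Memory Layers at Scale: a token's hidden state is used as a query, the top-k most similar trainable keys are selected, and their values are mixed with softmax weights in place of an FFN output. The real design uses product-key factorization and far larger memories; the sizes and missing training loop here are assumptions.

```python
# Simplified key-value memory layer (no product keys, no training loop).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_layer(query, keys, values, k=4):
    """query: (d,), keys: (n_mem, d), values: (n_mem, d) -> (d,)."""
    scores = keys @ query                      # similarity of the query to every memory key
    top = np.argpartition(scores, -k)[-k:]     # indices of the k best-matching keys
    weights = softmax(scores[top])             # normalize only over the selected keys
    return weights @ values[top]               # weighted mix of their values

rng = np.random.default_rng(0)
d, n_mem = 16, 1024
keys = rng.standard_normal((n_mem, d))
values = rng.standard_normal((n_mem, d))
hidden = rng.standard_normal(d)
print(memory_layer(hidden, keys, values).shape)    # (16,) -- stands in for an FFN output
```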
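Finally, a loose sketch of the periodicity-modelling idea behind FANformer: a Fourier Analysis Network (FAN) style layer concatenates periodic features, the sin and cos of a learned projection, with an ordinary nonlinear projection, so the layer can represent periodic structure directly. The split between periodic and non-periodic features, the ReLU choice, and all dimensions are illustrative assumptions; see the paper for the exact formulation.

```python
# FAN-style layer: periodic (sin/cos) features concatenated with plain nonlinear features.
import numpy as np

def fan_layer(x, Wp, Wb, b):
    """x: (d_in,). Returns periodic and non-periodic features concatenated."""
    periodic = Wp.T @ x                        # learned "frequencies" applied to the input
    plain = np.maximum(0.0, Wb.T @ x + b)      # ordinary ReLU features (assumed nonlinearity)
    return np.concatenate([np.cos(periodic), np.sin(periodic), plain])

rng = np.random.default_rng(0)
d_in, d_p, d_plain = 16, 4, 8
Wp = rng.standard_normal((d_in, d_p))
Wb = rng.standard_normal((d_in, d_plain))
b = np.zeros(d_plain)
print(fan_layer(rng.standard_normal(d_in), Wp, Wb, b).shape)   # (16,) = 4 cos + 4 sin + 8 plain
```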
AI Books from Aussie AI
- The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory. Get your copy from Amazon: The Sweetest Lesson
- RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures. Get your copy from Amazon: RAG Optimization
- Generative AI Applications book. Get your copy from Amazon: Generative AI Applications
- Generative AI programming book. Get your copy from Amazon: Generative AI in C++
- CUDA C++ Optimization book. Get your copy from Amazon: CUDA C++ Optimization
- CUDA C++ Debugging book. Get your copy from Amazon: CUDA C++ Debugging
More AI Research Topics
Read more about:
- 500+ LLM Inference Optimization Techniques
- What's Hot in LLM Inference Optimization in 2025?
- Inference Optimization Research
- « Research Home