Aussie AI

10. Concepts

  • Book Excerpt from "The Sweetest Lesson: Your Brain vs AI"
  • by David Spuler, Ph.D.


 

 

 

“Learning to fly is not pretty

but flying is.”

— Satya Nadella, Hit Refresh, 2017.

 

 

 

What are Concepts?

It’s a good question, and there isn’t a really satisfactory answer. If concepts were a simple concept (haha), then AI reasoning models would be a lot easier to build. The basic ideas about concepts include:

  • High-level meaning
  • Semantics not syntax

Concepts are about meaning, not words. There are also different levels of concepts, such as:

  • Basic things — e.g., nouns like “cat”
  • Basic actions — e.g., verbs like “jump”
  • Abstract meanings — e.g., “symmetrical”
  • Emotional meanings — e.g., “happiness”

As you can see from the above, it’s not entirely clear what a concept really means (i.e., the “concept of a concept” is vague!). However, there’s one thing that’s clear about concepts:

    Humans are better at concepts than AI models.

Maybe words are the problem. Humans think in images and relationships and concepts, whereas words are only loosely mapped onto these ideas.

Hence, there’s a whole area of LLM research about having them think in “concepts” rather than words. The idea is to still use “tokens,” but to have these tokens represent higher-level meanings as “concept tokens” rather than just words.

AI reasoning about concepts is usually called “latent reasoning” because it’s done in the “latent space” of abstract representations. I guess that’s distinct from the “word space” where most LLMs are stuck at the moment.
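
For the programmers, here’s a tiny sketch of the difference, in Python with NumPy. The toy vocabulary, the sizes, and the mean-pooling “encoder” are purely illustrative assumptions, not any real model’s internals. The word space is a list of discrete token IDs, whereas the latent space is made of continuous vectors, and a whole sentence can be squashed into one “concept” vector:

    # Sketch: discrete word-token space versus continuous latent "concept" space.
    # The toy vocabulary, sizes, and mean-pooling are illustrative assumptions.
    import numpy as np

    vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
    d_model = 8                              # toy embedding width
    rng = np.random.default_rng(0)
    embedding_table = rng.normal(size=(len(vocab), d_model))

    sentence = "the cat sat on the mat"

    # Word space: a sequence of discrete token IDs, one per word.
    token_ids = [vocab[w] for w in sentence.split()]     # [0, 1, 2, 3, 0, 4]

    # Latent space: continuous vectors; here the whole sentence is pooled
    # into a single vector that stands in for one "concept token".
    word_vectors = embedding_table[token_ids]            # shape (6, 8)
    concept_vector = word_vectors.mean(axis=0)           # shape (8,)

    print("word tokens:   ", token_ids)
    print("concept vector:", concept_vector.round(2))

Latent reasoning methods keep the computation on the continuous-vector side of this sketch, rather than collapsing back to discrete token IDs at every step.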

The idea of concept tokens is not especially revolutionary. After all, in the theory of AI, we’ve already had:

  • Stop tokens
  • Pause tokens
  • Formatting tokens
  • Control tokens
  • Separator tokens
  • Fine-tuning tokens
  • Tool tokens

Hence, yeah sure, why not have concept tokens.

In fact, concept-level tokens are used in several major areas of AI intelligence research:

  • Prompt tuning (“prefix tuning”)
  • Chain-of-Thought decoding tokens (“reasoning tokens”)
  • Large Concept Models

Prompt tuning is a way to make the AI smarter by adding some special concept tokens to the input. This provides extra context to a user’s question that allows the LLM to better understand what it’s “reading” in the input context.
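
Mechanically, prompt tuning prepends a handful of trainable “soft” embedding vectors to the embedded input, and trains only those vectors while the LLM itself stays frozen. Here’s a rough sketch in PyTorch; the base_model interface, the sizes, and the number of prompt tokens are illustrative assumptions, and real prefix-tuning variants also inject vectors at every layer:

    # Sketch of soft prompt tuning: a few trainable "concept" embeddings are
    # prepended to the embedded input, while the base model stays frozen.
    # The base_model interface and all sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class SoftPromptWrapper(nn.Module):
        def __init__(self, base_model, d_model=768, num_prompt_tokens=20):
            super().__init__()
            self.base_model = base_model
            for p in self.base_model.parameters():
                p.requires_grad = False          # freeze the LLM itself
            # The only trainable parameters: the soft prompt embeddings.
            self.soft_prompt = nn.Parameter(0.02 * torch.randn(num_prompt_tokens, d_model))

        def forward(self, input_embeds):         # input_embeds: (batch, seq, d_model)
            batch = input_embeds.size(0)
            prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
            extended = torch.cat([prompt, input_embeds], dim=1)
            return self.base_model(extended)     # hypothetical embeddings-in call

    # Toy usage: nn.Identity() stands in for a frozen LLM that accepts embeddings.
    wrapper = SoftPromptWrapper(nn.Identity())
    out = wrapper(torch.randn(2, 5, 768))
    print(out.shape)                             # torch.Size([2, 25, 768])

Because only those twenty or so vectors get trained, this is far cheaper than fine-tuning the whole model, and the learned vectors act like task-specific concept tokens that don’t correspond to any actual word.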

Reasoning tokens have been used as a way to speed up multi-hop reasoning models. Hence, they are more about efficiency than adding extra intelligence, though obviously the overall reasoning model is about adding capabilities. However, imagine what might be possible if you started with a rewrite of the architecture and put a lot of concept tokens together in a big model.
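
The core trick behind this kind of latent or “continuous” reasoning is simple enough to sketch in Python. The tiny GRU cell below is a toy stand-in for a real LLM, and all of the sizes are illustrative assumptions: instead of decoding a word at every intermediate step, the hidden vector is fed straight back in as the next input for a few “thought” steps, and only the final state is decoded into text.

    # Sketch of latent (continuous) chain-of-thought: the intermediate
    # reasoning steps stay in the hidden-vector space instead of being
    # decoded into words. The GRU cell is a toy stand-in for a real LLM.
    import torch
    import torch.nn as nn

    d_model = 64
    cell = nn.GRUCell(d_model, d_model)      # stand-in for one transformer step
    to_vocab = nn.Linear(d_model, 1000)      # decode to words only at the end

    def latent_reasoning(question_embeds, num_thought_steps=4):
        h = torch.zeros(question_embeds.size(0), d_model)
        # Read the question tokens in the usual way.
        for t in range(question_embeds.size(1)):
            h = cell(question_embeds[:, t, :], h)
        # "Thinking" steps: feed the hidden state back in as the next input,
        # with no word tokens generated (or re-read) in between.
        for _ in range(num_thought_steps):
            h = cell(h, h)
        return to_vocab(h)                   # logits for the final answer token

    logits = latent_reasoning(torch.randn(2, 10, d_model))
    print(logits.shape)                      # torch.Size([2, 1000])

That’s why reasoning tokens are framed here as an efficiency play: each intermediate “thought” costs one forward step, with no long chain of generated words to re-read.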

Large Concept Models

The development of Large Concept Models, or LCMs, was pioneered by Meta’s Facebook AI Research lab in December 2024 (Barrault et al., 2024). As you may have correctly guessed, LCMs are models that:

    1. Use concept tokens, and

    2. Are large.

The LCM built by Meta operates on concepts at the sentence level. Hence, it tries to understand whole sentences, rather than individual words, to discern the one or more major concepts that each sentence aims to describe. The models were trained on a huge amount of data, with input training sets of over 2.7 trillion tokens.
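
A heavily simplified sketch of this sentence-level pipeline looks like the following Python. The sentence splitter, encoder, predictor, and decoder below are all placeholder stand-ins; Meta’s real LCM uses SONAR sentence embeddings for the encoding step. Each sentence becomes one embedding vector, the model predicts the next sentence embedding, and a decoder turns that vector back into words:

    # Sketch of a sentence-level ("concept-level") generation loop.
    # encode_sentence, predict_next, and decode_embedding are placeholders;
    # Meta's actual LCM uses SONAR sentence embeddings for this step.
    import re
    import numpy as np

    D = 256                                  # illustrative embedding width
    rng = np.random.default_rng(0)

    def encode_sentence(sentence):
        """Placeholder sentence encoder: hash words into a fixed-size vector."""
        vec = np.zeros(D)
        for word in sentence.lower().split():
            vec[hash(word) % D] += 1.0
        return vec / max(np.linalg.norm(vec), 1e-9)

    def predict_next(concepts):
        """Placeholder 'large concept model': predicts the next sentence vector."""
        return np.mean(concepts, axis=0) + 0.01 * rng.normal(size=D)

    def decode_embedding(vec):
        """Placeholder decoder from concept space back into text."""
        return "<decoded sentence for a vector with norm %.2f>" % np.linalg.norm(vec)

    text = "Cats are small mammals. They are often kept as pets. Many people love them."
    sentences = re.split(r"(?<=[.!?])\s+", text)
    concepts = [encode_sentence(s) for s in sentences]   # one vector per sentence

    next_concept = predict_next(concepts)                # "reason" in concept space
    print(decode_embedding(next_concept))                # back into words at the end

The hard parts are exactly the pieces that are faked here: a faithful encoder and decoder between sentences and concept vectors, which is discussed further below.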

This approach generalizes the use of concept tokens beyond their more specific uses elsewhere: as reasoning tokens in multi-hop inference, and as extra non-word prefix tokens in prompt tuning (“prefix tuning”).

Concept tokens are central to LCMs. They are trained on concept tokens, and also perform inference in the concept token space. Hence, they’re not using the old-style “word tokens” used by most LLMs today. Nevertheless, LCMs and LLMs have the same types of uses:

  • Generating text or answers
  • Summarization of inputs
  • Reasoning about answers

As shown in various research papers, this idea is workable at scales both small and large. Large Concept Models may indeed be a major step forward beyond the simple tokenized LLMs. The main advantages of using concepts are that the resulting models are:

  • Smarter — better understanding of concepts and ideas.
  • Faster — fewer tokens to process, because each token is “bigger” (see the rough arithmetic after this list).
  • Understandable — you can trace what concepts were used to answer.
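
To put some rough numbers on the “faster” claim (purely illustrative assumptions, not measurements of any real model): self-attention cost grows roughly with the square of the sequence length, so shrinking the sequence by processing sentences instead of words compounds quickly.

    # Back-of-envelope arithmetic for the speed advantage. The token and
    # sentence counts are illustrative assumptions, not real measurements.
    word_tokens = 1000        # e.g., roughly a 750-word document
    sentences = 50            # the same document as sentence-level concepts

    length_reduction = word_tokens / sentences                   # 20x shorter sequence
    attention_reduction = (word_tokens ** 2) / (sentences ** 2)  # ~400x less attention work

    print(f"sequence is {length_reduction:.0f}x shorter")
    print(f"quadratic attention cost shrinks about {attention_reduction:.0f}x")

Of course, each concept token carries more information and costs more to encode and decode, so the real-world gain is smaller than the raw quadratic arithmetic suggests, but the direction of the saving is clear.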

However, LCMs are still in the early stages of their theory, and even the Facebook paper says they are offering “an attempt at an architecture” rather than a finalized technology. One of the areas of difficulty in this method is training an “encoder” that converts sentences into concept tokens. This encoder has to perform “sentence-level embedding” analysis. Maybe we’ll be going back to 2017 with an encoder-decoder architecture soon.

LCMs are certainly interesting, and they are one possible candidate for an extra leap needed to advance further towards achieving AGI. They’re competing against other “next-gen” ideas for AI intelligence, such as:

  • Reasoning models
  • Symbolic execution
  • Knowledge graphs
  • State Space Models (SSMs) such as Mamba and Hyena.
  • Embodied AI models

Time will tell which ideas win out.

References

Concept Tokens: Research papers on the use of concept tokens to represent higher-order types of thinking include:

  1. Sachin Kumar, Sep 17, 2024, Hidden Chain-of-Thought decoding: faster and efficient CoT decoding to improve reasoning of LLMs https://medium.com/@techsachin/hidden-chain-of-thought-decoding-faster-and-efficient-cot-decoding-to-improve-reasoning-of-llms-d95584bc9346 (Token reduction in CoT by compressing language tokens into an internal “hidden” concise token representation.)
  2. Tianqiao Liu, Zui Chen, Zitao Liu, Mi Tian, Weiqi Luo, 13 Sep 2024, Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding, https://arxiv.org/abs/2409.08561
  3. Lance Eliot, Dec 18, 2024, Chain Of Continuous Thought Promises Mighty Boost For LLMs And Generative AI By Blowing Up The Fixation On Tokens, https://www.forbes.com/sites/lanceeliot/2024/12/18/chain-of-continuous-thought-promises-mighty-boost-for-llms-and-generative-ai-by-blowing-up-the-fixation-on-tokens/
  4. Kyle Orland, 13 Dec 2024, Are LLMs capable of non-verbal reasoning? Processing in the “latent space” could help AI with tricky logical questions, https://arstechnica.com/ai/2024/12/are-llms-capable-of-non-verbal-reasoning/
  5. Alex McFarland, December 16, 2024, Meta’s COCONUT: The AI Method That Thinks Without Language, https://www.unite.ai/metas-coconut-the-ai-method-that-thinks-without-language/
  6. Maxime Peyrard, Martin Josifoski, Robert West, 21 Mar 2024, The Era of Semantic Decoding, https://arxiv.org/abs/2403.14562
  7. Hanyu Zhang, Xiting Wang, Chengao Li, Xiang Ao, Qing He, 10 Jan 2025, Controlling Large Language Models Through Concept Activation Vectors, https://arxiv.org/abs/2501.05764 (Training a vector used to control the model on certain attributes.)
  8. Deqian Kong, Minglu Zhao, Dehong Xu, Bo Pang, Shu Wang, Edouardo Honig, Zhangzhang Si, Chuan Li, Jianwen Xie, Sirui Xie, Ying Nian Wu, 3 Feb 2025, Scalable Language Models with Posterior Inference of Latent Thought Vectors, https://arxiv.org/abs/2502.01567
  9. Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein, 7 Feb 2025, Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach, https://arxiv.org/abs/2502.05171
  10. DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng, 5 Feb 2025. Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning, https://arxiv.org/abs/2502.03275
  11. Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason Weston, Xian Li, 12 Feb 2025, LLM Pretraining with Continuous Concepts, https://arxiv.org/abs/2502.08524
  12. Vivek K. Tiwari, 2025, Towards Practical Concept-Based Language Models: An Efficiency-Focused Implementation, https://www.researchgate.net/profile/Vivek-Tiwari-41/publication/388753941_Towards_Practical_Concept-Based_Language_Models_An_Efficiency-Focused_Implementation/links/67a4bf86461fb56424cc6b62/Towards-Practical-Concept-Based-Language-Models-An-Efficiency-Focused-Implementation.pdf
  13. J Liao, R Xie, S Li, X Wang, X Sun, Z Kang, X He, 2025, Multi-Grained Patch Training for Efficient LLM-based Recommendation, https://hexiangnan.github.io/papers/sigir25-PatchRec.pdf

Large Concept Models (LCMs). Research papers on the use of concept tokens and latent space reasoning in very large models include:

  1. LCM team, Loïc Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussà, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk, 15 Dec 2024 (v2), Large Concept Models: Language Modeling in a Sentence Representation Space, https://arxiv.org/abs/2412.08821 https://github.com/facebookresearch/large_concept_model (Model operates at the sentence concept level, using SONAR sentence embeddings.)
  2. Dr. Ashish Bamania, Dec 2024, Meta’s Large Concept Models (LCMs) Are Here To Challenge And Redefine LLMs: A deep dive into ‘Large Concept Model’, a novel language processing architecture and evaluating its performance against state-of-the-art LLMs, https://levelup.gitconnected.com/metas-large-concept-models-lcms-are-here-to-challenge-and-redefine-llms-7f9778f88a87
  3. Hussain Ahmad, Diksha Goel, 8 Jan 2025, The Future of AI: Exploring the Potential of Large Concept Models, https://arxiv.org/abs/2501.05487
  4. Giuliano Liguori, Jan 2025, Large Concept Models (LCM): A New Frontier in AI Beyond Token-Level Language Models, https://www.linkedin.com/pulse/large-concept-models-lcm-new-frontier-ai-beyond-giuliano-liguori--dnj3f/
  5. Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason Weston, Xian Li, 12 Feb 2025, LLM Pretraining with Continuous Concepts, https://arxiv.org/abs/2502.08524
  6. Vishal Rajput, Feb 2025, Forget LLMs, It’s Time For Large Concept Models (LCMs), https://medium.com/aiguys/forget-llms-its-time-for-large-concept-models-lcms-05b75fe43185
  7. Vivek K. Tiwari, 2025, Towards Practical Concept-Based Language Models: An Efficiency-Focused Implementation, https://www.researchgate.net/profile/Vivek-Tiwari-41/publication/388753941_Towards_Practical_Concept-Based_Language_Models_An_Efficiency-Focused_Implementation/links/67a4bf86461fb56424cc6b62/Towards-Practical-Concept-Based-Language-Models-An-Efficiency-Focused-Implementation.pdf
  8. Datacamp, Feb 21, 2025, Large Concept Models: A Guide With Examples: Learn what large concept models are, how they differ from LLMs, and how their architecture leads to improvements in language processing, https://www.datacamp.com/blog/large-concept-models
  9. Mehul Gupta, Jan 5, 2025, Meta Large Concept Models (LCM): End of LLMs? What are LCMs and how is LCM different from LLMs, https://medium.com/data-science-in-your-pocket/meta-large-concept-models-lcm-end-of-llms-68cb0c5cd5cf
  10. By AI Papers Academy, 3 January 2025, Large Concept Models (LCMs) by Meta: The Era of AI After LLMs? https://aipapersacademy.com/large-concept-models/
  11. Andrea Viliotti, 20 Dec 2024, Large Concept Model (LCM): a new paradigm for large-scale semantic reasoning in AI, https://www.andreaviliotti.it/post/large-concept-model-lcm-a-new-paradigm-for-large-scale-semantic-reasoning-in-ai
  12. Leadership in AI, January 2025, Meta’s stunning LCM large concept models for artificial intelligence — they are thinking now! https://www.youtube.com/watch?v=uZ3HCw8ApQ
  13. Lance Eliot, Jan 06, 2025, AI Is Breaking Free Of Token-Based LLMs By Upping The Ante To Large Concept Models That Devour Sentences And Adore Concepts, https://www.forbes.com/sites/lanceeliot/2025/01/06/ai-is-breaking-free-of-token-based-llms-by-upping-the-ante-to-large-concept-models-that-devour-sentences-and-adore-concepts/
  14. Zen the innovator, Jan 5, 2025, Large Concept Models (LCMs), https://medium.com/@ThisIsMeIn360VR/large-concept-models-lcms-d59b86531ef6
  15. Debabrata Pruseth, Jan 2025, LCMs: Large Concept Models – The Path to AGI ( Artificial General Intelligence) & The Future of AI Thinking, https://debabratapruseth.com/lcms-large-concept-models-the-path-to-agi-the-future-of-ai-thinking/
  16. Asif Razzaq, December 15, 2024, Meta AI Proposes Large Concept Models (LCMs): A Semantic Leap Beyond Token-based Language Modeling, https://www.marktechpost.com/2024/12/15/meta-ai-proposes-large-concept-models-lcms-a-semantic-leap-beyond-token-based-language-modeling/
  17. Aniket Hingane, Dec 27, 2024, Practical Advancements in AI: How Large Concept Models Are Redefining the Landscape of LLMs, https://medium.com/@learn-simplified/practical-advancements-in-ai-how-large-concept-models-are-redefining-the-landscape-of-llms-b0220296458b
  18. Siddhant Rai and Vizuara AI, Dec 30, 2024, Large Concept models : Language Modeling in a Sentence Representation Space: Re-imagining the core principles behind representation generation in foundation model, https://vizuara.substack.com/p/large-concept-models-language-modeling?
  19. J Liao, R Xie, S Li, X Wang, X Sun, Z Kang, X He, 2025, Multi-Grained Patch Training for Efficient LLM-based Recommendation, https://hexiangnan.github.io/papers/sigir25-PatchRec.pdf
  20. Ignacio de Gregorio, June 2025, What If We Are All Wrong About AI? The contrarian bet by Meta, in plain English, https://medium.com/@ignacio.de.gregorio.noblejas/what-if-we-are-all-wrong-about-ai-f33a3c64055c

Reasoning Tokens. Research papers on reasoning tokens as a type of higher-level reasoning method include:

  1. Ignacio de Gregorio Noblejas, September 15, 2024, OpenAI Launches o1. Here’s All You Need to Know, https://thetechoasis.beehiiv.com/p/openai-launches-o1-heres-need-know
  2. Shibo Hao, Sainbayar Sukhbaatar, DiJia Su, Xian Li, Zhiting Hu, Jason Weston, Yuandong Tian, 9 Dec 2024, Training Large Language Models to Reason in a Continuous Latent Space, https://arxiv.org/abs/2412.06769 (Performing reasoning in a model trained to operate in the embedding vector space, rather than more directly in the token space.)
  3. Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, Jun Xie, Arthur Szlam, 23 Dec 2024, Deliberation in Latent Space via Differentiable Cache Augmentation, https://arxiv.org/abs/2412.17747 (Augmenting the KV cache with reasoning information so that decoding will mimic multi-step reasoning with fewer tokens required for intermediate steps.)
  4. Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan, 21 Apr 2024 (v3), Think before you speak: Training Language Models With Pause Tokens, https://arxiv.org/abs/2310.02226 (Inserting extra “pause tokens” that trigger the LLM to perform extra reasoning during the decoding phase.)
  5. Jacob Pfau, William Merrill, Samuel R. Bowman, 24 Apr 2024, Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models, https://arxiv.org/abs/2404.15758 (Use of dummy “filler tokens” similar to “pause tokens” or “reasoning tokens” to aid multi-step reasoning in decoding.)
  6. Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman, 18 Mar 2024 (v2), Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, https://arxiv.org/abs/2403.09629 (Introduces answers between a start-of-thought and end-of-thought meta-token for reasoning.)
  7. Lance Eliot, Dec 18, 2024, Chain Of Continuous Thought Promises Mighty Boost For LLMs And Generative AI By Blowing Up The Fixation On Tokens, https://www.forbes.com/sites/lanceeliot/2024/12/18/chain-of-continuous-thought-promises-mighty-boost-for-llms-and-generative-ai-by-blowing-up-the-fixation-on-tokens/
  8. Xuan Shen, Yizhou Wang, Xiangxi Shi, Yanzhi Wang, Pu Zhao, Jiuxiang Gu, 31 Jan 2025, Efficient Reasoning with Hidden Thinking, https://arxiv.org/abs/2501.19201
  9. DiJia Su, Hanlin Zhu, Yingchen Xu, Jiantao Jiao, Yuandong Tian, Qinqing Zheng, 5 Feb 2025. Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning, https://arxiv.org/abs/2502.03275
  10. Jim the AI Whisperer, Feb 2025, I hacked Perplexity AI’s full system prompt when I shared my own cognitive vulnerabilities with it: How I used my own scrambled brain to outwit Perplexity AI, https://medium.com/the-generator/prompt-hacking-perplexity-ai-system-instructions-7aa6ee923060
  11. Kongcheng Zhang, Qi Yao, Baisheng Lai, Jiaxing Huang, Wenkai Fang, Dacheng Tao, Mingli Song, Shunyu Liu, 19 Feb 2025, Reasoning with Reinforced Functional Token Tuning, https://arxiv.org/abs/2502.13389
  12. Ziang Ye, Zhenru Zhang, Yang Zhang, Jianxin Ma, Junyang Lin, Fuli Feng, 19 Dec 2024, Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning, https://arxiv.org/abs/2412.14780
  13. Zhenyi Shen, Hanqi Yan, Linhai Zhang, Zhanghao Hu, Yali Du, Yulan He, 28 Feb 2025, CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation, https://arxiv.org/abs/2502.21074
  14. Guanghao Li, Wenhao Jiang, Mingfeng Chen, Yan Li, Hao Yu, Shuting Dong, Tao Ren, Ming Tang, Chun Yuan, 30 May 2025, SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought, https://arxiv.org/abs/2505.24181

Prompt Tuning. Research papers on prompt tuning or prefix tuning, which is adding concept tokens to prompts for easier processing, include:

  1. IBM, 2024, What is prompt-tuning?, https://research.ibm.com/blog/what-is-ai-prompt-tuning
  2. Abhinav Jain, Swarat Chaudhuri, Thomas Reps, Chris Jermaine, 24 May 2024, Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation, https://arxiv.org/abs/2405.15282
  3. MohammadAli SadraeiJavaeri, Ehsaneddin Asgari, Alice Carolyn McHardy, Hamid Reza Rabiee, 7 Jun 2024, SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings, https://arxiv.org/abs/2406.05279
  4. Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella, 5 Jun 2024, Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need, https://arxiv.org/abs/2406.03216
  5. Xuyang Wu, Zhiyuan Peng, Sravanthi Rajanala, Hsin-Tai Wu, Yi Fang, 31 May 2024, Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models, https://arxiv.org/abs/2405.20654
  6. Wei Zhu, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie, 7 Jun 2024 (v2), IAPT: Instruction-Aware Prompt Tuning for Large Language Models, https://arxiv.org/abs/2405.18203
  7. Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, Qipeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu, 16 Jan 2024, A Survey of Resource-efficient LLM and Multimodal Foundation Models, https://arxiv.org/abs/2401.08092 Project: https://github.com/UbiquitousLearning/Efficient_Foundation_Model_Survey
  8. Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang, 18 Apr 2024 (v2), The Efficiency Spectrum of Large Language Models: An Algorithmic Survey, https://arxiv.org/abs/2312.00678
  9. M Xu, D Cai, W Yin, S Wang, X Jin, X Liu, 2024, Resource-efficient Algorithms and Systems of Foundation Models: A Survey, ACM Computing Surveys, https://dl.acm.org/doi/pdf/10.1145/3706418
  10. Brian Lester, Rami Al-Rfou, and Noah Constant, 2021, The Power of Scale for Parameter-Efficient Prompt Tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3045–3059, Online and Punta Cana, Dominican Republic, Association for Computational Linguistics, https://aclanthology.org/2021.emnlp-main.243/
  11. Shreyansh Shah, Oct 18, 2023, Prompt Tuning: A Powerful Technique for Adapting LLMs to New Tasks, https://medium.com/@shahshreyansh20/prompt-tuning-a-powerful-technique-for-adapting-llms-to-new-tasks-6d6fd9b83557
  12. Data Camp, May 19, 2024, Understanding Prompt Tuning: Enhance Your Language Models with Precision, https://www.datacamp.com/tutorial/understanding-prompt-tuning
  13. Sergey Sedov, Sumanth Bharadwaj Hachalli Karanam, Venu Gopal Kadamba, 24 Dec 2024, Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control, https://arxiv.org/abs/2412.18582
  14. Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang, 20 Mar 2022 (v3), P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks, Proceedings of the 60th Annual Meeting of the Association of Computational Linguistics, 2022, https://arxiv.org/abs/2110.07602 https://github.com/THUDM/P-tuning-v2 (Extends prompt tuning with extra soft prompt tokens at every layer, not just at the start of the input.)
  15. Haowei Zhu, Fangyuan Zhang, Rui Qin, Tianxiang Pan, Junhai Yong, Bin Wang, 24 Dec 2024 (v2), Semantic Hierarchical Prompt Tuning for Parameter-Efficient Fine-Tuning, https://arxiv.org/abs/2412.16956
  16. Xiang Lisa Li, Percy Liang, 1 Jan 2021, Prefix-Tuning: Optimizing Continuous Prompts for Generation, https://arxiv.org/abs/2101.00190 (Precursor to prompt tuning.)
  17. Andrea Matarazzo, Riccardo Torlone, 3 Jan 2025, A Survey on Large Language Models with some Insights on their Capabilities and Limitations, https://arxiv.org/abs/2501.04040 (Broad survey with many LLM topics covered from history to architectures to optimizations.)
  18. Qi Sun, Edoardo Cetin, Yujin Tang, 14 Jan 2025 (v2), Transformer2: Self-adaptive LLMs, https://arxiv.org/abs/2501.06252 (Using a vector to fine-tune dynamically.)
  19. Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert Nowak, 16 Jan 2025, Task Vectors in In-Context Learning: Emergence, Formation, and Benefit, https://arxiv.org/abs/2501.09240
  20. Dan Zhang, Tao Feng, Lilong Xue, Yuandong Wang, Yuxiao Dong, Jie Tang, 23 Jan 2025, Parameter-Efficient Fine-Tuning for Foundation Models, https://arxiv.org/abs/2501.13787
  21. X Li, C Jiang, Nov 2024, Optimizing Prompt Engineering Methods for Enhanced Logical Reasoning in Transformer Models, RMEL ’24, November 4–7, 2024, Hangzhou, China, https://www.researchgate.net/profile/Xiaoyan-Li-42/publication/389182048_Optimizing_Prompt_Engineering_Methods_for_Enhanced_Logical_Reasoning_in_Transformer_Models/links/67b82fa9461fb56424e3fc72/Optimizing-Prompt-Engineering-Methods-for-Enhanced-Logical-Reasoning-in-Transformer-Models.pdf https://github.com/xiaoyanLi629/RMELS2024

 

Online: Table of Contents

PDF: Free PDF book download

Buy: The Sweetest Lesson: Your Brain vs AI

The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson