Aussie AI

Limitations of LLMs

  • Last Updated 27 August, 2025
  • by David Spuler, Ph.D.

LLMs can do some amazing new things, but they also have a lot of limitations. This article is a deep dive into limitations in various categories:

  • Risks and safety
  • Reasoning limitations
  • Computational limitations

Safety Risks and Limitations

Your average LLM has problems with:

  • Inaccuracies or misinformation (wrong facts or omissions)
  • Biases (of many types)
  • Insensitivity (e.g. when writing eulogies)
  • Gullibility (not challenging the input text)
  • Hallucinations (plausible-looking made-up facts)
  • Confabulation (wrongly merging two sources)
  • Dangerous or harmful answers (e.g. wrong mushroom picking advice)
  • Plagiarism (in its training data set)
  • Paraphrasing (plagiarism-like)
  • Sensitive topics (the LLM requires training on each and every one)
  • Training data quality ("Garbage in, garbage out")
  • Alignment (people have purpose; LLMs only have language)
  • Security (e.g. "jailbreaks")
  • Refusal (knowing when it should)
  • Personally Identifiable Information (PII) (e.g., emails or phone numbers in training data)
  • Proprietary data leakage (e.g., trade secrets in an article used in a training data set)
  • Surfacing inaccurate or outdated information
  • Over-confidence (it knows not what it says)
  • Veneer of authority (users tend to believe the words)
  • Use for nefarious purposes (e.g., by hackers)
  • Transparency (of the data, of the guardrails, of how it works, etc.)
  • Privacy issues (sure, but Googling online has similar issues, so this isn't as new as everyone says)
  • Legal issues (copyright violations, patentability, copyrightability, and more)
  • Regulatory issues (inconsistent)
  • Unintended consequences
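Some of these risks can be partially mitigated before text ever reaches (or leaves) a model. As a minimal sketch of the PII point above, assuming a regex-based filter is acceptable for illustration (real PII detection needs far broader pattern coverage than this):

```python
import re

# Illustrative patterns only; real PII detection needs many more patterns
# (names, addresses, ID numbers, international phone formats, etc.).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}

def scrub_pii(text: str) -> str:
    """Replace matched PII spans with a [REDACTED:<type>] placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

print(scrub_pii("Contact jane@example.com or 555-123-4567."))
# Contact [REDACTED:email] or [REDACTED:phone].
```

The same scrubbing can be applied to training data before ingestion, or to model output before it reaches the user.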

Reasoning Limitations

Let's begin with some of the limitations that have largely been solved:

  • Words about words (e.g. "words", "sentences", etc.)
  • Writing style, tone, reading level, etc.
  • Ending responses nicely with stop tokens and max tokens
  • Tool integrations (e.g. clocks, calendars, calculators)
  • Cut-off date for training data sets
  • Long contexts
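To see why response endings count as "largely solved": modern generation loops end cleanly either when the model emits a stop token or when a max-token budget runs out. A minimal sketch, with a hypothetical `next_token` function standing in for a real model:

```python
# Sketch of a generation loop showing how stop tokens and a max-token
# budget end a response. `next_token` is a stand-in for a real model.
STOP_TOKEN = "<eos>"

def generate(next_token, prompt, max_tokens=50):
    tokens = []
    for _ in range(max_tokens):          # hard budget: max tokens
        tok = next_token(prompt, tokens)
        if tok == STOP_TOKEN:            # model signals it is finished
            break
        tokens.append(tok)
    return tokens

# Toy "model" that emits three tokens and then the stop token.
canned = iter(["Hello", "world", "!", STOP_TOKEN])
print(generate(lambda prompt, tokens: next(canned), "Hi"))  # ['Hello', 'world', '!']
```

Early LLMs often rambled past a natural ending; stop-token training plus a hard token budget is what fixed that.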

Other issues remain open:

  • Explainability
  • Attribution (source citations)
  • Logical reasoning
  • Planning
  • Probabilistic, non-deterministic methods (randomized token sampling)
  • Mathematical reasoning
  • Banal, bland, or overly formal writing
  • Math word problems
  • Crosswords and other word puzzles (e.g. anagrams, alliteration)
  • Repetition (e.g., if it has nothing new to add, it may repeat a prior answer, rather than admitting that)
  • Specialized domains (e.g. jargon, special meanings of words)
  • Prompt engineering requirements (awkward wordings! Nobody really talks like that.)
  • Oversensitivity to prompt variations (and yet, sadly, prompt engineering works)
  • Ambiguity (of input queries)
  • Over-explaining
  • Nonsense answers
  • Americanisms (e.g., word spellings and implied meanings, cultural issues like "football", etc.)
  • Model "drift" (decline in accuracy over time)
  • Non-repeatability (same question, different answer)
  • Novice assumption (not identifying a user's higher level of knowledge from words in the questions; dare I say it's a kind of "AI-splaining")
  • Words and meanings are not the same thing.
  • Gibberish output (usually a bug; Transformers are just C++ programs, you know)
  • Lack of common sense (although I know some people like that, too)
  • Lack of a "world model"
  • Lack of a sense of personal context (they don't understand what it means to be a person)
  • Time/temporal reasoning (the concept of things happening in sequence is tricky)
  • 3D scene visualization (LLMs struggle to understand the relationship between objects in the real world)
  • Sarcasm and satire (e.g. articles espousing the benefits of "eating rocks")
  • Spin, biased viewpoints, and outright disinformation/deception (of source content)
  • Going rogue (usually a bug, or is it?)
  • Trick questions (e.g., queries that look like common online puzzles, but aren't quite the same)
  • Falling back on training data (overly complex answers)
  • Detecting intentional deception or other malfeasance by users
  • Not asking follow-up questions to clarify user requests (though this capability has been improving quickly)
  • Not correctly prioritizing parts of the request (i.e., given multiple requests in a prompt instruction, it doesn't always automatically know which things are most important to you)
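The non-repeatability and probabilistic-method items above both come from randomized token sampling. A minimal sketch with toy logits (not a real model) shows why the same question can yield different answers at temperature 1.0, while greedy decoding (temperature 0) always repeats itself:

```python
import math
import random

def sample(logits, temperature, rng):
    """Pick a token index: greedy at temperature 0, randomized otherwise."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)                       # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights)[0]

logits = [2.0, 1.9, 0.5]                  # two near-tied candidate tokens
rng = random.Random(0)

greedy = {sample(logits, 0, rng) for _ in range(50)}     # always index 0
sampled = {sample(logits, 1.0, rng) for _ in range(50)}  # typically several indices
print(greedy, sampled)
```

With near-tied logits, temperature sampling spreads choices across candidates, so rerunning the same prompt diverges after the first differing token.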

Computational Limitations

There's really only one big problem with AI computation: it's slooow. Hence, the need for all of those expensive GPU chips. This leads to problems with:

  • Cloud data center execution costs (expensive GPU time)
  • AI phone execution problems (e.g., frozen phone, battery depletion, overheating)
  • AI PC execution problems (big models are still too slow to run)
  • Training data set requirements (they need to feed on lots of tokens)
  • Environmental impact (e.g., by one estimate, AI answers need roughly ten times more data center electricity than non-AI internet searches)
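The cost problem is easy to estimate with the common rule of thumb that a dense transformer forward pass costs roughly 2 FLOPs per parameter per generated token. A minimal sketch (the model size and response length are illustrative):

```python
# Back-of-envelope inference cost, using the common rule of thumb that
# a dense transformer forward pass costs ~2 FLOPs per parameter per token.
def inference_flops(params: float, tokens: int) -> float:
    return 2.0 * params * tokens

model_params = 7e9        # a 7B-parameter model (illustrative)
response_tokens = 500     # one medium-length answer (illustrative)

flops = inference_flops(model_params, response_tokens)
print(f"{flops:.1e} FLOPs")   # 7.0e+12 FLOPs for a single answer
```

Trillions of floating-point operations for one answer is why GPU hardware, and its electricity bill, dominates the economics of LLM serving.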

More Research on Limitations

Research papers that cover various other AI limitations:

AI Books from Aussie AI



The Sweetest Lesson: Your Brain Versus AI: new book on AI intelligence theory:
  • Your brain is 50 times bigger than the best AI engines.
  • Truly intelligent AI will require more compute!
  • Another case of the bitter lesson?
  • Maybe it's the opposite of that: the sweetest lesson.

Get your copy from Amazon: The Sweetest Lesson



RAG Optimization: Accurate and Efficient LLM Applications: new book on RAG architectures:
  • Smarter RAG
  • Faster RAG
  • Cheaper RAG
  • Agentic RAG
  • RAG reasoning

Get your copy from Amazon: RAG Optimization



Generative AI Applications book:
  • Deciding on your AI project
  • Planning for success and safety
  • Designs and LLM architectures
  • Expediting development
  • Implementation and deployment

Get your copy from Amazon: Generative AI Applications



Generative AI in C++: Generative AI programming book:
  • Generative AI coding in C++
  • Transformer engine speedups
  • LLM models
  • Phone and desktop AI
  • Code examples
  • Research citations

Get your copy from Amazon: Generative AI in C++



CUDA C++ Optimization book:
  • Faster CUDA C++ kernels
  • Optimization tools & techniques
  • Compute optimization
  • Memory optimization

Get your copy from Amazon: CUDA C++ Optimization



CUDA C++ Debugging book:
  • Debugging CUDA C++ kernels
  • Tools & techniques
  • Self-testing & reliability
  • Common GPU kernel bugs

Get your copy from Amazon: CUDA C++ Debugging
