We Hallucinate Meaning
What are the theoretical limits of LLMs?
A more academic treatment is here: https://nextjournal.com/jbowles/julia-version-of-goldbergs-the-unreasonable-effectiveness-of-character-level-language-models?token=TZ1NdZycJE6DhMREAEEBMu
Short version: distributional semantics is THE principle underlying current LLM approaches. In practice it amounts to a fancy word-counting algorithm that is sensitive to the positional context of each token. The resulting models build up probabilities for which tokens show up in which contextual positions.
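To make the "fancy word-counting" idea concrete, here is a minimal sketch in Python. It is a toy count-based n-gram model, not how a production LLM is built, and the function names (train, next_token_probs), the tiny corpus, and the choice of a two-token context are all assumptions made up for illustration. The point is only that the probabilities come from counting what follows what.

```python
from collections import Counter, defaultdict

def train(tokens, order=2):
    """Count which token follows each context of `order` preceding tokens."""
    counts = defaultdict(Counter)
    for i in range(order, len(tokens)):
        context = tuple(tokens[i - order:i])
        counts[context][tokens[i]] += 1
    return counts

def next_token_probs(counts, context):
    """Turn the raw counts for one context into relative frequencies."""
    seen = counts.get(tuple(context), Counter())
    total = sum(seen.values())
    return {tok: n / total for tok, n in seen.items()}

# Hypothetical toy corpus, split on whitespace.
corpus = (
    "we play frisbee in the park . "
    "we play chess in the park . "
    "we play frisbee in the yard ."
).split()

model = train(corpus, order=2)
probs = next_token_probs(model, ["we", "play"])
print(probs)
```

After "we play" the model prefers "frisbee" over "chess" only because "frisbee" was counted twice and "chess" once. Nothing in the table says what a frisbee is.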
Size of corpora is a limit. The bigger the corpus, the greater the number of probabilities, and the more novel combinations you can generate. Patterns derived from the corpora are structural in nature, not inferential. LLMs have no formal representation of meaning; humans bring the meaning. An LLM has no formal semantic representation of what a combination of tokens means, nor does it have any way to represent what those tokens refer to in the real world.
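A small sketch of how "novel combinations" can fall out of purely structural patterns. This is a toy word-level bigram table in Python, assumed for illustration only; the helper names (successors, generate_all), the <s> and </s> boundary markers, and the two-sentence corpus are hypothetical. It enumerates every sentence the recorded token-to-token transitions allow and then checks which ones never appeared in the corpus.

```python
from collections import defaultdict

def successors(sentences):
    """Record, for each token, the set of tokens seen immediately after it."""
    nxt = defaultdict(set)
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            nxt[a].add(b)
    return nxt

def generate_all(nxt, max_len=10):
    """Enumerate every token chain from <s> to </s> that the table allows."""
    out = []

    def walk(tok, path):
        if tok == "</s>":
            out.append(" ".join(path))
            return
        if len(path) >= max_len:  # guard against cycles in larger corpora
            return
        for b in sorted(nxt[tok]):
            walk(b, path if b == "</s>" else path + [b])

    walk("<s>", [])
    return out

# Hypothetical two-sentence corpus.
corpus = ["we play frisbee in the park", "we play chess in the yard"]

generated = generate_all(successors(corpus))
novel = [s for s in generated if s not in corpus]
print(novel)
```

The table produces "we play chess in the park" and "we play frisbee in the yard", neither of which is in the corpus. They are new recombinations of observed positions, produced without any representation of chess, frisbees, parks, or yards. A bigger corpus means a bigger table and more such recombinations.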
Another way to say this: performance versus competence.
Humans are biased to infer internal/mental competence based on performative actions. When we see an LLM perform a generative action we assume some quality of competence. That is, we attribute forms of knowledge that differ in kind from the syntactic/structural/positional. In the same way (credit Rodney Brooks), when a human says "play frisbee" we attribute to the speaker some qualia of knowledge about a "frisbee."
The statement "play frisbee in a hurricane" is well-formed syntactically but it is problematic. You know why it is problematic. LLMs don't; they can’t tell me that the statement is problematic, nor why.
We will engineer better compression for LLMs; we'll discover/invent more optimizations. But this is engineering. The problem with LLMs is not the engineering. It is the approach. Developing a new approach will likely result in a paradigm shift (Thomas Kuhn), which requires scientific and mathematical discovery. We can’t expect scientific progress on any predictable timeframe.
What we can do is be aware that in the business world there isn’t much reason to invest in scientific progress when there’s a crap-ton of money to be made with current LLMs. Many companies have invested billions (maybe trillions), in the last 10 years specifically, to bring AI tech to you as a paid service. That is, the game is rigged, the house wins, the sunk investments need their returns. The big companies need you to build LLM dependencies into your business.
I don’t personally like the idea of AI forced on us just so investments can return profits. But that’s the reality. It's also reality that LLMs (currently) have very hard limits on what they can represent (performance but not competence). Lastly, we need to recognize that humans have a bias that LLM apps can exploit: we behave as if the performance of the model points to some form of competence, but that competence is our hallucination.