"Language models' outputs are fabrications, even if they seem real"

This title was summarized by AI from the post below.

“All outputs are hallucinations i.e. fabricated and ungrounded. Many of these outputs happen to match reality when there’s abundant training data and repetition.” Yeah, that really nails what it is like working with language models. Very insightful post.

View profile for Simon Wardley

Going through another AI horror story. These tools are great, but please remember the following when you are using LLMs/LMMs:

1) All outputs are hallucinations i.e. fabricated and ungrounded. Many of these outputs happen to match reality when there’s abundant training data and repetition, so they look useful on common tasks. But they cannot do research. These machines are stochastic parrots (Bender et al.); they are pattern matchers, not reasoning engines.

2) These systems will happily invent plausible-seeming but unverified detail. That’s a design feature, not a bug: they are optimised for coherence, not truth.

3) These systems do not understand what they are creating. The use of tools and guardrails is mostly to convince you of their correctness and to hide their inner workings; they are about shaping perception and behaviour, not true comprehension. Yes, guardrails also reduce some classes of harm.

4) These problems are not with the user and their prompting. Stop blaming users for what are design flaws and systematic issues.

5) You cannot "swarm" your way out of these problems. Orchestration doesn’t solve fundamental epistemic limits. However, these systems (including agentic swarms) are extremely useful in the right context and are excellent for creating hypotheses (which then need to be tested).

6) These systems can output long, convincing “scientific” documents full of fabricated metrics, invented methods, and impossible conditions without flagging uncertainty. They cannot be trusted for policy, healthcare, or serious research, because they are far too willing to blur fact and fiction.

7) These systems can and should be used only as a drafting assistant (structuring notes, summarising papers), with all outputs fact-checked by humans who are capable in the field. Think of these systems as a calculator that sometimes “hallucinates” numbers: it should never be blindly trusted to do your tax return.

8) The persuasive but false outputs can cause real harm. These systems are highly persuasive and are designed to be so, hence the coherence, the appearance of "helpfulness" and the use of authoritative language.

9) Being trained on market data, these systems exhibit large biases towards market benefit rather than societal benefit. Think of it like a little Ayn Rand on your shoulder whispering sovereign individual Kool-aid. In other words, the optimisation leans toward market benefit, not necessarily public good.

So, yes ... these tools are great fun and can be useful. But apply critical thinking always. Review the output in detail.

--

Appendix

Many use the term hallucination to mean "error from reality". This implies that the LLM/LMM reasons its way to the correct answers. I take the position that all output is "hallucinated" and that sometimes the output matches reality where we have lots of training data and narrow contexts. I feel this fairly reflects the more statistical nature of LLMs/LMMs, as we haven't built reasoning engines ... yet.
