Hallucination

The tendency of LLMs to generate plausible-sounding but factually incorrect information not supported by their training data or retrieved context.

A hallucination is when a language model produces information that sounds confident and plausible but is actually false or unsupported. Because these models generate text by predicting likely continuations rather than by looking up facts, they can fluently state things that are simply wrong — inventing citations, statistics, or details that never existed.

Hallucination is one of the central challenges in deploying language models for anything where accuracy matters. The model has no built-in sense of what it does and does not know; it fills gaps with statistically likely text, which can be indistinguishable in tone from a correct answer. This makes hallucinations particularly dangerous, since they do not announce themselves.

Retrieval-augmented generation is the primary defence. By retrieving relevant, authoritative context from a document store (commonly a vector database) and instructing the model to ground its answer in that evidence, you give it real facts to work from rather than leaving it to improvise. Combined with techniques like asking the model to cite sources and to say when it does not know, grounding substantially reduces hallucination, though it does not eliminate it entirely.
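
The grounding pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: `Passage`, `retrieve`, and `build_grounded_prompt` are hypothetical names, and the word-overlap scoring is a toy stand-in for the embedding similarity search a real vector database would perform. The resulting prompt would then be sent to whichever model you use; that call is left out here.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source_id: str  # hypothetical identifier the model is asked to cite
    text: str


def _tokens(text: str) -> set[str]:
    # Lowercase words with surrounding punctuation stripped.
    return {w.strip(".,?!;:").lower() for w in text.split()} - {""}


def retrieve(question: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Toy retriever: rank passages by word overlap with the question.

    Assumption: a real system would use embedding similarity instead.
    """
    q = _tokens(question)
    scored = [(len(q & _tokens(p.text)), p) for p in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop passages that share no words with the question.
    return [p for score, p in ranked[:k] if score > 0]


def build_grounded_prompt(question: str, passages: list[Passage]) -> str:
    """Build a prompt that asks for grounded, cited answers and allows abstaining."""
    if passages:
        context = "\n".join(f"[{p.source_id}] {p.text}" for p in passages)
    else:
        context = "(no relevant passages found)"
    return (
        "Answer using only the context below. "
        "Cite the source id in square brackets after each claim. "
        'If the context does not contain the answer, reply "I don\'t know."\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    corpus = [
        Passage("doc1", "The Eiffel Tower was completed in 1889."),
        Passage("doc2", "Mount Everest is 8,849 metres tall."),
    ]
    question = "When was the Eiffel Tower completed?"
    print(build_grounded_prompt(question, retrieve(question, corpus, k=1)))
```

The three instructions in the prompt mirror the techniques named in the text: answer only from the retrieved evidence, cite sources, and say so when the answer is not there. When retrieval finds nothing, the prompt states that explicitly so the model has a clear cue to abstain rather than improvise.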