HyDE (Hypothetical Document Embeddings)

A RAG technique in which the LLM first generates a hypothetical answer; the embedding of that answer is then used to retrieve real documents from the vector database.

HyDE (Hypothetical Document Embeddings) is a retrieval technique that improves search by having a language model first write a hypothetical answer to the query, then searching with that answer’s embedding rather than the embedding of the original question.

The insight behind it is that questions and answers often look quite different in embedding space. A short, terse query may not sit near the documents that actually answer it, because those documents are written in the style of answers, not questions. By asking a model to generate a plausible answer first — even one that may contain inaccuracies — you produce text that resembles a real document, whose embedding lands much closer to the genuinely relevant material.
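One way to state the difference more formally is sketched below. The notation is assumed for illustration and does not come from the text above: q is the query, D is the document collection, E is the embedding model, G is the language model that drafts the hypothetical answer, and cosine similarity stands in for whatever similarity measure the vector database uses.

```latex
\text{Direct retrieval:}\quad \hat{d} = \arg\max_{d \in D} \cos\big(E(q),\, E(d)\big)
\qquad
\text{HyDE:}\quad \hat{d} = \arg\max_{d \in D} \cos\big(E(G(q)),\, E(d)\big)
```

The only change is the left-hand argument of the similarity: the document side of the comparison is untouched, so the existing index can be reused as-is.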

In practice, HyDE adds a generation step before retrieval: the model drafts a fake answer, that draft is embedded, and the vector database finds real documents similar to it. The retrieved real documents are then used to produce the final grounded answer. HyDE can noticeably improve retrieval on hard or sparse queries, at the cost of an extra model call per search.
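The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a reference implementation: `embed` is a bag-of-words stand-in for a real embedding model, `fake_llm` is a hard-coded stand-in for an LLM call, an in-memory list plays the role of the vector database, and all function names and sample texts are hypothetical. The final step, generating a grounded answer from the retrieved documents, is left out.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec: Counter, corpus: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to query_vec (stand-in for a vector database)."""
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


def hyde_retrieve(question: str, generate: Callable[[str], str],
                  corpus: List[str], k: int = 1) -> List[str]:
    """HyDE retrieval: draft a hypothetical answer, embed it, search with that embedding."""
    hypothetical = generate(question)        # extra model call before retrieval
    return search(embed(hypothetical), corpus, k)


def fake_llm(question: str) -> str:
    """Hypothetical LLM stub: returns a plausible, unverified draft answer."""
    return ("Plants convert sunlight, water and carbon dioxide into glucose "
            "and oxygen inside chloroplasts.")


CORPUS = [
    "Photosynthesis takes place in chloroplasts, where sunlight, water and "
    "carbon dioxide become glucose and oxygen.",
    "How do I eat more leafy greens? Tips for getting energy from a plant-based diet.",
    "The stock market closed higher on Tuesday.",
]

question = "How do plants get energy?"
direct = search(embed(question), CORPUS, k=1)           # embeds the question itself
hyde = hyde_retrieve(question, fake_llm, CORPUS, k=1)   # embeds the drafted answer

print("direct:", direct[0])
print("HyDE:  ", hyde[0])
```

In this toy corpus the question shares surface words with the diet-tips document, while the drafted answer shares vocabulary with the photosynthesis document, so the two searches return different top results. A real embedding model captures far more than word overlap, so the gap will not always be this stark.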