Zero-shot

The ability of a model to perform a task it was not explicitly trained on, relying on generalisation from pre-training rather than task-specific examples.

Zero-shot describes a model performing a task it was never explicitly trained on. The model relies on the general capabilities it acquired during pre-training: it receives no additional training for the task and no worked examples of it, only the input itself and, for a language model, an instruction describing what to do.

This property is central to why modern embedding and language models are so useful. A general-purpose embedding model can produce meaningful vectors for text in a domain it was never specifically trained on, and a language model can answer questions or follow instructions of kinds it never directly practised. Zero-shot performance is what lets these models be applied broadly without bespoke training for every new use case.

For vector search, zero-shot capability means an off-the-shelf embedding model can often power semantic search over your data with no fine-tuning, producing useful results immediately. When zero-shot quality is not sufficient for a specialised domain, fine-tuning on domain examples can improve it. Even then, zero-shot performance lowers the barrier to building retrieval systems, because a working baseline exists before any training data has been collected.
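As a rough illustration of the vector search case, the sketch below ranks documents against a query by cosine similarity, with no training step anywhere in the pipeline. The names `embed`, `search`, `DIM` and the sample `documents` are assumptions made for this example. In particular, `embed` is only a stand-in for a pre-trained embedding model: it hashes words into a vector, so it captures word overlap rather than meaning. In a real system you would replace it with a call to an off-the-shelf model, and the search logic would stay the same.

```python
import hashlib
import math
import re

DIM = 4096  # vector size for the stand-in embedder (arbitrary choice)


def embed(text):
    """Stand-in for a frozen, pre-trained embedding model (hypothetical).

    Hashes each word into one slot of a fixed-size vector and normalises
    the result to unit length. A real model would encode meaning; this
    placeholder only reflects which words appear.
    """
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % DIM
        vec[index] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def search(query, documents, top_k=3):
    """Return the top_k (score, document) pairs by cosine similarity.

    Nothing here is fitted to the documents: the embedder is used as-is,
    which is what makes the search zero-shot.
    """
    query_vec = embed(query)
    scored = []
    for doc in documents:
        doc_vec = embed(doc)
        # Both vectors are unit length, so the dot product is the cosine.
        score = sum(q * d for q, d in zip(query_vec, doc_vec))
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Sample corpus, invented for this example.
documents = [
    "Reset your password from the account settings page",
    "Shipping takes three to five business days",
    "Refunds are issued to the original payment method",
]

for score, doc in search("reset password", documents, top_k=2):
    print(f"{score:.3f}  {doc}")
```

If zero-shot quality turned out to be insufficient for a specialised domain, only `embed` would need to change, for example to a fine-tuned model, while `search` and the stored documents could remain untouched.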