Explainers·8 min read

What are embeddings and vector search? How AI finds meaning

An embedding turns a piece of text into a long list of numbers that captures its meaning. Get that one idea and the rest follows: it is what lets a computer find things by meaning instead of by exact words, and it quietly powers most of the AI search you are about to want.

An embedding is meaning, written as numbers

When you read “annual leave” and “holiday entitlement,” you know they point at the same thing even though they share no words. A computer matching letters does not. An embedding fixes that. It runs a piece of text - a sentence, a product description, a paragraph from a policy - through a model that outputs a list of numbers, often a few hundred to a couple of thousand of them. That list is the text’s fingerprint of meaning.

The useful property is simple: text with similar meaning gets similar numbers. So the fingerprint for “annual leave” lands close to the one for “holiday entitlement,” and far from the one for “fire safety.” The computer still does not understand the words the way you do. It just compares lists of numbers, and that turns out to be enough to find things by meaning.
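To make "comparing lists of numbers" concrete, here is a minimal sketch. The three fingerprints are made up by hand and only four numbers long. A real embedding model would produce far longer lists. Cosine similarity is one common way to score closeness and is an assumption here, not something every system uses.

```python
import math

def cosine_similarity(a, b):
    """Score how closely two equal-length vectors point the same way (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made "fingerprints" for illustration only, not output from any real model.
annual_leave = [0.9, 0.8, 0.1, 0.0]
holiday_entitlement = [0.8, 0.9, 0.2, 0.1]
fire_safety = [0.1, 0.0, 0.9, 0.8]

# Similar meaning: the score comes out close to 1.
print(cosine_similarity(annual_leave, holiday_entitlement))

# Unrelated meaning: the score comes out much lower.
print(cosine_similarity(annual_leave, fire_safety))
```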

This is what people mean by vector search or semantic search. A vector is just the technical word for that list of numbers. OpenAI describes the pay-off plainly: semantic search surfaces results that are similar in meaning even when they share few or no matching words with what you typed.

Why an owner should care

Keyword search is brittle, and your customers feel it every day. Someone searches your shop for “jumper” and misses every product you labelled “sweater.” Someone asks your help centre “how do I get my money back” and finds nothing, because the page says “refund policy.” The match failed not because the answer was missing, but because the words did not line up.

Embeddings are the engine under the fixes for that:

  • An AI knowledge base over your own documents. Staff and customers ask a question in plain English and get the right passage from your SOPs, policies or product manuals, even when their wording is nothing like yours.
  • Smarter site and product search. Customers describe what they want the way they actually talk, and still land on the right product or page.
  • RAG (retrieval-augmented generation). The pattern where an AI answers from your material rather than from its general training alone. Embeddings are the half that does the finding.

The thread connecting all three: people search in their own words, and meaning-based search keeps up where keyword search gives up.

How it works in practice

There are only two moving parts, and neither is mysterious once you see it.

One: turn your content into embeddings, ahead of time. You take the things you want searchable - every product, every help article, every chunk of a policy - and run each through an embedding model. Out comes a list of numbers for each. You do this up front, then again for any item whose content changes.
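A rough sketch of that first step, under loud assumptions: `toy_embed` and its `CONCEPTS` table are hypothetical stand-ins for a real embedding model, which you would normally call through a provider’s library. The toy version only counts a few hand-picked words, but the shape of the job is the same: embed each item up front, keep the result next to its id, and re-embed an item when it changes.

```python
import math

# Hypothetical concept table standing in for a real embedding model.
# Words that mean roughly the same thing share a slot in the vector.
CONCEPTS = {
    "refund": 0, "money": 0, "back": 0,
    "jumper": 1, "sweater": 1, "knitwear": 1,
    "delivery": 2, "shipping": 2, "postage": 2,
}

def toy_embed(text):
    """Count concept words into a 3-number vector, then scale it to length 1."""
    vector = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word in CONCEPTS:
            vector[CONCEPTS[word]] += 1.0
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector] if length else vector

def build_index(documents):
    """Embed every document up front and keep each vector next to its id."""
    return {doc_id: toy_embed(text) for doc_id, text in documents.items()}

documents = {
    "refund-policy": "Refund policy for online orders",
    "wool-sweater": "Chunky wool sweater in navy",
    "shipping-times": "Shipping and delivery times",
}
index = build_index(documents)

# When one item changes, re-embed just that item rather than everything.
documents["wool-sweater"] = "Chunky wool sweater and knitwear care guide"
index["wool-sweater"] = toy_embed(documents["wool-sweater"])

print(index)
```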

Two: store them, then compare at search time. The lists go into a vector database - a store built to answer one question very fast: which saved items sit closest in meaning to this query? When someone searches, the system embeds their query into its own list of numbers, and the database finds the saved items whose numbers are nearest. “Nearest” really is distance: small distance means similar meaning, large distance means unrelated. Cloudflare’s primer describes a vector database as exactly this - a store that holds embeddings and returns the closest matches quickly enough to power live search across millions of items.
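The search-time comparison can be sketched in a few lines. Everything here is illustrative: the stored vectors and the query vector are made up, as if an embedding model had already produced them, and the plain dictionary stands in for a vector database. Straight-line (Euclidean) distance is one reasonable choice of "nearest", assumed for this sketch.

```python
import math

def distance(a, b):
    """Straight-line (Euclidean) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query_vector, stored, k=2):
    """Return the ids of the k stored items closest to the query, nearest first."""
    ranked = sorted(stored, key=lambda item_id: distance(query_vector, stored[item_id]))
    return ranked[:k]

# Made-up vectors, as if produced earlier by an embedding model.
stored = {
    "refund-policy": [0.9, 0.1, 0.0],
    "wool-sweater": [0.1, 0.9, 0.1],
    "shipping-times": [0.0, 0.2, 0.9],
}

# Made-up vector for the query "how do I get my money back".
query_vector = [0.8, 0.2, 0.1]

# The refund policy sits closest, so it comes first.
print(nearest(query_vector, stored))
```

This version compares the query against every stored item, which is fine for a small set. A vector database typically uses an index to avoid checking every item, which is how it stays fast as the collection grows.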

For RAG, one step is bolted on the end. Instead of just showing the closest passages, the system hands them to a language model and asks it to write a grounded answer. So an “AI that answers from your documents” is really vector search doing the finding and a model doing the writing.
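That bolted-on step is mostly about assembling a prompt. A minimal sketch, with assumptions labelled: the passages are written by hand as if vector search had just returned them, the prompt wording is only one possible phrasing, and `generate` is a hypothetical callable standing in for whichever language model you use. `stub_model` exists only so the example runs without calling any service.

```python
def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one grounded prompt."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer the question using only the passages below. "
        "If the answer is not in them, say so.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}"
    )

def answer_from_documents(question, passages, generate):
    """Hand the grounded prompt to a model; `generate` is a hypothetical callable."""
    return generate(build_prompt(question, passages))

def stub_model(prompt):
    """Stand-in for a real model call so the sketch runs on its own."""
    return f"(a model would answer here, given {len(prompt)} characters of prompt)"

# Illustrative passages, as if vector search had just returned them.
passages = [
    "Refund policy: contact support with your order number to request a refund.",
    "Returns: items must be unused and in their original packaging.",
]
question = "How do I get my money back?"

print(build_prompt(question, passages))
print(answer_from_documents(question, passages, stub_model))
```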

When you actually need it

Embeddings are not for every job. You need them when:

  • People search your content in their own words and keyword matching keeps failing them - a product catalogue, a help centre, a document library.
  • You want an AI assistant that answers from your own material, since that needs retrieval underneath it.
  • The set of things to search is large enough that you cannot just paste it all into a prompt.

You do not need them when:

  • You have a handful of stable documents you can drop straight into a prompt.
  • The only lookups anyone does are exact-match - an order number, a SKU, an invoice ID. A normal database already nails that, faster and cheaper.
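For contrast, here is what an exact-match lookup looks like with an ordinary database. This is a sketch using SQLite from the Python standard library. The `orders` table, its columns and the order numbers are all invented for illustration.

```python
import sqlite3

# In-memory database with a made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_number TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A-1001", "shipped"), ("A-1002", "processing")],
)

def order_status(order_number):
    """Look up one order by its exact number; return None if it does not exist."""
    row = conn.execute(
        "SELECT status FROM orders WHERE order_number = ?", (order_number,)
    ).fetchone()
    return row[0] if row else None

print(order_status("A-1001"))
print(order_status("A-9999"))
```

No embedding model and no vector store are involved: the order number either matches or it does not.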

Start where the search pain is real and measurable, not because the technique sounds clever.

Honest limits

  • Embeddings find; they do not judge. Vector search returns what is closest in meaning. If your underlying content is wrong or out of date, it will confidently surface the wrong thing. Garbage in, well-retrieved garbage out.
  • Quality depends on how you chop up the content. Feed it whole 50-page documents and matches get vague; chop too small and you lose context. Getting this “chunking” right is part of the build, not an afterthought.
  • It is an extra system to run. Embeddings have to be regenerated when content changes, and the vector database is one more thing to host and maintain. Worth it at scale, overkill for a dozen files.
  • Keywords still win sometimes. For exact codes, names and IDs, plain keyword match beats semantic search. The strongest builds often combine the two rather than picking a side.
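Two of these limits are easier to picture with a sketch. `chunk_words` shows one simple way to chop content: fixed-size chunks of words that overlap a little, so a sentence cut at a boundary still appears whole in one of them. `fuse_rankings` shows one common way to combine keyword and semantic results, called reciprocal rank fusion. Both are assumptions for illustration, and the sizes, the constant `k=60` and the item ids are arbitrary choices, not recommendations from any particular product.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into chunks of `size` words, each sharing `overlap` words with the one before."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

def fuse_rankings(keyword_ranked, vector_ranked, k=60):
    """Merge two ranked lists of ids; items ranked well in either list rise to the top."""
    scores = {}
    for ranked in (keyword_ranked, vector_ranked):
        for rank, item_id in enumerate(ranked, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Ten placeholder "words", chunked four at a time with one word of overlap.
sample = " ".join(str(n) for n in range(10))
print(chunk_words(sample, size=4, overlap=1))

# Made-up result lists: the item found by both searches ends up first.
keyword_results = ["sku-123", "returns-page"]
vector_results = ["wool-sweater", "sku-123"]
print(fuse_rankings(keyword_results, vector_results))
```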

The one-line version

An embedding turns text into a list of numbers that captures its meaning, so similar meaning lands on similar numbers and a vector database can find things by meaning rather than exact keywords. That is what powers an AI knowledge base, smarter site and product search, and RAG. Reach for it when people search your content in their own words and keywords keep letting them down - and skip it when a simple paste or an exact-match lookup already does the job.

Frequently asked questions

What is an embedding in plain English?
An embedding is a way of turning text into a list of numbers that captures what the text means. Each piece of text - a sentence, a product description, a paragraph from a policy - gets its own list. The clever part is that text with similar meaning ends up with similar numbers, so “annual leave” and “holiday entitlement” land close together even though they share no words. The computer never understands the words the way you do. It just compares the numbers, and that turns out to be enough to find things by meaning.

How is vector search different from normal keyword search?
Keyword search matches the exact words you typed. Search for “jumper” and you will miss every product labelled “sweater.” Vector search, also called semantic search, compares meaning instead. It turns your query into an embedding, then finds the stored items whose embeddings sit closest to it. OpenAI describes this as surfacing semantically similar results even when they match few or no keywords. In practice that means a customer can describe what they want in their own words and still find it.

What is a vector database and do I need one?
A vector database is a store built to hold embeddings and answer one question very fast: which stored items are closest in meaning to this query? You need one once you have more than a handful of items to search - hundreds of products, thousands of documents. For a small, fixed set you can get away with simpler tools. Most real builds use a dedicated vector database (or a vector feature bolted onto a database you already run) so search stays quick as the content grows.

How do embeddings relate to RAG and a knowledge base?
RAG (retrieval-augmented generation) is the pattern where an AI answers using your own documents. Embeddings and vector search are the retrieval half of it. When someone asks a question, the system embeds the question, uses the vector database to pull the passages whose embeddings are closest, and hands those passages to the AI to write a grounded answer. So an AI knowledge base over your SOPs and policies is really vector search doing the finding and a language model doing the writing.

When does an SMB actually need embeddings?
When people search your content in their own words and keyword matching keeps failing them - a product catalogue, a help centre, a document library. Also when you want an AI assistant that answers from your material, since that needs retrieval underneath. You do not need embeddings for a handful of stable files you can just paste into a prompt, or when exact-match lookup (an order number, a SKU) is all anyone does. Start where the search pain is real and measurable.
