How vector databases work - and why Google is one

What is a word embedding?

An embedding is a way of turning a word - or a sentence, or a whole document - into a list of numbers. Not a random list: a list where similar meanings end up as similar numbers.

The classic example: if you train a model on enough text, the vector for king minus the vector for man plus the vector for woman lands closer to the vector for queen than to any other word in the vocabulary. The model has never been told what royalty means or that gender exists - it has worked out the relationship from patterns in billions of sentences.
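The arithmetic can be sketched with toy vectors. The three-dimensional "embeddings" below are invented for illustration (real models learn hundreds of dimensions from text), and the helper names cosine and analogy are our own, not part of any library.

```python
import math

# Toy 3-dimensional "embeddings", made up for this example.
# The dimensions loosely stand for (royalty, maleness, femaleness).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy("king", "man", "woman")
print(result)  # queen
```

As in real analogy tests, the input words are excluded from the candidates; otherwise the nearest vector is often one of the inputs themselves.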

Each word gets somewhere between a few hundred and a few thousand numbers. Together those numbers form a coordinate in a very high-dimensional space. “Similar meaning” just becomes “close together in that space”.

Newer models don't stop at words - they embed full sentences and paragraphs, capturing context. The sentences “I need a plumber in the east” and “emergency pipe repair Singapore” end up near each other even though they share no words.

How similarity becomes a number

Cosine similarity: The most common measure. Think of each embedding as an arrow pointing from the origin. Cosine similarity measures the angle between two arrows - 1.0 means identical direction (same meaning), 0 means unrelated, −1 means opposite.

Dot product: Similar to cosine similarity but also accounts for the magnitude of each vector. Used when the model is trained to make more relevant content produce longer vectors.

Euclidean distance: Straight-line distance between two points in the embedding space. Less common for text but used in image search and some recommendation systems.
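The three measures are short enough to write out by hand. This is a minimal sketch in plain Python with made-up two-dimensional vectors; production systems would use an optimised numerical library instead.

```python
import math

def dot(a, b):
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """Dot product with the lengths divided out, so only the angle matters."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0]
b = [2.0, 4.0]    # same direction as a, twice as long
c = [-1.0, -2.0]  # opposite direction to a
d = [2.0, -1.0]   # at right angles to a

print(cosine_similarity(a, b))   # close to 1.0: same direction
print(cosine_similarity(a, d))   # 0.0: unrelated
print(cosine_similarity(a, c))   # close to -1.0: opposite
print(dot(a, b))                 # 10.0: rewards b's extra length
print(euclidean_distance(a, b))  # sqrt(5): b is "far" despite pointing the same way
```

Note how a and b are a perfect match by cosine similarity yet clearly apart by Euclidean distance. Which measure is right depends on how the embedding model was trained.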

What a vector database actually does

A traditional database stores rows and columns and answers questions like “give me every row where city = Singapore”. It matches exact values.

A vector database stores embeddings and answers a different kind of question: “give me the 10 embeddings closest to this query embedding”. It matches meaning.
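At its simplest, that query is a sort by similarity. The sketch below is an exact, brute-force version over a tiny in-memory store; the document ids and two-dimensional vectors are invented for illustration, and top_k is a hypothetical helper, not the API of any particular vector database.

```python
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical store: document id -> embedding (toy 2-D vectors).
store = {
    "plumber-east": [0.9, 0.1],
    "pipe-repair":  [0.8, 0.3],
    "accounting":   [0.1, 0.9],
    "payroll":      [0.2, 0.8],
}

def top_k(query, store, k=10):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in store.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]

query = [1.0, 0.2]  # stands in for an encoded search query
nearest = top_k(query, store, k=2)
print(nearest)  # ['plumber-east', 'pipe-repair']
```

This compares the query against every stored vector, which is fine for four documents and hopeless for billions.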

The engineering challenge is speed. If you have 50 billion documents (roughly Google's scale), comparing a query vector against every stored vector one by one is far too slow to answer a query interactively. Vector databases use indexing structures - particularly Approximate Nearest Neighbour (ANN) algorithms like HNSW and IVF-PQ - to find the closest matches in milliseconds without a full scan. HNSW navigates a graph of pre-computed neighbours; IVF-PQ groups vectors into clusters, compresses them, and searches only the few clusters nearest the query. Both trade a small amount of accuracy for a large gain in speed.
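To make the "avoid a full scan" idea concrete, here is a minimal sketch of the inverted-file (IVF) half of IVF-PQ: vectors are bucketed by their nearest centroid, and a query scans only the nearest bucket or buckets. The TinyIVF class, the hand-picked centroids and the toy data are all assumptions for illustration. A real index would learn its centroids with k-means and add product quantisation to compress the vectors, and HNSW works differently again, using a graph.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (no square root needed for ranking)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class TinyIVF:
    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]  # one list of (id, vector) per centroid

    def _nearest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(vec, self.centroids[i]))
        return order[:n]

    def add(self, doc_id, vec):
        """Store the vector in the bucket of its closest centroid."""
        bucket = self._nearest_centroids(vec, 1)[0]
        self.buckets[bucket].append((doc_id, vec))

    def search(self, query, k=3, nprobe=1):
        """Scan only the nprobe buckets closest to the query."""
        candidates = []
        for i in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[i])
        candidates.sort(key=lambda item: sq_dist(query, item[1]))
        return [doc_id for doc_id, _ in candidates[:k]]

# Hand-picked centroids; a real index would learn these from the data.
index = TinyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
for doc_id, vec in [("a", [1.0, 1.0]), ("b", [0.0, 2.0]),
                    ("c", [9.0, 9.0]), ("d", [11.0, 10.0])]:
    index.add(doc_id, vec)

print(index.search([1.0, 0.0], k=2))  # ['a', 'b'], found by scanning one bucket only
```

The search is approximate because a true neighbour sitting just across a bucket boundary can be missed. Raising nprobe scans more buckets, which recovers accuracy at the cost of speed.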

Purpose-built vector databases like Pinecone, Weaviate, Qdrant, and Milvus are newer products. But the underlying idea - index by embedding, retrieve by similarity - has been running at Google scale for over a decade.

Google's search index is a vector database

In 2019 Google announced that BERT - a transformer model pre-trained on English Wikipedia and a large corpus of books - was being applied to Search. Google presented it as a step change in understanding the full context of a query, not just the individual words in it.

Behind the scenes, BERT and its successors (MUM and the unnamed models Google runs today) convert both the search query and the content of indexed pages into embeddings. Retrieval is partly a vector similarity problem: find the pages whose embeddings sit closest to the query embedding.

Google calls part of this system neural matching - a ranking signal that uses embeddings to understand when a page is about a topic even if it doesn't contain the exact words from the query. A page about “how to fix a dripping tap” can now rank for “leaking faucet repair” because the two phrases live close together in embedding space, not because someone stuffed one phrase into the other.

Google Research published ScaNN (Scalable Nearest Neighbors) - their own open-source ANN library, optimised for the hardware they run - which shows that approximate vector search is a first-class part of how they operate at scale. It is not an analogy; Google is literally running vector similarity at retrieval time.

The journey from your page to a search result: Googlebot crawls your page → the text is passed through a neural encoder → the output is a dense vector of ~768 or more numbers → that vector is stored in Google's index → at query time, the user's question is encoded by the same model → Google finds the nearest vectors → those pages are candidates for ranking. Traditional signals (links, authority, freshness) then re-rank the candidates. The semantic match gets you into the room; the traditional signals decide your position.
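The two-stage shape of that journey, retrieve by meaning and then re-rank by other signals, can be sketched as follows. Everything here is a toy assumption: the page names, the two-dimensional vectors, and the single "authority" number standing in for links, authority and freshness. It illustrates the flow described above, not Google's actual ranking.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical index: each page has an embedding and one traditional signal.
pages = {
    "dripping-tap-guide": {"vec": [0.9, 0.1], "authority": 0.4},
    "faucet-repair-pro":  {"vec": [0.8, 0.2], "authority": 0.9},
    "tax-software":       {"vec": [0.1, 0.9], "authority": 1.0},
}

def retrieve(query_vec, k):
    """Stage 1: pick the k pages whose embeddings sit closest to the query."""
    ranked = sorted(pages, key=lambda p: cosine(query_vec, pages[p]["vec"]), reverse=True)
    return ranked[:k]

def rerank(candidates):
    """Stage 2: order the candidates by the traditional signal."""
    return sorted(candidates, key=lambda p: pages[p]["authority"], reverse=True)

query_vec = [1.0, 0.1]  # stands in for the encoded query "leaking faucet repair"
candidates = retrieve(query_vec, k=2)
results = rerank(candidates)
print(results)  # ['faucet-repair-pro', 'dripping-tap-guide']
```

The tax-software page has the highest authority but never reaches the re-ranking stage, because its embedding is nowhere near the query.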

What this changes about how you write for search

Keyword density is mostly irrelevant. Repeating a phrase 14 times doesn't shift your embedding towards the query - it just makes the page unpleasant to read. What shifts your embedding is writing thoroughly and clearly about the topic, using the full vocabulary of the domain.

Semantic coverage matters. A page about “accounting software for small business” should naturally mention invoicing, tax, reconciliation, payroll - not because you're targeting those exact phrases, but because a thorough page on the topic will include them and the resulting embedding will sit in the right neighbourhood.

Structured data anchors the embedding. When you add Schema.org JSON-LD declaring your organisation, services, and FAQs, you give Google unambiguous facts that constrain how it interprets your page. The structured data doesn't directly change the embedding, but it reduces ambiguity about which cluster of the index your page belongs to.

AI answer engines are the same system. Perplexity, ChatGPT with search, and Google's AI Overviews all retrieve candidate documents using vector search before summarising them. If your page doesn't land in the nearest-neighbour results for the query embedding, it will never be cited - however good the prose is.

What to do with this in practice

Write for topics, not keywords. Think about the full concept you want to own - what questions surround it, what adjacent ideas belong to it - and cover those honestly. The embedding follows from the content.
Use real vocabulary from the domain. If you write about plumbing, use the words plumbers use. If you write about accounting, use the terms accountants reach for. The semantic cluster you land in is defined by your vocabulary.
Add structured facts. Schema.org markup (Organisation, Service, FAQ, Article) gives the crawler high-confidence signals alongside the prose. It doesn't replace the embedding but it disambiguates it.
Answer the full question, not just the keyword. A query embedding encodes the user's intent - what they want to do, not just the words they typed. A page that satisfies the intent will land closer in vector space than one that matches the surface string.
Earn links from contextually related pages. Inbound links from pages in the same semantic neighbourhood reinforce your position in the index - not just as a PageRank signal, but as confirmation that authoritative pages on the topic point to yours.

Part of our AI SEO series. This article connects to how we approach search and AI-era visibility at Web Wizards. Read the full AI SEO strategy to see how structured data, intent-led content, and vector-aware writing come together in a single engagement.

Want your pages to land in the right neighbourhood?

Vector search is not something you can trick - but you can work with it deliberately. Tell us about your site and the topics you want to own; we'll look at where you sit in the semantic landscape and what it would take to move.
