The last post showed how embeddings turn meaning into vectors. But if you have a million of them, how do you find the right ones for a query, fast? That’s the job of a vector database, and it’s the retrieval engine behind most RAG apps. Here’s a live semantic search demo.
🗂️ Search by meaning (not keywords): https://dev48v.infy.uk/ai/days/day14-vector-databases.html
Search becomes “find the nearest vectors”
Embed your query into the same space as your documents (with the same embedding model), then find the document vectors closest to it, usually by cosine similarity. Because closeness in that space tracks similarity in meaning, the query “how do I reset my password” matches a doc about “recovering account access”, even with zero shared keywords. The demo shows this beating a keyword search that returns nothing.
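To make “find the nearest vectors” concrete, here is a minimal sketch of ranking documents by cosine similarity. The 3-number vectors are hand-made stand-ins (an assumption for illustration, not output from a real embedding model), and the names `docs` and `query_vec` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# ASSUMPTION: hand-made 3-d vectors standing in for real embeddings.
docs = {
    "recovering account access": [0.9, 0.1, 0.0],
    "quarterly sales report": [0.0, 0.2, 0.9],
    "office lunch menu": [0.1, 0.9, 0.2],
}
# Pretend embedding of "how do I reset my password".
query_vec = [0.8, 0.2, 0.1]

# Rank document titles from most to least similar to the query.
ranked = sorted(
    docs,
    key=lambda name: cosine_similarity(query_vec, docs[name]),
    reverse=True,
)
best_match = ranked[0]
print(best_match)
```

The query and the top document share no words. They only end up close because their (pretend) vectors point in nearly the same direction.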
Why you need a database, not a for-loop
Comparing your query to every vector (brute-force kNN) is fine for hundreds or a few thousand vectors, but the cost grows linearly with the collection, so it gets too slow at millions. Vector DBs use ANN (approximate nearest neighbour) indexes like HNSW to find nearly all of the closest vectors in milliseconds, trading a tiny bit of accuracy for huge speed.
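Here is a rough sketch of that trade-off. HNSW is graph-based and too long to show here, so this swaps in a much simpler ANN idea: random-hyperplane hashing (a form of locality-sensitive hashing). It is not how HNSW works. It only illustrates searching a small candidate set instead of everything. All function names and the random test data are made up for the example.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def make_planes(dim, n_planes, seed=0):
    """Random hyperplanes through the origin, one per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def bucket_key(vec, planes):
    # One bit per plane: which side of the plane the vector falls on.
    return tuple(dot(vec, p) >= 0 for p in planes)

def build_index(vectors, planes):
    """Group vector ids by bucket so a query only scans one bucket."""
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(bucket_key(v, planes), []).append(i)
    return index

def brute_force_top_k(query, vectors, k, candidates=None):
    """Exact search over all ids, or over a candidate subset if given."""
    ids = range(len(vectors)) if candidates is None else candidates
    # Vectors are unit length, so the dot product equals cosine similarity.
    return sorted(ids, key=lambda i: dot(query, vectors[i]), reverse=True)[:k]

def ann_top_k(query, vectors, index, planes, k):
    """Approximate search: only score vectors in the query's bucket."""
    candidates = index.get(bucket_key(query, planes), [])
    return brute_force_top_k(query, vectors, k, candidates)

# ASSUMPTION: random unit vectors stand in for real document embeddings.
rng = random.Random(42)
vectors = [normalize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(2000)]
planes = make_planes(dim=8, n_planes=6)
index = build_index(vectors, planes)

query = vectors[123]
exact = brute_force_top_k(query, vectors, k=5)        # scores all 2000 vectors
approx = ann_top_k(query, vectors, index, planes, k=5)  # scores one bucket
scanned = len(index[bucket_key(query, planes)])
print(f"exact scanned {len(vectors)}, approximate scanned {scanned}")
print("overlap:", len(set(exact) & set(approx)), "of", len(exact))
```

The approximate search can miss true neighbours that landed in a different bucket, and it may return fewer than k results when the bucket is small. That is the “tiny bit of accuracy” being traded away. Real indexes use smarter structures to keep that loss small.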
What a vector DB actually stores
Vectors + the original text + metadata, behind an ANN index. Ingest: chunk your docs → embed → upsert. Query: embed the question → search top-k → (often) filter by metadata or combine with keyword search (hybrid).
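The ingest and query flow above can be sketched as a tiny in-memory store. Big assumption, stated loudly: `toy_embed` just counts words so the example runs offline, which means it matches keywords, not meaning. A real pipeline would call an embedding model there. `TinyVectorStore`, `chunk`, and the sample docs are all hypothetical, and there is no ANN index here, only a full scan.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    # ASSUMPTION: word counts as a sparse vector, standing in for a real
    # embedding model. This matches words, not meaning.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def chunk(text, max_words=8):
    """Split text into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

class TinyVectorStore:
    def __init__(self):
        self.rows = {}  # id -> (vector, original text, metadata)

    def upsert(self, row_id, text, metadata):
        # Insert a new row, or overwrite the row with the same id.
        self.rows[row_id] = (toy_embed(text), text, metadata)

    def query(self, question, top_k=2, where=None):
        q = toy_embed(question)
        hits = [
            (cosine(q, vec), row_id, text)
            for row_id, (vec, text, meta) in self.rows.items()
            # Metadata filter: keep rows whose metadata matches every key in `where`.
            if where is None or all(meta.get(k) == v for k, v in where.items())
        ]
        return sorted(hits, reverse=True)[:top_k]

store = TinyVectorStore()
docs = {
    "help": "Reset your password from the login page. Contact support if the email never arrives.",
    "blog": "Our password manager launch was a success. Thanks to everyone who joined the beta.",
}
# Ingest: chunk -> embed -> upsert.
for source, text in docs.items():
    for n, piece in enumerate(chunk(text)):
        store.upsert(f"{source}-{n}", piece, {"source": source})

# Query: embed the question -> top-k -> filter by metadata.
results = store.query("reset password", top_k=2, where={"source": "help"})
for score, row_id, text in results:
    print(round(score, 2), row_id, text)
```

Storing the original text next to each vector is what lets the query return readable chunks, and the metadata is what makes the `where` filter possible.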
This is the retrieval half of RAG. Real options: Pinecone, Weaviate, Chroma, pgvector (a Postgres extension), FAISS (a similarity search library rather than a full database).
🔨 Build it (embed → upsert → similarity search → top-k → RAG) on the page: https://dev48v.infy.uk/ai/days/day14-vector-databases.html
Part of AIFromZero. 🌐 https://dev48v.infy.uk