Vector Database for AI Agents
A custom corporate retrieval layer that uses cosine similarity over high-dimensional embeddings so internal AI agents can pull contextually relevant company data in milliseconds.
What we set out to solve.
Internal AI agents at a venture firm were producing generic answers because they had no fast, structured access to portfolio data, prior diligence notes, and founder communications. SQL keyword search was too brittle, full-text search missed semantic matches, and shipping every query through a foundation model with the entire corpus in context was cost-prohibitive.
How we built it.
Embedding pipeline
Documents are chunked at semantic boundaries, embedded with a 1536-dimensional model, and stored in Postgres with pgvector indexes. Backfills and incremental updates run as Supabase Edge Functions triggered by row changes.
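As a rough illustration of the chunking step, the sketch below treats blank lines as semantic boundaries and packs whole paragraphs into size-limited chunks. The names `chunkDocument` and `toVectorLiteral`, the 1200-character limit, and the paragraph heuristic are all assumptions for illustration, not the production pipeline. The bracketed text literal is the format pgvector accepts for vector input.

```typescript
// Hypothetical sketch: names, limits, and the paragraph heuristic are assumptions.
interface Chunk {
  documentId: string;
  index: number;
  text: string;
}

// Treat blank lines as semantic boundaries and pack whole paragraphs
// into chunks of at most maxChars characters. A single paragraph longer
// than maxChars is kept intact as its own oversized chunk.
function chunkDocument(documentId: string, body: string, maxChars = 1200): Chunk[] {
  const paragraphs = body
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: Chunk[] = [];
  let current = "";
  const flush = (): void => {
    if (current.length > 0) {
      chunks.push({ documentId, index: chunks.length, text: current });
      current = "";
    }
  };

  for (const p of paragraphs) {
    // Start a new chunk rather than splitting a paragraph.
    if (current.length > 0 && current.length + 2 + p.length > maxChars) {
      flush();
    }
    current = current.length > 0 ? `${current}\n\n${p}` : p;
  }
  flush();
  return chunks;
}

// Serialize an embedding as a pgvector text literal, e.g. "[0.1,0.2,0.3]".
// The dimension check guards against mixing models with different sizes.
function toVectorLiteral(embedding: number[], dims = 1536): string {
  if (embedding.length !== dims) {
    throw new Error(`expected ${dims} dimensions, got ${embedding.length}`);
  }
  return `[${embedding.join(",")}]`;
}
```

Keeping paragraphs whole trades uniform chunk size for coherent passages, which tends to matter more for retrieval quality than exact length.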
Retrieval API
A thin TypeScript service exposes a similarity endpoint that runs cosine search with metadata filters (tenant, document type, date range). Only the top-K candidates are then reranked with a small cross-encoder, which keeps the more expensive model off the bulk of the corpus.
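The filter, search, rerank flow can be sketched in memory as follows. This is a stand-in, not the service itself. In Postgres the similarity step would typically use pgvector's `<=>` cosine-distance operator rather than a loop. The types `StoredChunk` and `SearchFilter`, the function `search`, and the `rerank` callback (standing in for the cross-encoder) are hypothetical names.

```typescript
// Hypothetical in-memory sketch of filter -> cosine search -> top-K rerank.
interface StoredChunk {
  id: string;
  tenant: string;
  docType: string;
  createdAt: Date;
  text: string;
  embedding: number[];
}

interface SearchFilter {
  tenant: string;
  docType?: string;
  from?: Date;
  to?: Date;
}

// Stand-in for the cross-encoder: higher score means more relevant.
type RerankFn = (query: string, text: string) => number;

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function matchesFilter(c: StoredChunk, f: SearchFilter): boolean {
  if (c.tenant !== f.tenant) return false;
  if (f.docType !== undefined && c.docType !== f.docType) return false;
  if (f.from !== undefined && c.createdAt.getTime() < f.from.getTime()) return false;
  if (f.to !== undefined && c.createdAt.getTime() > f.to.getTime()) return false;
  return true;
}

function search(
  queryText: string,
  queryEmbedding: number[],
  corpus: StoredChunk[],
  filter: SearchFilter,
  topK: number,
  rerank: RerankFn,
): { id: string; text: string; similarity: number; rerankScore: number }[] {
  // Stage 1: cheap vector similarity over everything that passes the filters.
  const candidates = corpus
    .filter((c) => matchesFilter(c, filter))
    .map((c) => ({
      id: c.id,
      text: c.text,
      similarity: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);

  // Stage 2: the expensive reranker scores only the top-K survivors.
  return candidates
    .map((c) => ({ ...c, rerankScore: rerank(queryText, c.text) }))
    .sort((x, y) => y.rerankScore - x.rerankScore);
}
```

Applying the tenant filter before scoring means a query never computes similarity against another tenant's chunks, so isolation does not depend on the ranking.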
Agent integration
Agents receive a typed `retrieve()` tool. A p95 latency budget of under 120 ms keeps interactive flows snappy, and results include source citations so downstream answers stay grounded.
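One way a typed tool contract with citations and latency tracking might look is sketched below. The field names, the `instrument` wrapper, the nearest-rank `p95` helper, and the prompt format are assumptions for illustration. Only the `retrieve()` name, the citations, and the 120 ms p95 budget come from the description above.

```typescript
// Hypothetical sketch of the tool contract; all field names are assumptions.
interface RetrieveInput {
  query: string;
  tenant: string;
  docType?: string;
  topK?: number;
}

interface Citation {
  documentId: string;
  title: string;
  chunkIndex: number;
}

interface RetrievedPassage {
  text: string;
  score: number;
  citation: Citation;
}

type RetrieveTool = (input: RetrieveInput) => Promise<RetrievedPassage[]>;

// Wrap a tool so every call records its wall-clock latency in milliseconds.
function instrument(tool: RetrieveTool, samplesMs: number[]): RetrieveTool {
  return async (input: RetrieveInput) => {
    const start = Date.now();
    try {
      return await tool(input);
    } finally {
      samplesMs.push(Date.now() - start);
    }
  };
}

// Nearest-rank 95th percentile over recorded latencies.
function p95(samplesMs: number[]): number {
  if (samplesMs.length === 0) throw new Error("no samples recorded");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return sorted[rank - 1];
}

// Compare the observed p95 against the budget (120 ms in the text above).
function withinBudget(samplesMs: number[], budgetMs = 120): boolean {
  return p95(samplesMs) < budgetMs;
}

// Render passages with numbered source markers the agent can cite.
function formatForPrompt(passages: RetrievedPassage[]): string {
  return passages
    .map(
      (p, i) =>
        `[${i + 1}] ${p.text} (source: ${p.citation.title}, chunk ${p.citation.chunkIndex})`,
    )
    .join("\n");
}
```

Carrying the citation in each passage lets the agent quote a numbered source instead of paraphrasing from memory, which is what keeps answers traceable to a document.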
The numbers.
What it changed.
The retrieval layer became the default data surface for every internal agent. Manual report-pulling time dropped from hours to seconds, and the firm now ships new agent capabilities by registering documents rather than by hand-writing prompts.