Ramakrishna
AI Project

pgvector RAG Service

2026 · Python · pgvector · FastAPI

Overview

A retrieval-augmented generation (RAG) service over a document corpus, exposed as a streaming API. Answers are grounded in retrieved passages, and an eval suite gates every deploy.

Approach

  • Embeddings are stored in Postgres, and similarity search runs there via pgvector.
  • Responses stream, so the client sees tokens as they are generated.
  • An eval suite gates every deploy on answer quality and grounding.
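
The three points above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the service's actual code: the `chunks` table, its column names, the `[DONE]` sentinel, and the metric names passed to `passes_gate` are all hypothetical. The only pgvector-specific detail relied on is the `<=>` operator, which computes cosine distance.

```python
import math
from typing import Iterable, Iterator, List, Mapping, Sequence, Tuple

# Hypothetical schema: chunks(id, content, embedding vector(N)).
# `<=>` is pgvector's cosine-distance operator; %(q)s and %(k)s are
# driver-style named placeholders for the query embedding and the row limit.
TOP_K_SQL = """
SELECT id, content, embedding <=> %(q)s AS distance
FROM chunks
ORDER BY embedding <=> %(q)s
LIMIT %(k)s
"""

Row = Tuple[int, str, Sequence[float]]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance (1 - cosine similarity), the quantity `<=>` computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query: Sequence[float], rows: Iterable[Row], k: int) -> List[Row]:
    """In-memory stand-in for TOP_K_SQL: nearest rows first, at most k of them."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[2]))
    return ranked[:k]


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each token as a server-sent-event frame as soon as it arrives.

    Assumes tokens contain no newlines; a token with a newline would need
    to be split across several `data:` lines.
    """
    for token in tokens:
        yield f"data: {token}\n\n"
    # Hypothetical end-of-stream sentinel so the client knows when to stop.
    yield "data: [DONE]\n\n"


def passes_gate(scores: Mapping[str, float], minimums: Mapping[str, float]) -> bool:
    """Allow a deploy only if every gated metric meets its minimum.

    A metric that is missing from `scores` counts as a failure.
    """
    return all(
        scores.get(name, float("-inf")) >= floor
        for name, floor in minimums.items()
    )
```

In a FastAPI handler, a generator like `stream_tokens` could be wrapped in `StreamingResponse(..., media_type="text/event-stream")`, and a check like `passes_gate` could run in CI so that a failing eval blocks the deploy step.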

Outcome

Placeholder until real numbers are published: retrieval quality at [X], median time to first token under [Y] ms.
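
For reference, the two figures above could be computed along these lines. This is a sketch under assumptions: the page does not say which retrieval-quality metric is used, so recall@k stands in here as one common choice, and both function names are hypothetical.

```python
from statistics import median
from typing import Sequence, Set


def recall_at_k(retrieved_ids: Sequence[int], relevant_ids: Set[int], k: int) -> float:
    """Fraction of the relevant passages that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def median_first_token_ms(latencies_ms: Sequence[float]) -> float:
    """Median delay, in ms, between receiving a request and the first streamed token."""
    return median(latencies_ms)
```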

Python · pgvector · FastAPI