Gnosis Document Intelligence — Rodjemel Faburada

The problem

Answers from LLMs over long documents are hard to trust: citations are often vague, hallucinated, or point at the wrong section. For document Q&A to be useful in serious work, every claim needs to be verifiable at a glance.

The approach

Asynchronous ingestion pipeline — documents are parsed, chunked and embedded by BullMQ workers, so uploads never block the UI.
Page-aware retrieval — chunks carry page metadata into Pinecone, so retrieval maps every passage back to its exact source page.
Citation enforcement — the model can only answer from retrieved passages, and each answer is post-validated against its sources.
Real-time streaming — responses stream over SSE with citations resolving live as the answer is generated.
Multi-tenant isolation — per-tenant namespaces keep one customer’s documents invisible to another.
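
As a rough illustration of how page-aware retrieval, citation enforcement and tenant scoping can fit together, the sketch below post-validates an answer against the passages retrieved for one tenant. It is not the project's actual code: the Passage shape, the validateAnswer function and the inline [[passage-id]] citation marker are all assumptions made for this example, and the sentence splitting is deliberately naive.

```typescript
// Illustrative sketch only: every type and name here is an assumption,
// not taken from the project's codebase.

interface Passage {
  id: string;        // hypothetical chunk identifier (no '.', '!' or '?' in it)
  tenantId: string;  // tenant (namespace) that owns the source document
  page: number;      // source page carried as chunk metadata
  text: string;
}

interface Citation {
  passageId: string;
  page: number;
}

interface ValidationResult {
  valid: boolean;
  citations: Citation[];       // citations resolved to a retrieved passage and its page
  unknownIds: string[];        // cited ids that were not retrieved for this tenant
  uncitedSentences: string[];  // sentences that carry no citation at all
}

// Assumed convention: the model marks sources inline as [[passage-id]],
// placed before the sentence's closing punctuation.
function extractCitationIds(sentence: string): string[] {
  const pattern = /\[\[([^\]]+)\]\]/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sentence)) !== null) {
    ids.push(match[1]);
  }
  return ids;
}

function validateAnswer(
  answer: string,
  retrieved: Passage[],
  tenantId: string
): ValidationResult {
  // Only passages belonging to the caller's tenant are eligible as sources.
  const allowed = retrieved.filter((p) => p.tenantId === tenantId);

  // Naive sentence split on '.', '!' and '?'; good enough for a sketch.
  const sentences = (answer.match(/[^.!?]+[.!?]*/g) || [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const citations: Citation[] = [];
  const unknownIds: string[] = [];
  const uncitedSentences: string[] = [];

  for (const sentence of sentences) {
    const ids = extractCitationIds(sentence);
    if (ids.length === 0) {
      uncitedSentences.push(sentence);
      continue;
    }
    for (const id of ids) {
      const hit = allowed.filter((p) => p.id === id)[0];
      if (hit) {
        citations.push({ passageId: hit.id, page: hit.page });
      } else {
        unknownIds.push(id);
      }
    }
  }

  return {
    valid: unknownIds.length === 0 && uncitedSentences.length === 0,
    citations,
    unknownIds,
    uncitedSentences,
  };
}
```

In this sketch an answer passes only if every sentence cites at least one passage that was retrieved for the requesting tenant. A citation to another tenant's passage is treated the same as a hallucinated id, and each accepted citation carries the page number needed to point the reader at the source.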

The outcome

A production-grade RAG application that treats verifiability as a first-class feature rather than an afterthought — built, deployed and operated end-to-end as a solo project.