RAG (Retrieval-Augmented Generation) and MFO Server
The MFO Server implements RAG (Retrieval-Augmented Generation) through a dedicated `rag` provider module. Here’s how it works:
🔍 How It Manages RAG
1️⃣ Document & Embeddings Storage:
- The system has two key PostgreSQL tables:
  - `rag.documents`: stores metadata and full content of documents (e.g., `user_id`, `collection`, `tags`, `content`).
  - `rag.embeddings`: stores individual text chunks along with their embedding vectors (linked via `document_id`).
2️⃣ Chunking & Embedding:
- When you add a document:
  - It is chunked (split into parts).
  - Each chunk is processed by an LLM provider to generate embeddings.
  - Both document metadata and embeddings are saved; the embeddings are bulk-inserted into a `pgvector` column.
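The chunking step can be sketched as follows. This is a minimal illustration, assuming a word-based splitter with a fixed overlap; the function name `chunkWords` and its `size` and `overlap` parameters are hypothetical, and the server's actual chunking strategy may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkWords splits text into chunks of at most size words, repeating
// overlap words between consecutive chunks. Both parameters are
// illustrative assumptions, not the MFO Server's real settings.
func chunkWords(text string, size, overlap int) []string {
	words := strings.Fields(text)
	if size <= 0 || len(words) == 0 {
		return nil
	}
	// An invalid overlap would stall or reverse the window, so drop it.
	if overlap < 0 || overlap >= size {
		overlap = 0
	}
	step := size - overlap
	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // the last window already reached the end of the text
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkWords("one two three four five six seven", 3, 1) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

Overlapping windows are a common choice because a sentence cut at a chunk boundary still appears whole in one of the neighbouring chunks. Each resulting chunk would then be sent to the embedding provider.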
3️⃣ Search (Vector Similarity):
- The `rag.search` tool:
  - Converts the query into an embedding.
  - Runs a vector similarity search using PostgreSQL’s `pgvector` extension.
  - Returns the best-matching chunks based on cosine similarity.
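To make the ranking concrete, here is a small in-memory sketch of cosine-similarity search. In the server this work is done inside PostgreSQL by `pgvector`; the types and function names below (`scoredChunk`, `cosineSimilarity`, `topK`) are hypothetical and only illustrate the scoring and ordering.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// scoredChunk pairs a chunk of text with its similarity to the query.
type scoredChunk struct {
	Text  string
	Score float64
}

// cosineSimilarity returns dot(a, b) / (|a| * |b|), or 0 when the
// vectors differ in length, are empty, or have zero magnitude.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// topK scores every chunk against the query and returns the k best,
// highest similarity first.
func topK(query []float64, texts []string, vectors [][]float64, k int) []scoredChunk {
	n := len(texts)
	if len(vectors) < n {
		n = len(vectors)
	}
	scored := make([]scoredChunk, 0, n)
	for i := 0; i < n; i++ {
		scored = append(scored, scoredChunk{
			Text:  texts[i],
			Score: cosineSimilarity(query, vectors[i]),
		})
	}
	sort.SliceStable(scored, func(i, j int) bool {
		return scored[i].Score > scored[j].Score
	})
	if k < 0 {
		k = 0
	}
	if k < len(scored) {
		scored = scored[:k]
	}
	return scored
}

func main() {
	texts := []string{"a", "b", "c"}
	vectors := [][]float64{{1, 0}, {0, 1}, {1, 1}}
	for _, r := range topK([]float64{1, 0}, texts, vectors, 2) {
		fmt.Printf("%s %.3f\n", r.Text, r.Score)
	}
}
```

Note that `pgvector`'s `<=>` operator returns cosine distance, so a similarity score is typically derived as one minus that distance.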
4️⃣ Multi-Tenant Design:
- All RAG operations are scoped by `user_id` and `collection` to keep data isolated and organized.
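One way such scoping could look in a query is sketched below. The SQL is an assumption built from the table and column names in this document, not the server's actual statement; `buildSearchQuery` is a hypothetical helper, the tag filter uses PostgreSQL's array-overlap operator `&&` as one possible interpretation, and the query vector is passed as a `pgvector` text literal such as `[0.1,0.2]`.

```go
package main

import (
	"fmt"
	"strings"
)

// buildSearchQuery returns a parameterised SQL statement and its arguments.
// Every search is restricted to one user_id and one collection, so rows
// belonging to other tenants can never be returned. Hypothetical sketch.
func buildSearchQuery(vecLiteral, userID, collection string, tags []string, limit int) (string, []any) {
	args := []any{vecLiteral, userID, collection}
	var b strings.Builder
	b.WriteString("SELECT e.chunk, 1 - (e.embedding <=> $1::vector) AS similarity\n")
	b.WriteString("FROM rag.embeddings e\n")
	b.WriteString("JOIN rag.documents d ON d.id = e.document_id\n")
	b.WriteString("WHERE d.user_id = $2 AND d.collection = $3\n")
	if len(tags) > 0 {
		// Assumed semantics: match documents sharing at least one tag.
		args = append(args, tags)
		fmt.Fprintf(&b, "  AND d.tags && $%d\n", len(args))
	}
	args = append(args, limit)
	// Ordering by cosine distance puts the most similar chunks first.
	fmt.Fprintf(&b, "ORDER BY e.embedding <=> $1::vector\nLIMIT $%d", len(args))
	return b.String(), args
}

func main() {
	q, args := buildSearchQuery("[0.1,0.2]", "42", "support_docs", []string{"faq"}, 5)
	fmt.Println(q)
	fmt.Println(len(args), "arguments")
}
```

Keeping the tenant filter inside the one function that builds the query makes it hard to issue an unscoped search by accident.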
🛠️ Key Tools
Tool | Description |
---|---|
rag.add_document | Adds a single document (with chunking & embedding). |
rag.add_documents | Adds multiple documents in batch. |
rag.search | Searches for relevant chunks using vector similarity. |
Example: The `search` tool parses parameters like `query`, `user_id`, `collection`, and optional `tags`, and returns results ranked by similarity.
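A parameter parser along these lines might look like the sketch below. It assumes the tool receives its arguments as JSON and that `user_id` is a string, as in the schema diagram; the `searchParams` type, the `parseSearchParams` function, and the validation rules are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// searchParams mirrors the parameters named in this document.
// The JSON wire format is an assumption.
type searchParams struct {
	Query      string   `json:"query"`
	UserID     string   `json:"user_id"`
	Collection string   `json:"collection"`
	Tags       []string `json:"tags,omitempty"` // optional
}

// parseSearchParams decodes the raw arguments and rejects requests that
// lack any of the three required fields.
func parseSearchParams(raw []byte) (searchParams, error) {
	var p searchParams
	if err := json.Unmarshal(raw, &p); err != nil {
		return p, fmt.Errorf("decode search params: %w", err)
	}
	switch {
	case strings.TrimSpace(p.Query) == "":
		return p, errors.New("query is required")
	case p.UserID == "":
		return p, errors.New("user_id is required")
	case p.Collection == "":
		return p, errors.New("collection is required")
	}
	return p, nil
}

func main() {
	raw := []byte(`{"query":"How to reset password?","user_id":"42","collection":"support_docs"}`)
	p, err := parseSearchParams(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```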
⚙️ LLM & Config
- LLM Provider: The system uses a pluggable LLM manager to generate embeddings (e.g., OpenAI’s `text-embedding-ada-002` with 1536 dimensions).
- Configuration: The embedding dimension is loaded from environment variables (e.g., `RAG_EMBEDDING_DIMENSION`).
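Reading the dimension from the environment could be done roughly as follows. The fallback to 1536 when the variable is unset is an assumption chosen to match the `text-embedding-ada-002` example, and `parseEmbeddingDimension` is a hypothetical helper rather than the server's own function.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultEmbeddingDimension is an assumed fallback, matching the
// 1536-dimensional model mentioned in this document.
const defaultEmbeddingDimension = 1536

// parseEmbeddingDimension validates the raw environment value.
// An empty value selects the default; anything else must be a
// positive integer.
func parseEmbeddingDimension(raw string) (int, error) {
	if raw == "" {
		return defaultEmbeddingDimension, nil
	}
	dim, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("RAG_EMBEDDING_DIMENSION: %q is not an integer: %w", raw, err)
	}
	if dim <= 0 {
		return 0, fmt.Errorf("RAG_EMBEDDING_DIMENSION: must be positive, got %d", dim)
	}
	return dim, nil
}

func main() {
	dim, err := parseEmbeddingDimension(os.Getenv("RAG_EMBEDDING_DIMENSION"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("embedding dimension:", dim)
}
```

Failing fast on a bad value matters here because the dimension must agree with the width of the vector column; a mismatch would otherwise only surface when the first embedding is inserted.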
🔄 Query Flow Example
1️⃣ Input: A user submits a query with `user_id = 42`, `collection = "support_docs"`, and `query = "How to reset password?"`.
2️⃣ Processing:
- The query is embedded.
- A similarity search finds the top 5 chunks in `"support_docs"`.
3️⃣ Output:
- Returns chunks with similarity scores, ready to be fed into an LLM for final generation.
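The final step, feeding retrieved chunks to an LLM, might be sketched like this. The prompt wording, the `retrievedChunk` type, and the `buildPrompt` function are all hypothetical, and the chunk text in `main` is made-up sample data rather than real output.

```go
package main

import (
	"fmt"
	"strings"
)

// retrievedChunk is one search result: the chunk text and its score.
type retrievedChunk struct {
	Text       string
	Similarity float64
}

// buildPrompt places the retrieved chunks ahead of the user's question so
// the model can ground its answer in them. The template is illustrative.
func buildPrompt(question string, chunks []retrievedChunk) string {
	var b strings.Builder
	b.WriteString("Answer the question using only the context below.\n\nContext:\n")
	for i, c := range chunks {
		fmt.Fprintf(&b, "[%d] (similarity %.2f) %s\n", i+1, c.Similarity, c.Text)
	}
	fmt.Fprintf(&b, "\nQuestion: %s\n", question)
	return b.String()
}

func main() {
	chunks := []retrievedChunk{
		{Text: "Open Settings and choose Reset password.", Similarity: 0.91},
	}
	fmt.Print(buildPrompt("How to reset password?", chunks))
}
```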
🗂️ Data Schema (Simplified)
erDiagram
    DOCUMENTS {
        UUID id PK
        STRING user_id
        STRING collection
        TEXT content
        TEXT[] tags
    }
    EMBEDDINGS {
        UUID id PK
        UUID document_id FK
        TEXT chunk
        VECTOR embedding
    }
    DOCUMENTS ||--o{ EMBEDDINGS : contains
✅ Special Features
- Bulk Inserts: Uses `pgx.CopyFrom` for efficient embedding storage.
- Full-Text + Vector Search: Although focused on vector similarity, a `content_tsvector` column enables optional full-text search.
- Extensible Tools: Easy to add tools via `adapter.go` to expose new RAG functionalities.
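For the bulk-insert feature, the sketch below shows a row source with the same three methods as pgx's `CopyFromSource` interface (`Next`, `Values`, `Err`). It deliberately avoids importing pgx so it runs on its own; the `embeddingRow` and `embeddingSource` types are hypothetical, the column list in the comment is inferred from the schema diagram, and encoding the vector column would additionally need a type that pgx knows how to send.

```go
package main

import (
	"errors"
	"fmt"
)

// embeddingRow is one row destined for rag.embeddings (assumed columns).
type embeddingRow struct {
	DocumentID string
	Chunk      string
	Embedding  []float32
}

// embeddingSource walks a slice of rows. Its method set matches
// pgx.CopyFromSource, so with pgx imported it could be passed to:
//
//	conn.CopyFrom(ctx, pgx.Identifier{"rag", "embeddings"},
//	    []string{"document_id", "chunk", "embedding"}, src)
type embeddingSource struct {
	rows []embeddingRow
	idx  int
}

func newEmbeddingSource(rows []embeddingRow) *embeddingSource {
	return &embeddingSource{rows: rows, idx: -1}
}

// Next advances to the next row and reports whether one exists.
func (s *embeddingSource) Next() bool {
	s.idx++
	return s.idx < len(s.rows)
}

// Values returns the current row's column values in column order.
func (s *embeddingSource) Values() ([]any, error) {
	if s.idx < 0 || s.idx >= len(s.rows) {
		return nil, errors.New("Values called without a current row")
	}
	r := s.rows[s.idx]
	return []any{r.DocumentID, r.Chunk, r.Embedding}, nil
}

// Err reports any error hit while iterating; an in-memory slice has none.
func (s *embeddingSource) Err() error { return nil }

func main() {
	src := newEmbeddingSource([]embeddingRow{
		{DocumentID: "doc-1", Chunk: "first chunk", Embedding: []float32{0.1, 0.2}},
		{DocumentID: "doc-1", Chunk: "second chunk", Embedding: []float32{0.3, 0.4}},
	})
	for src.Next() {
		vals, err := src.Values()
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(vals...)
	}
}
```

Streaming rows through the COPY protocol avoids one round trip per chunk, which is where most of the cost of inserting many embeddings one statement at a time tends to go.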
In summary, the MFO Server’s RAG system provides a full pipeline from document ingestion to retrieval and augmentation, with tight integration between PostgreSQL (via `pgvector`), LLMs, and Go-based tooling.