
RAG (Retrieval-Augmented Generation) and MFO Server

The MFO Server implements RAG (Retrieval-Augmented Generation) through a dedicated rag provider module. Here’s how it works:


🔍 How It Manages RAG

1️⃣ Document & Embeddings Storage:

  • The system has two key PostgreSQL tables:
    • rag.documents: stores metadata and full content of documents (e.g., user_id, collection, tags, content).
    • rag.embeddings: stores individual text chunks along with their embedding vectors (linked to rag.documents via document_id).

2️⃣ Chunking & Embedding:

  • When you add a document:
    • It is chunked (split into parts).
    • Each chunk is sent to an LLM provider, which generates its embedding vector.
    • Both the document metadata and the embeddings are saved (bulk-inserted, with vectors stored via pgvector).
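A minimal chunker illustrates the splitting step. The fixed rune window and overlap here are assumptions, not the server's actual strategy (which may split on sentences or tokens).

```go
package main

import "fmt"

// chunkText splits text into fixed-size rune chunks with a small overlap,
// so text cut at a boundary still appears intact in one chunk.
// The window and overlap sizes are illustrative assumptions.
func chunkText(text string, size, overlap int) []string {
	runes := []rune(text)
	var chunks []string
	step := size - overlap
	if step <= 0 {
		step = size
	}
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkText("How to reset a password in the support portal.", 20, 5) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```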

3️⃣ Search (Vector Similarity):

  • The rag.search tool:
    • Converts the query into an embedding.
    • Runs a vector similarity search using PostgreSQL’s pgvector extension.
    • Returns the best-matching chunks, ranked by cosine similarity.
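A query along these lines could back the search step. This is a sketch only — the actual SQL, join, and operator choice in MFO Server are assumptions; pgvector's `<=>` operator computes cosine distance, so `1 - distance` yields cosine similarity.

```go
package main

import "fmt"

// buildSearchQuery returns a pgvector similarity query scoped to a tenant.
// Table and column names follow the simplified schema in this document;
// the server's real SQL is an assumption.
func buildSearchQuery(limit int) string {
	return fmt.Sprintf(`
SELECT e.chunk,
       1 - (e.embedding <=> $1::vector) AS similarity
FROM rag.embeddings e
JOIN rag.documents d ON d.id = e.document_id
WHERE d.user_id = $2
  AND d.collection = $3
ORDER BY e.embedding <=> $1::vector
LIMIT %d`, limit)
}

func main() {
	fmt.Println(buildSearchQuery(5))
}
```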

4️⃣ Multi-Tenant Design:

  • All RAG operations are scoped by user_id and collection to keep data isolated and organized.

🛠️ Key Tools

| Tool | Description |
| --- | --- |
| rag.add_document | Adds a single document (with chunking & embedding). |
| rag.add_documents | Adds multiple documents in a batch. |
| rag.search | Searches for relevant chunks using vector similarity. |

Example: The search tool parses parameters like query, user_id, collection, and optional tags, and returns results ranked by similarity.
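The parameter handling can be sketched as follows; the Go struct, JSON field names, and validation rules are assumptions based on the parameters listed above, not the tool's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchParams mirrors the parameters the rag.search tool accepts, per the
// description above; the wire format is an assumption.
type SearchParams struct {
	Query      string   `json:"query"`
	UserID     string   `json:"user_id"`
	Collection string   `json:"collection"`
	Tags       []string `json:"tags,omitempty"` // optional filter
}

// parseSearchParams decodes and validates a raw tool-call payload.
func parseSearchParams(raw []byte) (SearchParams, error) {
	var p SearchParams
	if err := json.Unmarshal(raw, &p); err != nil {
		return p, err
	}
	if p.Query == "" || p.UserID == "" || p.Collection == "" {
		return p, fmt.Errorf("query, user_id and collection are required")
	}
	return p, nil
}

func main() {
	p, err := parseSearchParams([]byte(`{"query":"How to reset password?","user_id":"42","collection":"support_docs"}`))
	fmt.Println(p, err)
}
```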


⚙️ LLM & Config

  • LLM Provider: The system uses a pluggable LLM manager to generate embeddings (e.g., OpenAI’s text-embedding-ada-002 with 1536 dimensions).
  • Configuration: The embedding dimension is loaded from environment variables (e.g., RAG_EMBEDDING_DIMENSION).

🔄 Query Flow Example

1️⃣ Input: A user submits a query with user_id = 42, collection = "support_docs", and query = "How to reset password?".

2️⃣ Processing:

  • The query is embedded.
  • A similarity search finds the top 5 chunks in the "support_docs" collection.

3️⃣ Output:

  • Returns chunks with similarity scores, ready to be fed into an LLM for final generation.
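The flow above can be sketched with stub providers. The `Embedder` and `Searcher` interfaces and the stub return values are assumptions for illustration, not the server's actual API.

```go
package main

import "fmt"

// Embedder abstracts the pluggable LLM provider used to embed text;
// the interface shape is an assumption.
type Embedder interface {
	Embed(text string) ([]float32, error)
}

// Searcher abstracts the vector-similarity lookup (backed by pgvector in
// the real server); also an assumed interface.
type Searcher interface {
	TopK(embedding []float32, userID, collection string, k int) ([]string, error)
}

// retrieve implements the flow above: embed the query, then fetch the
// top-k chunks scoped to the caller's user_id and collection.
func retrieve(e Embedder, s Searcher, query, userID, collection string, k int) ([]string, error) {
	vec, err := e.Embed(query)
	if err != nil {
		return nil, err
	}
	return s.TopK(vec, userID, collection, k)
}

// Stubs so the sketch runs without an LLM or a database.
type stubEmbedder struct{}

func (stubEmbedder) Embed(text string) ([]float32, error) {
	return []float32{float32(len(text))}, nil
}

type stubSearcher struct{}

func (stubSearcher) TopK(_ []float32, _, _ string, k int) ([]string, error) {
	return []string{"Open Settings and choose Reset Password."}, nil
}

func main() {
	chunks, _ := retrieve(stubEmbedder{}, stubSearcher{}, "How to reset password?", "42", "support_docs", 5)
	fmt.Println(chunks)
}
```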

🗂️ Data Schema (Simplified)

```mermaid
erDiagram
    DOCUMENTS {
        UUID id PK
        STRING user_id
        STRING collection
        TEXT content
        TEXT[] tags
    }
    EMBEDDINGS {
        UUID id PK
        UUID document_id FK
        TEXT chunk
        VECTOR embedding
    }
    DOCUMENTS ||--o{ EMBEDDINGS : contains
```

✅ Special Features

  • Bulk Inserts: Uses pgx.CopyFrom for efficient embedding storage.
  • Full-Text + Vector Search: Although the system focuses on vector similarity, a content_tsvector column enables optional full-text search.
  • Extensible Tools: New RAG tools can be added via adapter.go to expose additional functionality.
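The pgx.CopyFrom bulk insert can be sketched like this. Only the row-shaping helper runs without a database; the table and column names follow the simplified schema above and are assumptions about the real code.

```go
package main

import "fmt"

// embeddingRow is one chunk ready for insertion into rag.embeddings.
type embeddingRow struct {
	DocumentID string
	Chunk      string
	Embedding  []float32
}

// copyFromRows shapes rows for pgx.CopyFrom, which streams them to
// PostgreSQL over the COPY protocol. The slice order must match the
// column list passed to CopyFrom.
func copyFromRows(rows []embeddingRow) [][]interface{} {
	out := make([][]interface{}, 0, len(rows))
	for _, r := range rows {
		out = append(out, []interface{}{r.DocumentID, r.Chunk, r.Embedding})
	}
	return out
}

// With a live *pgx.Conn the bulk insert would look roughly like:
//
//	conn.CopyFrom(ctx,
//	    pgx.Identifier{"rag", "embeddings"},
//	    []string{"document_id", "chunk", "embedding"},
//	    pgx.CopyFromRows(copyFromRows(rows)))

func main() {
	rows := copyFromRows([]embeddingRow{{DocumentID: "d1", Chunk: "hello", Embedding: []float32{0.1, 0.2}}})
	fmt.Println("rows to copy:", len(rows))
}
```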

In summary, the MFO Server’s RAG system provides a full pipeline from document ingestion to retrieval and augmentation, with tight integration between PostgreSQL (via pgvector), LLMs, and Go-based tooling.