Capabilities

Everything you need for private document intelligence

A complete, self-hosted pipeline from raw financial PDFs to a governed, conversational knowledge base.

Ingestion & parsing

Advanced OCR

Layout-aware OCR via Qwen3-VL-8B for scanned & mixed-orientation PDFs.

Table understanding

Handles nested, merged-cell, and page-spanning financial tables.

Distributed pages

Page-level batching & resumable parsing for 10k-page files.
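As a rough illustration of page-level batching with resume, the sketch below splits a file into fixed-size page batches and skips pages already recorded as parsed. The function name `page_batches` and the `done` checkpoint set are hypothetical and do not reflect the product's actual internals.

```python
from typing import Iterator, List, Set


def page_batches(total_pages: int, batch_size: int, done: Set[int]) -> Iterator[List[int]]:
    """Yield batches of 1-based page numbers, skipping pages already parsed."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[int] = []
    for page in range(1, total_pages + 1):
        if page in done:
            continue  # already checkpointed in a previous run
        batch.append(page)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch


# A 10-page file where pages 1-4 were parsed before an interruption.
remaining = list(page_batches(total_pages=10, batch_size=3, done={1, 2, 3, 4}))
print(remaining)  # [[5, 6, 7], [8, 9, 10]]
```

In a real deployment the `done` set would presumably be loaded from durable storage, so that a restarted job picks up where the previous run stopped.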

Fault tolerance

Retries failed batches, isolates corrupted pages, and supports partial ingestion.
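One way to picture this behavior: retry a batch as a whole a few times, then fall back to parsing page by page so that a single corrupted page does not block the rest. This is a minimal sketch under that assumption. `ingest_batch`, `fake_parse`, and the retry count are illustrative names and values, not the product's API.

```python
from typing import Callable, Dict, List, Tuple


def ingest_batch(
    pages: List[int],
    parse_page: Callable[[int], str],
    max_retries: int = 2,
) -> Tuple[Dict[int, str], List[int]]:
    """Parse a batch; retry it whole, then fall back to page-by-page isolation."""
    for _ in range(max_retries):
        try:
            return {p: parse_page(p) for p in pages}, []
        except Exception:
            continue  # possibly transient failure: retry the whole batch
    parsed: Dict[int, str] = {}
    corrupted: List[int] = []
    for p in pages:
        try:
            parsed[p] = parse_page(p)
        except Exception:
            corrupted.append(p)  # isolate the bad page, keep the rest
    return parsed, corrupted


def fake_parse(page: int) -> str:
    # Hypothetical parser: page 3 is permanently corrupted.
    if page == 3:
        raise ValueError("unreadable page")
    return f"text of page {page}"


parsed, corrupted = ingest_batch([1, 2, 3, 4], fake_parse)
print(sorted(parsed), corrupted)  # [1, 2, 4] [3]
```

The result is a partial ingestion: three pages are available for retrieval, and the corrupted page is reported separately for later inspection.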

Retrieval & RAG

pgvector storage

Semantic vectors in PostgreSQL with sub-second retrieval.

Streaming RAG

Token streaming with inline source citations.

Scoped retrieval

Single file, selected files, or entire workspace.
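The three scopes above could map onto a single parameterized pgvector query whose filter widens or narrows with the selection. The sketch below only builds the SQL and its parameters. The table and column names (`chunks`, `embedding`, `file_id`, `workspace_id`) and the psycopg-style `%s` placeholders are assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def build_scoped_query(
    workspace_id: int,
    file_ids: Optional[Sequence[int]] = None,
    top_k: int = 5,
) -> Tuple[str, List[object]]:
    """Build a parameterized similarity query for one retrieval scope.

    Schema names are hypothetical. The "<query-embedding>" entry is a
    stand-in for the real query vector supplied at execution time.
    """
    sql = "SELECT id, content FROM chunks WHERE workspace_id = %s"
    params: List[object] = [workspace_id]
    if file_ids:  # single file or selected files; otherwise the whole workspace
        sql += " AND file_id = ANY(%s)"
        params.append(list(file_ids))
    # <=> is pgvector's cosine-distance operator; smaller means more similar.
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    params.extend(["<query-embedding>", top_k])
    return sql, params


sql, params = build_scoped_query(workspace_id=7, file_ids=[11, 12], top_k=3)
print(sql)
print(params)
```

Passing no `file_ids` leaves only the workspace filter in place, which corresponds to the "entire workspace" scope.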

Session memory

Short-term conversational continuity plus semantic retrieval over earlier history.

Governance & ops

Enterprise RBAC

Super-admin, admin, and user roles with strict isolation.
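A three-tier role model with isolation might be checked along the following lines. This is a sketch only: the `Role`, `Principal`, and `can_access` names are hypothetical, and the rule that super-admins span all workspaces is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from enum import IntEnum


class Role(IntEnum):
    USER = 1
    ADMIN = 2
    SUPER_ADMIN = 3


@dataclass(frozen=True)
class Principal:
    name: str
    role: Role
    workspace_id: int


def can_access(principal: Principal, workspace_id: int, required: Role = Role.USER) -> bool:
    """Allow access only with a sufficient role inside the principal's own workspace."""
    if principal.role is Role.SUPER_ADMIN:
        return True  # assumption: super-admins span all workspaces
    return principal.workspace_id == workspace_id and principal.role >= required


alice = Principal("alice", Role.ADMIN, workspace_id=1)
print(can_access(alice, 1, required=Role.ADMIN))  # True
print(can_access(alice, 2))                       # False: different workspace
```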

Realtime audit

Live activity, ingestion, config and queue events.

Workspaces

Isolated workspaces with folder hierarchy & grouping.

On-prem ops

One-command Docker Compose setup, deployable in air-gapped environments.