End-to-end processing, without compromise.
From document ingestion to AI-powered answers, every step is designed for security, compliance, and transparency.
1,254 automated tests. 27 independent modules. 25 abstraction interfaces. 0 proprietary dependencies in the certified core.
- 1
Document ingestion
Batch or single-file processing. PDF, Excel, PPTX, HTML, Markdown with automatic metadata extraction.
- 2
Personal data protection
Automatic detection and masking of sensitive data — names, phone numbers, bank accounts, emails — before any AI processing.
- 3
Data chunking
Recursive, hierarchical chunking with parent-child relationships for precise, context-aware retrieval.
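One way to picture recursive chunking with parent-child links is the minimal sketch below. The `Chunk` type, the `chunk_recursive` function, the separator list, and the length threshold are all assumptions made for illustration, not the product's actual API.

```rust
/// A chunk of text. `parent` is `None` for top-level chunks.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub id: usize,
    pub parent: Option<usize>,
    pub text: String,
}

/// Split `text` with the separators in order, coarsest first.
/// A piece is split further only when it is longer than `max_len`
/// characters and a finer separator is still available.
pub fn chunk_recursive(text: &str, separators: &[&str], max_len: usize) -> Vec<Chunk> {
    let mut out = Vec::new();
    split_into(text, None, separators, max_len, &mut out);
    out
}

fn split_into(
    text: &str,
    parent: Option<usize>,
    separators: &[&str],
    max_len: usize,
    out: &mut Vec<Chunk>,
) {
    // No separators left: stop recursing.
    let Some((sep, finer)) = separators.split_first() else {
        return;
    };
    for piece in text.split(*sep).map(str::trim).filter(|p| !p.is_empty()) {
        let id = out.len();
        out.push(Chunk { id, parent, text: piece.to_string() });
        if piece.chars().count() > max_len {
            // Children record the id of the chunk they were cut from.
            split_into(piece, Some(id), finer, max_len, out);
        }
    }
}
```

With separators such as `["\n\n", ". "]`, paragraphs become parent chunks and their sentences become children, so a retriever can match on a small child and return the surrounding parent for context.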
- 4
Semantic enrichment
Vector indexing and data contextualization for search that understands meaning, not just keywords.
- 5
Knowledge graph extraction
Automatic entity and relationship extraction for complex queries across your entire document corpus.
- 6
SHA-256 cryptographic trail
Every pipeline operation is recorded with a SHA-256 hash, and the full trail can be exported for compliance reviews.
- 7
Full auditability
Every ingestion, query, and AI response is logged and traceable. Ready for internal and regulatory audits.
- 8
Ferrocene compiler certification
The core builds with Ferrocene, the Rust compiler toolchain qualified for safety-critical use under IEC 62304 and ISO 26262.
- 9
Hexagonal Architecture
25 abstract port traits decouple the business domain from all external dependencies. Adapters depend on the core, never the reverse.
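The port-and-adapter split can be sketched as follows. The names `DocumentParser`, `ParseError`, `ingest`, and `Utf8TextParser` are hypothetical and chosen only to show the direction of the dependency; they are not taken from the actual crates.

```rust
use std::fmt;

/// Error type owned by the core. (Hypothetical.)
#[derive(Debug, PartialEq)]
pub struct ParseError(pub String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error: {}", self.0)
    }
}

impl std::error::Error for ParseError {}

/// Port: a trait defined by the core. The core knows nothing about
/// who implements it.
pub trait DocumentParser {
    fn parse(&self, raw: &[u8]) -> Result<String, ParseError>;
}

/// Core logic: depends only on the port, here counting the words
/// of whatever text the parser returns.
pub fn ingest(parser: &dyn DocumentParser, raw: &[u8]) -> Result<usize, ParseError> {
    let text = parser.parse(raw)?;
    Ok(text.split_whitespace().count())
}

/// Adapter: lives outside the core and implements the port.
pub struct Utf8TextParser;

impl DocumentParser for Utf8TextParser {
    fn parse(&self, raw: &[u8]) -> Result<String, ParseError> {
        std::str::from_utf8(raw)
            .map(str::to_owned)
            .map_err(|e| ParseError(e.to_string()))
    }
}
```

Swapping in a different parser means writing another `impl DocumentParser`; `ingest` does not change.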
- 10
Native Rust, Zero Third-Party Runtime
The pipeline compiles to a single binary with no Python interpreter and no JVM. Minimum supported Rust version (MSRV): 1.75, on the stable toolchain.
- 11
Typed Blackboard Pipeline
Strongly-typed PipelineContext — no HashMap. Dependencies between stages are validated at assembly time, not at runtime.
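A rough sketch of what assembly-time validation could look like is shown below. Only the name `PipelineContext` comes from the text; `Slot`, `Stage`, `Pipeline::assemble`, `AssemblyError`, and the two demo stages are assumptions, and the real design may differ (for example, by pushing the check into the type system).

```rust
/// Data slots a stage can read or write. (Illustrative names.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Slot {
    RawText,
    Chunks,
    Embeddings,
}

/// Typed blackboard: one typed field per slot instead of a string-keyed map.
#[derive(Debug, Default)]
pub struct PipelineContext {
    pub raw_text: Option<String>,
    pub chunks: Option<Vec<String>>,
    pub embeddings: Option<Vec<Vec<f32>>>,
}

pub trait Stage {
    fn name(&self) -> &'static str;
    fn requires(&self) -> &'static [Slot];
    fn provides(&self) -> &'static [Slot];
    fn run(&self, ctx: &mut PipelineContext);
}

#[derive(Debug, PartialEq)]
pub struct AssemblyError {
    pub stage: &'static str,
    pub missing: Slot,
}

pub struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    /// Checks, before anything runs, that every stage's inputs are either
    /// present initially or provided by an earlier stage.
    pub fn assemble(initial: &[Slot], stages: Vec<Box<dyn Stage>>) -> Result<Self, AssemblyError> {
        let mut available: Vec<Slot> = initial.to_vec();
        for stage in &stages {
            for slot in stage.requires() {
                if !available.contains(slot) {
                    return Err(AssemblyError { stage: stage.name(), missing: *slot });
                }
            }
            available.extend_from_slice(stage.provides());
        }
        Ok(Self { stages })
    }

    pub fn run(&self, ctx: &mut PipelineContext) {
        for stage in &self.stages {
            stage.run(ctx);
        }
    }
}

/// Demo stage: splits raw text into sentence-like chunks.
pub struct ChunkStage;

impl Stage for ChunkStage {
    fn name(&self) -> &'static str { "chunk" }
    fn requires(&self) -> &'static [Slot] { &[Slot::RawText] }
    fn provides(&self) -> &'static [Slot] { &[Slot::Chunks] }
    fn run(&self, ctx: &mut PipelineContext) {
        if let Some(text) = &ctx.raw_text {
            ctx.chunks = Some(text.split(". ").map(str::to_owned).collect());
        }
    }
}

/// Demo stage: a placeholder "embedding" that is just the chunk length.
pub struct EmbedStage;

impl Stage for EmbedStage {
    fn name(&self) -> &'static str { "embed" }
    fn requires(&self) -> &'static [Slot] { &[Slot::Chunks] }
    fn provides(&self) -> &'static [Slot] { &[Slot::Embeddings] }
    fn run(&self, ctx: &mut PipelineContext) {
        if let Some(chunks) = &ctx.chunks {
            ctx.embeddings = Some(chunks.iter().map(|c| vec![c.len() as f32]).collect());
        }
    }
}
```

In this sketch, placing `EmbedStage` before `ChunkStage` makes `assemble` return an error naming the missing slot, so the misordering surfaces when the pipeline is built rather than partway through a run.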
- 12
Hybrid Mode: Linear / Graph / Multi-Query
Three retrieval modes on the same engine: linear RAG, GraphRAG (Oxigraph RDF triplestore), and multi-query fusion via Reciprocal Rank Fusion.
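Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every result list in which it appears, with rank starting at 1. The sketch below is a generic implementation of that formula; the function name and signature are assumptions, not the engine's API, and k = 60 is only a commonly used default.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
/// Returns (id, score) pairs sorted by descending score; ties are broken
/// by id so the output is deterministic.
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (index, doc) in ranking.iter().enumerate() {
            // `index` is 0-based, so the 1-based rank is index + 1.
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + index as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(doc, score)| (doc.to_string(), score))
        .collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because only ranks are used, result lists from different retrievers (or from several reformulations of the same query) can be merged without normalizing their raw scores.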
- 13
Integrated GraphRAG
Entity extraction, persistent RDF storage, and multi-hop traversal directly within the ingestion pipeline.
- 14
SHA-256 Audit Chain
Each stage emits StageStarted / StageCompleted / StageFailed events, forming a cryptographically verifiable chain.
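The chaining idea can be sketched as a list of records in which each hash covers the previous hash plus the current event. Only the three event names come from the text; the payload fields, `AuditChain`, and `AuditRecord` are assumptions. To stay dependency-free, the digest is pluggable and the example uses `toy_digest`, a non-cryptographic FNV-1a stand-in; a real chain would plug in a SHA-256 implementation instead.

```rust
/// Stage lifecycle events. Variant names follow the text; fields are assumed.
#[derive(Debug, Clone)]
pub enum StageEvent {
    StageStarted { stage: String },
    StageCompleted { stage: String },
    StageFailed { stage: String, error: String },
}

pub struct AuditRecord {
    pub event: StageEvent,
    pub prev_hash: Vec<u8>,
    pub hash: Vec<u8>,
}

/// Hash chain over events, generic over the digest function.
pub struct AuditChain<D: Fn(&[u8]) -> Vec<u8>> {
    digest: D,
    pub records: Vec<AuditRecord>,
}

impl<D: Fn(&[u8]) -> Vec<u8>> AuditChain<D> {
    pub fn new(digest: D) -> Self {
        Self { digest, records: Vec::new() }
    }

    /// Bytes that get hashed: previous hash followed by the event.
    /// The Debug rendering is a shortcut; a real system would use a
    /// stable serialization format.
    fn preimage(prev_hash: &[u8], event: &StageEvent) -> Vec<u8> {
        let mut bytes = prev_hash.to_vec();
        bytes.extend_from_slice(format!("{event:?}").as_bytes());
        bytes
    }

    pub fn append(&mut self, event: StageEvent) {
        // The first record chains from an empty previous hash.
        let prev_hash = self.records.last().map(|r| r.hash.clone()).unwrap_or_default();
        let hash = (self.digest)(&Self::preimage(&prev_hash, &event));
        self.records.push(AuditRecord { event, prev_hash, hash });
    }

    /// Recomputes every hash and checks each link to its predecessor.
    pub fn verify(&self) -> bool {
        let mut expected_prev: Vec<u8> = Vec::new();
        for record in &self.records {
            if record.prev_hash != expected_prev {
                return false;
            }
            if (self.digest)(&Self::preimage(&record.prev_hash, &record.event)) != record.hash {
                return false;
            }
            expected_prev = record.hash.clone();
        }
        true
    }
}

/// NOT cryptographic: 64-bit FNV-1a, used here only as a placeholder digest.
pub fn toy_digest(data: &[u8]) -> Vec<u8> {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in data {
        h ^= u64::from(*byte);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h.to_be_bytes().to_vec()
}
```

Editing any stored event changes its recomputed hash, so `verify` fails at the altered record; how strongly that resists deliberate tampering depends entirely on the digest that is plugged in.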
- 15
PII Filter
PiiStage insertable into any pipeline. Regex-based detection and masking of personal data before indexing or generation.
- 16
Native Parser — No Python
HTML, Markdown, Excel, PPTX, and PDF parsed in pure Rust via vectrant-adapter-parser-native. No external parsing service.
- 17
Decorator-Based LRU Cache
CachedEmbeddingModel and CachedLLMEngine wrap any adapter transparently without modifying the core.
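The decorator idea is that the cache implements the same trait as the model it wraps, so callers cannot tell the difference. The sketch below assumes a single-method `EmbeddingModel` trait and a simple single-threaded LRU; only the name `CachedEmbeddingModel` comes from the text, and `CountingModel` exists purely to make cache hits observable.

```rust
use std::cell::{Cell, RefCell};
use std::collections::{HashMap, VecDeque};

/// Assumed shape of the embedding port.
pub trait EmbeddingModel {
    fn embed(&self, text: &str) -> Vec<f32>;
}

#[derive(Default)]
struct LruState {
    values: HashMap<String, Vec<f32>>,
    order: VecDeque<String>, // front = least recently used
}

/// Decorator: wraps any `EmbeddingModel` and is itself an `EmbeddingModel`.
pub struct CachedEmbeddingModel<M: EmbeddingModel> {
    inner: M,
    capacity: usize,
    state: RefCell<LruState>,
}

impl<M: EmbeddingModel> CachedEmbeddingModel<M> {
    pub fn new(inner: M, capacity: usize) -> Self {
        Self { inner, capacity, state: RefCell::new(LruState::default()) }
    }

    pub fn inner(&self) -> &M {
        &self.inner
    }
}

impl<M: EmbeddingModel> EmbeddingModel for CachedEmbeddingModel<M> {
    fn embed(&self, text: &str) -> Vec<f32> {
        {
            let mut guard = self.state.borrow_mut();
            let state = &mut *guard;
            if let Some(hit) = state.values.get(text).cloned() {
                // Cache hit: mark the key as most recently used.
                state.order.retain(|k| k.as_str() != text);
                state.order.push_back(text.to_string());
                return hit;
            }
        } // borrow released before calling the wrapped model
        let value = self.inner.embed(text);
        let mut guard = self.state.borrow_mut();
        let state = &mut *guard;
        if self.capacity > 0 {
            if state.values.len() >= self.capacity {
                // Evict the least recently used entry.
                if let Some(oldest) = state.order.pop_front() {
                    state.values.remove(&oldest);
                }
            }
            state.values.insert(text.to_string(), value.clone());
            state.order.push_back(text.to_string());
        }
        value
    }
}

/// Demo model that counts how often it is actually invoked.
#[derive(Default)]
pub struct CountingModel {
    pub calls: Cell<usize>,
}

impl EmbeddingModel for CountingModel {
    fn embed(&self, text: &str) -> Vec<f32> {
        self.calls.set(self.calls.get() + 1);
        vec![text.len() as f32]
    }
}
```

This version uses `RefCell`, so it is single-threaded; a shared cache would need a `Mutex` or similar. The recency update is linear in the cache size, which keeps the sketch short at the cost of efficiency.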
- 18
Vendor-Agnostic Cloud Adapters
OpenAI, Anthropic, Ollama, Cohere, and pgvector registered via feature flags. API keys are never stored — only the environment variable name.
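Storing only the variable name might look like the following sketch. `ProviderConfig` and its fields are hypothetical; the point is that the configuration value can be logged or serialized without containing a secret, and the key is read from the environment only when needed.

```rust
use std::env;

/// Adapter configuration that holds the *name* of the environment
/// variable, never the key itself. (Hypothetical type.)
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderConfig {
    pub provider: String,
    /// For example "OPENAI_API_KEY".
    pub api_key_env: String,
}

impl ProviderConfig {
    /// Looks the key up in the process environment at call time.
    pub fn resolve_api_key(&self) -> Option<String> {
        self.resolve_with(|name| env::var(name).ok())
    }

    /// Same lookup with an injectable source, which keeps tests
    /// independent of the real environment.
    pub fn resolve_with(&self, lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
        lookup(&self.api_key_env)
    }
}
```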
- 19
Input / Output Guardrails
InputGuardrail and OutputGuardrail ports for prompt injection detection, toxicity filtering, and policy violation enforcement.
- 20
Document Access Control
AccessControl port with RBAC/ABAC to filter retrieval results by user or role without exposing restricted content.
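A role-based variant of such a port could be sketched as below. Only the name `AccessControl` comes from the text; the method signature, `User`, `RetrievedChunk`, `RoleBasedAccess`, and the helper functions are assumptions. Filtering happens on the retrieval results, before any text reaches the prompt.

```rust
use std::collections::HashSet;

pub struct User {
    pub id: String,
    pub roles: HashSet<String>,
}

pub struct RetrievedChunk {
    pub doc_id: String,
    pub text: String,
    pub allowed_roles: HashSet<String>,
}

/// Assumed shape of the access-control port.
pub trait AccessControl {
    fn can_read(&self, user: &User, chunk: &RetrievedChunk) -> bool;
}

/// RBAC adapter: readable when the user holds at least one allowed role.
pub struct RoleBasedAccess;

impl AccessControl for RoleBasedAccess {
    fn can_read(&self, user: &User, chunk: &RetrievedChunk) -> bool {
        !user.roles.is_disjoint(&chunk.allowed_roles)
    }
}

/// Drops every chunk the user may not read.
pub fn filter_results(
    policy: &dyn AccessControl,
    user: &User,
    results: Vec<RetrievedChunk>,
) -> Vec<RetrievedChunk> {
    results.into_iter().filter(|chunk| policy.can_read(user, chunk)).collect()
}

/// Small helper for building role sets.
pub fn roles(names: &[&str]) -> HashSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}
```

An attribute-based policy would implement the same trait with a different `can_read`, for instance comparing a department or clearance attribute instead of role names.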
- 21
RAGAS-Like Quality Evaluation
QualityEvaluator port measuring faithfulness, answer relevance, context precision, and context recall in the post-pipeline phase.
- 22
Lifecycle Metrics and Hooks
PipelineHooks (per-stage callbacks) and PipelineMetrics (aggregated timing) without any intrusion into business logic.
- 23
27-Crate Workspace, 1,147+ Tests
Unit, integration, and end-to-end coverage. Zero unwrap() / panic!() / unsafe in production code, enforced by automated tests.