AI Security by Design | Air-Gapped RAG in Rust | Lexiane
AI security built into the architecture: Rust memory safety, single binary, SHA-256 audit chain, OWASP LLM Top 10. On-premise RAG with zero exploitable attack surface.
The security of an AI system in production cannot be reduced to a list of measures applied after deployment. It is the consequence of a set of architectural decisions made upfront — on the language, on system boundaries, on memory management, on exposure surfaces. These decisions determine what an attacker can reach, what the system can leak, and what an auditor can verify.
Lexiane was designed with this logic: every security property is either mechanically guaranteed by the compiler, verified by an automated test in continuous integration, or rendered physically impossible by the architecture. No security properties rely solely on development conventions or on the individual vigilance of teams.
The threat landscape specific to RAG systems
RAG systems introduce attack vectors that do not exist in traditional software systems. OWASP published in 2023, updated in 2025, a reference framework of ten priority risk categories for LLM applications — the OWASP LLM Top 10. Among the most critical for a document processing system:
LLM01 — Prompt injection. An attacker inserts instructions into a query designed to bypass system policies, exfiltrate data, or produce non-compliant content. In a RAG system exposed to not-fully-trusted users — external agents, internal users with limited rights — this vector is the most frequently exploited.
LLM06 — Sensitive information disclosure. The language model can produce responses incorporating data present in the retrieved context, including personal data or confidential information that the requesting user should not see. In a multi-user corpus, the absence of access control at the retrieval level creates this risk systematically.
LLM02 — Insecure output handling. LLM outputs are used without sufficient validation — in web interfaces (XSS), in system calls, or forwarded to other systems without filtering. LLM outputs are untrusted data and must be treated as such.
LLM09 — Overreliance on model outputs. A system that accepts and transmits LLM responses without verifying their grounding in documentary sources opens the door to operational misinformation — false responses presented with the same apparent reliability as correct ones.
To these LLM-specific threats are added classic threats amplified by the complexity of AI stacks: memory vulnerabilities introduced by C/C++ dependencies underlying Python frameworks, software supply chain attacks, and an expanded attack surface from secondary processes and network calls.
Lexiane addresses each of these categories — through distinct, verifiable, and documented mechanisms.
Memory safety: what Rust structurally excludes
The class of vulnerabilities that Python and C++ cannot eliminate
Memory management vulnerabilities — buffer overflow, use-after-free, double free, null pointer dereference, memory race conditions — have historically constituted the majority of critical security vulnerabilities in software systems. Google documented on the Android project that adopting Rust as a system language reduced memory-related vulnerabilities by 68% over five years. Microsoft estimated that 70% of critical CVEs in its products originate from memory management errors.
Python RAG frameworks rely on C and C++ libraries for their intensive operations — PyTorch, NumPy, native parsers, compression libraries. These dependencies inherit all the vulnerability classes of non-memory-safe languages. A vulnerability in a transitive dependency — invisible in the surface Python code — can compromise the entire system.
What the Rust compiler guarantees
Rust eliminates by design, at the compiler level, the entire categories of memory bugs mentioned above:
- No null pointer dereferences — optional values are expressed via
Option<T>, whose exhaustive handling is required by the compiler. - No use-after-free — Rust’s ownership system guarantees that a freed resource can no longer be accessed. This property is verified statically, without dynamic analysis.
- No data races — Rust’s type system guarantees mutual exclusion between concurrent mutable accesses. A data race is a compilation error, not a race condition discovered in production.
- No uncontrolled arithmetic overflow — in debug mode, overflows are detected and panic. In release mode, behavior is defined and configurable — never exploitable undefined behavior.
#![forbid(unsafe_code)] on the certified perimeter
Rust allows writing “unsafe” code — blocks where compiler guarantees are suspended, necessary for interacting with C interfaces or the system. This feature is a trap: in a codebase without discipline, it can reintroduce the vulnerabilities that Rust is supposed to eliminate.
Lexiane’s certified kernel (vectrant-core) carries the #![forbid(unsafe_code)] directive. This directive is not a convention — it is enforced by the compiler. No developer can introduce an unsafe block into the certified perimeter, even inadvertently, even under deadline pressure. The compiler rejects the code before the binary exists.
In December 2023, CISA (Cybersecurity and Infrastructure Security Agency) published “The Case for Memory Safe Roadmaps”, explicitly recommending the adoption of memory-safe languages — including Rust — for critical systems. The U.S. Department of Defense and the White House issued convergent recommendations. Lexiane satisfies these guidelines by architecture.
Attack surface reduction
One binary, not an ecosystem
Every external dependency is a potential attack vector. A Python system in production exposes an extensive attack surface: the Python interpreter, packages installed via pip, shared system libraries loaded dynamically, secondary processes (Ollama server, Redis API, Celery worker). An attacker who compromises a dependency — through a zero-day vulnerability or a supply chain attack — can compromise the entire system.
Lexiane compiles into a self-contained static binary. It does not load shared libraries at runtime. It does not resolve dependencies at startup. It does not download external code during execution. The attack surface is delimited by the binary itself — a fixed, auditable, reproducible perimeter.
Supply chain security
Software supply chain attacks — compromising an open-source package to inject malicious code into dependent applications — are constantly increasing. The PyPI index has suffered numerous such incidents in recent years.
Lexiane addresses this risk through two complementary mechanisms:
Zero vendor dependencies in the certified kernel. The vectrant-core module — the system’s core — contains no dependencies toward any third-party vendor. This constraint is verified by an automated test that fails at compilation if an external dependency is introduced. Regression is impossible without being immediately detected.
Compilation with Ferrocene. Ferrocene is the qualified version of the Rust compiler developed by Ferrous Systems — qualified ISO 26262 ASIL D and IEC 61508 SIL 4. Compiling Lexiane with Ferrocene establishes a complete chain of trust for certification: from source code, to the compiler, to the deployed binary. An attacker wishing to introduce malicious code would need to compromise the compiler itself — whose qualification imposes rigorous traceability.
No secondary processes
Alternative Rust RAG frameworks — Swiftide, Rig — require an external inference server for text generation: an Ollama process, an OpenAI endpoint, or a vLLM server. This secondary server is an independent network component: its attack surface adds to that of the RAG framework, and communications between the two components create a potentially interceptable channel.
Lexiane embeds LLM inference (Mistral.rs) and embeddings (Candle) in the same binary. In local configuration, there is no secondary process, no internal network communication, no inter-process channel to secure.
AI-specific security
Defense against prompt injection
Lexiane’s InputGuardrail port implements a layer of validation of incoming requests before they reach the retrieval pipeline or the language model. Prompt injection patterns — hidden instructions in the request, attempts to bypass system policies, privilege escalation through prompt manipulation — are detected and blocked at this stage.
This mechanism operates upstream of the pipeline: a request blocked by the InputGuardrail generates no embedding, triggers no retrieval, and solicits no LLM. There is no computational cost associated with blocked malicious requests, and no risk that their content influences model behavior.
Model output control
The OutputGuardrail port validates the produced response before transmission to the user. Three categories of risk are addressed:
Sensitive data leakage. The LLM can incorporate into its response data present in the retrieved context — including personal data the user should not have seen, or confidential information from documents accessible via the retrieval mechanism. The OutputGuardrail detects these leaks and blocks them before transmission.
Toxic or non-compliant content. Responses that violate content policies defined by the organization — discriminatory content, dangerous instructions, out-of-scope content — are intercepted before reaching the user.
Ungrounded responses. The FaithfulnessChecker verifies that the produced response is effectively supported by the retrieved sources. A response that extrapolates beyond the provided context — that is, a hallucination — is detected and can trigger an abstention or a flag.
Document-level access control
The AccessControl port implements filtering of retrieval results based on the requesting user’s rights. This filtering operates before generation: documents the user does not have access to are not transmitted to the LLM as context.
This position in the pipeline is critical. Access control applied only at the interface — masking certain responses in the UI — leaves sensitive data passing through the language model. An LLM that has received a confidential document in its context can reveal its content indirectly, even if the final response seems not to reference it. Lexiane cuts this vector at the source.
Two access control models are supported:
- RBAC (Role-Based Access Control) — access rights are defined by the user’s role in the organization.
- ABAC (Attribute-Based Access Control) — access rights are defined by contextual attributes: document classification, owning department, sensitivity level, publication date.
The relevance gate as a defense against operational misinformation
RelevanceGateStage evaluates the overall confidence score of the retrieved context before generation. Below the configured threshold, the system refrains from generating a response and signals this explicitly.
This abstention behavior is an operational security measure: in a decision-making environment — medical, legal, industrial, intelligence — a poorly grounded response presented with the same appearance as a reliable one is more dangerous than no response. Lexiane prefers transparency about insufficiency over producing an unsupported response.
Data protection
The PII filter as a first-line barrier
The PII filter operates upstream of the entire document processing pipeline — before any embedding, before any indexing, before any LLM call. Personal data detected in ingested documents is processed according to configurable policies: typed masking, deletion, cryptographic hashing.
The audit trail records, for each document, the categories of personal data detected and the policies applied. This register constitutes technical proof of compliant processing — exploitable in the context of a GDPR audit.
Data residency as a physical guarantee
In air-gapped configuration, data does not leave the perimeter because the architecture makes it physically impossible — not because a contract prohibits it. The binary has no active network interface toward the outside. There are no flows to monitor, no contractual commitment to audit: the property is mechanical.
This distinction — architectural guarantee versus contractual guarantee — is fundamental for organizations whose data is subject to strict localization requirements: defense, intelligence, healthcare, public sector subject to sovereign cloud reference frameworks.
Audit and forensics
An inviolable SHA-256 chain
Every pipeline action is recorded in a cryptographic audit chain: each entry is signed by the SHA-256 hash of the previous one. Any retrospective modification of a record — deletion of an access, alteration of a timestamp, modification of an applied policy — breaks the chain and is mathematically detectable.
In the event of a security incident, this chain enables complete forensic reconstruction: who accessed what, at what time, with what result, according to what filtering policy. The investigation does not depend on the availability of application logs that might have been altered or deleted. The chain is the proof.
Recorded events:
| Event | Recorded data |
|---|---|
| Document ingested | Identifier, content hash, timestamp, collection |
| Personal data detected | Categories, policy applied, position in document |
| User query | User identifier, query text, timestamp |
| Documents retrieved | Identifiers, relevance scores, access granted or denied |
| Response produced | Response hash, cited sources, faithfulness score |
| Guardrail triggered | Guardrail type, blocking reason, user identifier |
No data leakage through logs
A commonly overlooked data leakage vector in AI systems is application logging: sensitive data present in requests or responses that ends up in plaintext in system logs. Lexiane prohibits by development convention the use of println!() in all production code. Any log emission goes through the tracing framework, with controllable verbosity levels and structuring filters. This property is verified in continuous integration.
What your security team can verify
Lexiane’s security does not rest on assertions. Every property is either verifiable in the source code, verifiable in build artifacts, or demonstrable by inspection of the binary.
| Security property | Verification mechanism |
|---|---|
| No unsafe code in the kernel | #![forbid(unsafe_code)] — source code inspection |
No unwrap() / panic!() in production | Automated test in CI — result verifiable in the test suite |
| Zero vendor dependencies in the kernel | Automated test at compilation — vectrant-core/Cargo.toml inspectable |
| Audit trail integrity | SHA-256 cryptographic property — algorithmically verifiable |
| PII filtering before indexing | Position in the pipeline — inspectable in the ingestion stage order |
| Access control before generation | Position of the AccessControl port in the pipeline — inspectable in the assembly |
No println!() | Verifiable by grep across the entire source code |
| Static configuration validation | Assembler property — inspectable in assembly tests |
| No network calls in air-gapped mode | Inspectable in the active adapter configuration |
Frequently asked questions from security teams and CISOs
Have you conducted a pentest on Lexiane? The results of an external security audit are available on request in the context of a qualified discussion. The mechanical properties described in this document — no unsafe code, attack surface reduced to the binary, inviolable audit chain — are independently verifiable by your security team or by an auditor of your choice.
Is Lexiane compliant with the OWASP LLM Top 10?
Lexiane’s architecture directly addresses categories LLM01 (prompt injection via InputGuardrail), LLM02 (output validation via OutputGuardrail and FaithfulnessChecker), LLM06 (PII filtering and AccessControl before generation), and LLM09 (relevance gate and abstention). A complete OWASP compliance assessment for your specific deployment remains to be conducted according to your context.
How should secrets — API keys, tokens — be managed in the configuration?
Lexiane follows the api_key_env convention: configuration fields reference the name of the environment variable containing the key, never the key itself. Secrets do not transit through TOML configuration files. They are injected via environment variables at startup — compatible with standard secret managers (Vault, Kubernetes Secrets, AWS Secrets Manager).
What is the dependency and CVE management policy?
Project dependencies are subject to automated security auditing (cargo audit) in continuous integration. CVEs detected in dependencies trigger an immediate alert. The number of active dependencies is kept as low as possible — every added dependency is a deliberate decision, not a side effect.
Can Lexiane be deployed in an environment with a web application firewall (WAF)? Yes. Lexiane’s REST API is a standard HTTP API, compatible with all WAFs on the market. In air-gapped configuration, the WAF question applies only to internal flows — Lexiane itself initiates no outbound network flows.
How can access rights be isolated between multiple teams using the same Lexiane deployment?
The AccessControl port enables segmentation of rights at the document level. Ingested documents can be tagged with classification attributes (RBAC or ABAC), and access rights are verified at retrieval — before context is transmitted to the model. Multiple teams can share the same infrastructure without their document corpora mixing in responses.
Start the conversation about your threat model.
The security of an AI system is built on a threat analysis specific to your context: your sector, your user access model, your deployment infrastructure, and your regulatory requirements. We do not offer generic assessments.
We offer a structured exchange with a team that knows the specific attack vectors of RAG systems, the constraints of regulated environments, and the evidence that Lexiane’s architecture can provide to your security teams.
What you can expect:
- A response within 48 business hours
- A technical contact who knows the OWASP LLM Top 10, the constraints of classified environments, and certification requirements
- An honest mapping of what Lexiane’s architecture covers — and what remains your responsibility
→ Contact us
No commercial commitment. A security discussion.
References cited in this document: — OWASP LLM Top 10, version 1.1 (2023), updated 2025 — owasp.org/www-project-top-10-for-large-language-model-applications — CISA, “The Case for Memory Safe Roadmaps”, December 2023 — Google, “Memory Safe Languages in Android 13”, Android Security Blog — NIST AI Risk Management Framework 1.0, January 2023 — Regulation (EU) 2016/679 (GDPR)
Request access to the Auditable Core
Sign up to be notified when our Core audit programme opens. In accordance with our privacy policy, your professional email address will be used exclusively for this technical communication, with no subsequent marketing use. Access distributed via secure private registry.
Contact us