A layered architecture for running autonomous AI agents under deterministic safety constraints — a trusted, hard-coded harness that validates and bounds the output of a probabilistic model.
Status: An architectural proposal, not a validated system. There is no production deployment and no empirical benchmark data; any performance figures in this work are design targets, not measurements. See Status and Validation in the brief.
Full Technical Architecture Brief
The architecture applies Defense in Depth to agentic AI: four layers, each with a distinct job, working together to govern agent behavior from network isolation through to observability. The design intent is to let an agent operate with real autonomy while keeping the parts that must never drift — the safety floor — under deterministic, auditable control.
Isolates agentic activities using Sovereign Shards — TEE-backed execution environments that enforce jurisdictional data boundaries. Automated data auditing prevents raw or identifiable data from leaving the sovereign perimeter, and Federated Knowledge Synthesis lets a global model learn from regional patterns without raw data ever crossing a boundary.
Small Language Models (SLMs) at the edge provide low-latency decision-making, cross-validated through Sensor Fusion (e.g., video + ultrasound). If sensor streams disagree, the system defaults to a fail-safe state defined in the Harness. NeMo Guardrails and structured prompt engineering keep model output consistent and predictable.
A hard-coded validation layer that wraps every agentic action and enforces the rule set defined at deployment. The Harness turns probabilistic model output into deterministic, auditable decisions. It intercepts policy violations before execution, supports compliance hot-patching for regulatory changes, and logs a full decision lineage for audit.
Wraps the entire stack with observability that goes beyond error logging. It tracks Utility Decay — the signal that safety constraints have become restrictive enough to block real commercial value — and triggers a Re-Architecture Alert before the system stops earning its keep. It also monitors Reality Gap drift between the Digital Twin simulation and production telemetry.
The harness-over-model pattern is not new. It descends from the Simplex architecture (Sha, 2001), from run-time assurance as codified in ASTM F3269, and from shielding in reinforcement learning (Alshiekh et al., 2018). The output-validation mechanism is an instance of policy-as-code (Open Policy Agent) and of constrained-output tooling such as NeMo Guardrails.
Two things set this work apart from those classical formulations:
- It bounds catastrophic failure rather than guaranteeing safety. Classical Simplex and RTA assume a formally verified safety controller and a computable safe-recovery region. Those assumptions break down when the high-performance controller is an LLM reasoning over an open-world domain. The harness is positioned to bound failure, not prove safety, and the Failure Mode and Threat Matrices exist to make the residual coverage gaps explicit.
- The Utility Monitor treats over-restriction as a measured failure mode. A supervisor so conservative it destroys the system's value is its own failure (Utility Decay). The classical safety-supervisor literature optimizes for safety alone and has no equivalent concept; instrumenting the safety-versus-value tradeoff is where this work aims to add something new.
| Document | Description |
|---|---|
| Technical Architecture Brief | Full architecture specification with diagrams, interface contracts, and design rationale |
| Network to Model Interface | Interface contract between the network and model output layers |
| Model to Harness Interface | Interface contract between the model output layer and the deterministic harness |
| Failure Mode Matrix | 12 identified fault modes with detection thresholds and three-tier response protocols |
| Threat Matrix | 15 identified threats mapped to architectural controls, including cascading and governance risks |
This is a high-level architecture, not a tutorial or a drop-in implementation. The pattern was developed against a single cyber-physical case (CNC thermal monitoring); the layering generalizes to other high-stakes domains — autonomous vehicles, medical diagnosis, financial trading — but the physical-grounding mechanisms (Digital Twin, cross-modal sensor fusion) do not transfer for free. Each new domain must supply its own ground-truth substitute.
Explicitly out of scope: training-time data security, model alignment and underlying capabilities, supply-chain security beyond the network layer, and systematic prompt-level red-teaming.
Contributions are welcome. Open an issue for bugs or suggestions, or fork and submit a pull request. Please keep contributions aligned with the project's security and privacy principles.
I'm Bob Scheller. I lead enablement platforms at Merck Animal Health, where I am the primary inventor on the patent for the Whisper biosensor — an ML-based animal-health monitoring system. I am leading data platform and software development teams supporting commercial, manufacturing and R&D activities in 30+ countries. The work in this repository is personal and independent, and does not represent my employer.
This proposal comes out of a specific kind of experience: shipping a probabilistic ML system into real-world, physically-grounded deployment at scale, and watching where the gap between "the model is usually right" and "the system must never do the wrong thing" actually bites. Sensor disagreement, jurisdictional data boundaries, the slow drift between a model's training distribution and the world it runs in, and the commercial reality that a safety layer too conservative to ship is its own kind of failure — these aren't abstractions to me. The Deterministic Harness generalizes the patterns that mattered in that deployment into an architecture for agentic systems, where both the autonomy and the stakes are higher.
I'm publishing it as a proposal rather than a finished system because I'd rather have the coverage gaps argued over in public than papered over.




