Skip to content

contenox/runtime

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

761 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Contenox

AI workflows you can run, review, and own.

Use AI for development without unlearning how to do it yourself.

Contenox is an open-source, local-first AI workflow runtime for engineers. It packages repeatable AI-assisted work into versioned Chains: files that declare the prompt, model route, tool allowlist, command policy, retry behavior, branches, budgets, and human approval gates.

The agent loop does the work. The Chain is the contract.

Contenox is for work where AI may touch a terminal, repository, internal API, ticket system, browser, or production-adjacent data, and where "the model decided" is not an acceptable control boundary.

Run the same Chain from the CLI, VS Code, or any ACP client. Route inference to a local model, a private-network backend, or a hosted provider. Sessions, config, run logs, and runtime state stay on your machine. No hosted Contenox service required.

Go License Version

Docs: contenox.com


Install

curl -fsSL https://contenox.com/install.sh | sh

Prefer to inspect first?

curl -fsSLO https://contenox.com/install.sh
less install.sh
sh install.sh

Release downloads and source builds are available from the releases page.


Quick Start

contenox setup                          # choose a provider/model for this machine
contenox "say hello world in python"    # use it from the CLI
contenox chat -e                        # open $EDITOR to compose a prompt

Resume past work with contenox session list and contenox session switch <name>.

Inline editor autocomplete is intentionally a separate model from chat, so ghost text can stay local and low-latency while chat uses a larger model:

contenox config set default-provider              openai          # chat on a hosted model
contenox config set default-model                 gpt-5-mini
contenox config set default-autocomplete-provider llama           # ghost text on local modeld
contenox config set default-autocomplete-model    qwen3-coder-30b-a3b

In VS Code, enable it with Contenox: Enable Autocomplete.


Why Chains?

A naked agent loop is useful, but it is not enough when AI can touch real tools.

A Chain answers the questions a serious team has to ask before letting a model act:

  • What is the task?
  • Which model or provider may be used?
  • Which tools may the model call?
  • Which commands or API operations are allowed?
  • What must stop for human approval?
  • What state, trace, and evidence does the run leave behind?
  • Can the workflow be reviewed, committed, diffed, and run again?

In Contenox, a Chain is not a prompt pipeline. It is the reviewed execution contract around an agent loop.


What You Author

The unit of work is a Chain: a single versioned file where every decision is a visible JSON key. Prompts, provider routing, tool scope, command policy, retry policy, token limits, loop budgets, and branches are part of the artifact you review.

{
  "id": "review",
  "token_limit": 65536,
  "tasks": [
    {
      "id": "review",
      "handler": "chat_completion",
      "system_instruction": "You are a code reviewer. Analyze the diff, run tests if tools are available, then give a concise review.",
      "execute_config": {
        "model": "{{var:model}}",
        "provider": "{{var:provider}}",
        "tools": ["local_shell", "local_fs"],
        "tools_policies": {
          "local_shell": {
            "_allowed_commands": "go,make,npm,cargo,grep,cat",
            "_denied_commands": "sudo,su,dd,mkfs,fdisk,parted,shred"
          },
          "local_fs": {
            "_allowed_dir": ".",
            "_max_read_bytes": "262144"
          }
        },
        "retry_policy": {
          "max_attempts": 4,
          "initial_backoff": "1s",
          "max_backoff": "30s",
          "jitter": 0.25,
          "rate_limit_min_wait": "10s"
        }
      },
      "transition": {
        "branches": [
          {
            "operator": "edge_traversed_at_least",
            "edge": "review->run_tools",
            "when": "6",
            "goto": "end"
          },
          { "operator": "equals", "when": "tool_call", "goto": "run_tools" },
          { "operator": "default", "goto": "end" }
        ]
      }
    },
    {
      "id": "run_tools",
      "handler": "execute_tool_calls",
      "input_var": "review",
      "execute_config": {
        "tools": ["local_shell", "local_fs"]
      },
      "transition": {
        "branches": [
          { "operator": "default", "goto": "review" }
        ]
      }
    }
  ]
}

Save it, then pipe your work into it. It speaks Unix:

git diff | contenox run --chain ./review.json

HITL is not a hidden toggle. Gated tool calls route through policy files such as hitl-policy-default.json, hitl-policy-strict.json, and editor-specific ACP policies. The Chain defines what the workflow can ask for; the active policy decides what must pause for approval before execution.

Walk through your first chain: contenox.com/docs/guide/first-chain.


What It Is Good For

Contenox is strongest when the workflow is specific and repeatable: known inputs, known tools, known output shape, and explicit review gates.

  • Review a diff - run tests, summarize risk, and gate on approval before anything destructive runs.
  • Draft release evidence - turn git log, PRs, tickets, and CI output into a changelog, risk notes, deployment checklist, and reviewer packet.
  • Wrap an internal API - expose a safe OpenAPI subset with hidden tenant/env args and approval required on mutating calls.
  • Automate repo chores - take an issue, produce a patch, run checks, and write the PR description.
  • Inspect operational systems - query dashboards, shell scripts, or MCP tools through scoped policies instead of broad credentials.
  • Use edge autocomplete - keep VS Code ghost text on a local or local-network coder model while chat uses a larger hosted model.

The same Chain runs from the terminal, VS Code, Zed, JetBrains, AionUi, or any ACP client. Provider choice is config: local modeld, Ollama, vLLM, OpenAI, OpenRouter, Anthropic, Mistral, Gemini, AWS Bedrock, or Vertex.


For Work That Touches Real Systems

Contenox is built for high-consequence engineering workflows: production repos, internal APIs, infrastructure scripts, operational dashboards, release processes, and systems of record.

In those environments, "AI agent" cannot mean "give a model broad credentials and hope." It needs a runtime boundary: explicit tools, explicit policy, local state, human approval, and reviewable evidence.

The problem is not that agents can act. The problem is agents acting outside a boundary you authored.

The model can reason, inspect, and propose. The Chain decides what it may touch. The operator decides what it may change.


What the Runtime Protects

Risk Contenox mechanism
Agent behavior disappears into chat history Chains are files: reviewable, versionable, repeatable
The model can touch too much Tool allowlists, command policies, and scoped API specs
Human review happens after damage Destructive actions stop at approval gates before execution
Internal APIs become broad agent tools Curated OpenAPI subsets with hidden environment/tenant args
Vendor choice becomes workflow lock-in Provider/model routing is config, not application logic
Routine work burns frontier-model budget Route simple work to local or private-network models
Team knowledge leaves the workstation Sessions, state, config, and run logs stay local

The Stance: Exoskeleton, Not Autopilot

AI is becoming part of software work. The question is whether it makes you sharper or more dependent.

Naive AI use turns engineering judgment into rented fluency: useful while the model is good, reachable, current, and affordable. But the durable value in software was never the typing. It was knowing what to build, knowing what changed, and being able to own the system when it breaks.

Contenox is built as an exoskeleton, not an autopilot. It amplifies the person doing the work. You stay in the loop because the workflow, tools, state, and approval policy are things you author and review.

Contenox is not an autonomous coding employee. It is not a hosted autopilot. It is not a prompt habit hidden in your shell history.

It is a local runtime for AI-assisted work that still has an owner.


Where Contenox Fits

Contenox is the agent layer you control from terminal to editor.

Nearby world Contenox stance
IDE copilots Editor assistance is not enough. The workflow should run from terminal, VS Code, and ACP clients.
CLI coding agents A single coding loop is not a runtime. Contenox adds sessions, tool policy, provider routing, and review gates.
LangChain / agent frameworks Libraries are not the product. Contenox is an executable local runtime for end users and teams.
Dify / n8n / web workflow tools AI workflows that touch local code and tools should not require a SaaS control plane.
Ollama wrappers A model host is not a workflow boundary. Contenox adds Chains, tools, HITL policy, and routing across local, private, and hosted models.

Connect Your Stack

Anything reachable over MCP, an OpenAPI spec, or a shell command can become a scoped tool in a Chain:

# Any MCP-compatible server
contenox mcp add notion https://mcp.notion.com/mcp --auth-type oauth

# Any HTTP API with an OpenAPI spec
contenox tools add erp_billing \
  --url https://erp.internal.example.com \
  --spec ./billing-subset.yaml

# The shell, with your own command policy declared in the Chain
contenox --shell "check Proxmox and flag anything red"

Use It From Your Editor

Contenox speaks the Agent Client Protocol over stdio, so the same local Chains run inside Zed, JetBrains, AionUi, and other ACP clients. For Zed, drop this into ~/.config/zed/settings.json:

{
  "agent_servers": {
    "Contenox": {
      "type": "custom",
      "command": "contenox",
      "args": ["acp"]
    }
  }
}

Tool calls render as cards with the real command/path, HITL prompts route through the editor's permission UI, and session history replays when you reopen the project.

Full guides: Zed, JetBrains, AionUi.


Backends

Provider/model routing is configuration, not application logic. Add local, private-network, or hosted backends the same way:

# Private network / self-hosted inference
contenox backend add ollama --type ollama
contenox backend add myvllm --type vllm --url http://gpu-host:8000

# Hosted AI vendors
contenox backend add openai \
  --type openai \
  --api-key-env OPENAI_API_KEY
contenox backend add openrouter \
  --type openrouter \
  --api-key-env OPENROUTER_API_KEY
contenox backend add anthropic \
  --type anthropic \
  --api-key-env ANTHROPIC_API_KEY
contenox backend add mistral \
  --type mistral \
  --api-key-env MISTRAL_API_KEY
contenox backend add gemini \
  --type gemini \
  --api-key-env GEMINI_API_KEY
contenox backend add bedrock \
  --type bedrock \
  --url https://bedrock-runtime.us-east-1.amazonaws.com
contenox backend add vertex \
  --type vertex-google \
  --url "https://us-central1-aiplatform.googleapis.com/v1/projects/$GOOGLE_CLOUD_PROJECT/locations/us-central1"

# Set your defaults
contenox config set default-model    qwen3-8b
contenox config set default-provider llama

The llama and openvino backends are local modeld-backed providers. contenox init registers them and contenox model pull <name> downloads artifacts into ~/.contenox/models/<backend>/. Current normal CLI and VS Code release packages do not bundle modeld yet, so local modeld providers require a source build: modeld Source Build and Packaging.


modeld North Star

Routine tokens should be local or private by default.

modeld is Contenox's local-inference north star: one owner, one active local model, many persistent sessions, and resident coding context on a workstation accelerator.

This is the direction of the local backend, not a guarantee for every model, device, or release package today.

modeld is shaped around one specific bet: a local coding agent on a single consumer accelerator that serves real, long-context work. The goal is an effective context far beyond a model's native window on limited hardware by treating context as resident state kept hot rather than a prompt resent every turn.

  • One model, one user, many sessions. The device's whole memory and KV budget go to making that model deep and fast instead of multiplexing several.
  • Warm-reuse sessions. Each session keeps its stable prefix's KV hot and re-prefills only the changed suffix, so a long working context is paid for once.
  • Snapshot / restore. Session state is durable and branchable, so effective context outlives a single live process.
  • Accelerator-driven, no knobs. modeld detects the accelerator and derives offload and the effective window from the device at runtime.

Longer term, modeld is also where Contenox can make local models adapt to the workstation: resident context, reusable sessions, and optional adaptation such as LoRA where it makes sense.

How it maps onto the code: Effective Context North Star.


Build from Source

Requires Go 1.25+.

git clone https://github.com/contenox/runtime
cd runtime
make build-contenox

# Build and run local modeld (llama.cpp)
CONTENOX_MODELD_BACKEND=llama make run-modeld

# Build and run local modeld (OpenVINO)
make deps-modeld
CONTENOX_MODELD_BACKEND=openvino make run-modeld

See modeld Source Build and Packaging for the complete local modeld flow and relocatable bundles.


Built on

The contenox CLI is pure Go. Local inference lives in the separate modeld daemon, which links these upstream projects at build time (pinned in mk/llama-flags.mk and mk/openvino-flags.mk) and ships their runtime libraries inside each release package:

Project Role License
llama.cpp GGUF inference and the ggml CPU/CUDA/HIP/Metal backends MIT
OpenVINO Inference runtime (CPU / iGPU / NPU) Apache-2.0
OpenVINO GenAI LLM pipeline over OpenVINO Apache-2.0
OpenVINO Tokenizers Tokenizer extension for OpenVINO GenAI Apache-2.0
minja Chat-template engine (vendored by OpenVINO GenAI) MIT
gguf-tools GGUF parsing headers (vendored by OpenVINO GenAI) see upstream

Upstream license texts travel with the artifacts (licenses/ in dependency bundles, LICENSES/ in modeld packages). Other Go dependencies are in go.mod.


Questions: hello@contenox.com

About

Local-first AI workflows for engineers.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Contributors