31 changes: 31 additions & 0 deletions AGENTS.md
@@ -0,0 +1,31 @@
<!--
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Agent Guide for Apache Synapse

This file is read by automated agents (security scanners, code analyzers,
AI assistants) operating on this repository.

## Security

Security model: [SECURITY.md](./SECURITY.md) -> [THREAT_MODEL.md](./THREAT_MODEL.md)

Agents that scan this repository should consult `SECURITY.md` and the linked
`THREAT_MODEL.md` before reporting issues. Synapse is a mediation engine / ESB:
the untrusted message sender is the adversary, while the integration
configuration (sequences, scripts, XSLT/XQuery bodies, endpoint addresses) is
trusted operator input. The model covers transports, XSLT/XQuery transforms
(XXE/SSRF), dynamic endpoint routing (SSRF), and message DoS.
14 changes: 14 additions & 0 deletions SECURITY.md
@@ -0,0 +1,14 @@
# Security Policy

## Reporting a Vulnerability

Apache Synapse follows the [Apache Software Foundation security process](https://www.apache.org/security/).
Please report suspected vulnerabilities **privately** to `security@apache.org` (the Synapse PMC is reachable
at `private@synapse.apache.org`). Do **not** open public GitHub issues or pull requests for security reports.

## Threat Model

[THREAT_MODEL.md](./THREAT_MODEL.md) documents what Synapse treats as in and out of scope, the security
properties it provides and disclaims (safe-by-default XML transforms, mediation-level security, secret
protection), the adversary model (the untrusted message sender vs. the trusted integration configuration),
and how findings are triaged.
262 changes: 262 additions & 0 deletions THREAT_MODEL.md
@@ -0,0 +1,262 @@
<!--
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Threat Model — Apache Synapse

## §1 Header

- **Project:** Apache Synapse — a lightweight, high-performance **Enterprise Service Bus (ESB) / mediation
engine**. It accepts messages over pluggable transports (HTTP/S, JMS, VFS, Mail, …), runs them through
operator-defined **mediation sequences / proxy services** (mediators: XSLT, XQuery, script, filter,
switch, send-to-endpoint, …), and routes/transforms them toward backend **endpoints** *(documented — README;
source `org.apache.synapse.mediators`, `config.xml`)*.
- **Modelled against:** `apache/synapse` `master`/HEAD (2026-05-31).
- **Status:** **DRAFT — v0, not yet reviewed by the Synapse PMC.** Produced by the ASF Security team via the
`threat-model-producer` rubric (<https://gist.github.com/potiuk/da14a826283038ddfe38cc9fe6310573>).
- **Reporting / version-binding / legend** as in the sibling models. **Draft confidence:** ~12 documented /
0 maintainer / ~48 inferred. Each *(inferred)* routes to §14.

**Framing note (as for any framework):** Synapse is a *mediation engine*, not a finished application. The
**integration developer** authors the synapse configuration — sequences, mediators, scripts, XSLT/XQuery,
endpoints, and security policies. That configuration is **trusted input** (§3); the **inbound message from a
network client is the untrusted adversary input** (§7). Most properties are conditional on how the
integration is configured, so §9/§10 carry a lot of weight.

## §2 Scope and intended use

Intended use *(documented)*: deploy Synapse as a message broker/mediator in front of or between services —
clients send messages to a Synapse proxy/API; Synapse mediates (transform, route, secure, throttle) and
forwards to backend endpoints.

Caller roles:

- **Message client (untrusted)** — any peer that can send a message to a Synapse listener/proxy/API.
- **Backend endpoint** — a service Synapse calls; semi-trusted (its responses re-enter mediation).
- **Integration developer / operator** — authors the synapse config (mediation logic, scripts, XSLT,
endpoints, secure-vault secrets, transport + WS-Security policy). **Trusted; out of model as adversary (§3).**

**Component-family table:**

| Family | Entry point | Touches outside process | In model? |
| --- | --- | --- | --- |
| Transport listeners | HTTP/S (NHTTP/passthrough), JMS, VFS, Mail | network / fs / mail | **Yes** |
| Mediation engine | sequences / proxy services / APIs | — | **Yes** |
| XML transform mediators | **XSLT**, **XQuery**, payload factory | XML; **external refs** | **Yes (high-value)** |
| Script mediators | JS/Groovy/… (operator-authored) | runs config code over message data | **Yes (data-in surface)** |
| Endpoints (outbound) | send/call mediators, address/WSDL/loadbalance | **network egress** | **Yes (SSRF surface)** |
| Eventing | WS-Eventing subscriptions | network | **Yes** |
| Secrets / secure-vault | encrypted config secrets | keystore | **Yes** |
| Samples / docs / build | `modules/documentation`, samples, tests | — | No → §3 |

## §3 Out of scope (explicit non-goals)

- **The integration developer / operator as adversary**, and the **synapse configuration** itself (sequences,
scripts, XSLT/XQuery bodies, endpoint addresses, secrets). Config is authored by a trusted party; a script
mediator running operator-authored code is not an adversary surface — the message *data flowing into* it is
*(inferred)*.
- **Misconfiguration** (enabling external-entity resolution, routing to an attacker-derived endpoint without
validation, disabling TLS) — Synapse provides the controls; using them is the operator's job (§10/§11).
- **Backend services** Synapse mediates to, and the message producers' own security.
- **Samples, documentation, and tests** *(inferred)*.
- **The underlying XML/crypto stacks** (the JAXP/StAX provider, Rampart/WSS4J) except as Synapse configures
and invokes them.

## §4 Trust boundaries and data flow

The trust boundary is the **transport listener + the mediation entry**: bytes arriving on a listener are
untrusted until mediation (and any configured WS-Security/transport auth) has processed them *(inferred)*.

Trust transitions:

1. **Wire → message build:** the transport builds a message (SOAP/XML/JSON/binary). XML building is the
XXE / entity-expansion / large-message DoS surface *(inferred — wave-1)*.
2. **Message → XSLT/XQuery mediator:** transforms may resolve external entities, `document()` / `doc()`
references, or extension functions — an **XXE / SSRF / file-read** surface if external resolution is enabled
*(inferred — `XSLTMediator`; high-value, §14)*.
3. **Message → script mediator:** operator-authored JS/Groovy runs with message data as input. The *code* is
trusted (config); the risk is unsafe handling of message data inside it *(inferred)*.
4. **Message → endpoint resolution:** static endpoints are config (trusted); **dynamic / content-based
routing** that derives an endpoint address from message content is an **SSRF** surface *(inferred)*.
5. **Endpoint response → mediation:** backend responses re-enter mediation as semi-trusted input.

**Reachability precondition:** a finding is in-model if an inbound message can reach it *before* the
authentication/validation that the integration configured in mediation has run; a finding that requires a
malicious **config** (script, XSLT body, endpoint address chosen by the operator) is
`OUT-OF-MODEL: trusted-input` (§3/§6).
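
As an illustration of trust transition 1, the sketch below shows one way a JAXP `DocumentBuilderFactory` can be configured so that an inbound document carrying a DOCTYPE is rejected before any entity is resolved. It is not Synapse's actual message builder: the class `HardenedXmlParser` and its methods are hypothetical, a DOM parser is used only because it keeps the example short, and the code assumes the JDK's built-in parser, which recognises the `disallow-doctype-decl` feature and the JAXP 1.5 `ACCESS_EXTERNAL_*` attributes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of Synapse.
final class HardenedXmlParser {
    private HardenedXmlParser() {}

    /** Returns a factory that rejects DOCTYPE declarations and external fetches. */
    static DocumentBuilderFactory newHardenedFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Assumption: the parser is the JDK's Xerces-derived one, which knows this feature.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser lacks a required security feature", e);
        }
        // An empty protocol list means no external DTD or schema may be fetched.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        factory.setNamespaceAware(true);
        return factory;
    }

    /** True if the document parses under the hardened settings, false if it is rejected. */
    static boolean accepts(String xml) {
        try {
            DocumentBuilder builder = newHardenedFactory().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException e) {
            return false; // malformed input or a forbidden construct such as a DOCTYPE
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("<order><id>1</id></order>"));
        System.out.println(accepts(
            "<!DOCTYPE order [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><order>&x;</order>"));
    }
}
```

With these settings a plain document is accepted and any document that declares a DOCTYPE is refused outright, which also removes the entity-expansion vector. Whether Synapse's own builders behave this way by default is the open question recorded in §14.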

## §5 Assumptions about the environment

- JVM host running the Synapse runtime; operator-managed `synapse.xml` config, keystores, and transport setup.
- Transports reachable per operator network config; TLS provided by the transport configuration *(inferred)*.
- Secrets via secure-vault are protected by an operator-managed keystore/password *(inferred)*.
- **What Synapse does to its host (*(inferred)* — wave-2):** binds transport listeners; opens **outbound**
connections to configured (and possibly dynamically-resolved) endpoints; reads config + keystores; XSLT/
XQuery may fetch external references if enabled. Synapse is not assumed to spawn host processes beyond
configured command/script mediators.

## §5a Build-time and configuration variants

| Knob (names *(inferred)*) | Effect | Ruling needed |
| --- | --- | --- |
| XML secure-processing / DTD + external-entity resolution in builders & XSLT/XQuery | XXE / SSRF / file-read on inbound transforms | **Open (wave-1):** are external entities/`document()` off by default? |
| Message size / element-depth / streaming limits | XML/large-message DoS | **Open (wave-1)** |
| Dynamic / content-based endpoint resolution | SSRF if endpoint derived from message | Open — validated/allow-listed? |
| Transport TLS (HTTPS listener + outbound) | Confidentiality/integrity | Operator (§10) |
| WS-Security (Rampart) on a proxy | Message-level auth/sig/enc | Integration choice |
| Script-mediator languages enabled | Operator-code surface | Operator config |
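
The message-size knob in the table above can be pictured as a stream wrapper that fails once a byte budget is exceeded. The sketch below is a generic Java illustration, not a Synapse class or setting: `BoundedInputStream`, `fitsWithin`, and the 1024-byte limit are hypothetical names and values chosen for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper for illustration; not part of Synapse.
final class BoundedInputStream extends FilterInputStream {
    private final long maxBytes;
    private long consumed;

    BoundedInputStream(InputStream in, long maxBytes) {
        super(in);
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be non-negative");
        }
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count(n);
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            count(skipped);
        }
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // a caller that resets anyway only over-counts, so the cap fails closed
    }

    // Adds to the running total and aborts the read once the budget is exceeded.
    private void count(long n) throws IOException {
        consumed += n;
        if (consumed > maxBytes) {
            throw new IOException("message exceeds " + maxBytes + " bytes");
        }
    }

    /** Drains the message through the cap; true if it fits, false if the cap tripped. */
    static boolean fitsWithin(byte[] message, long maxBytes) {
        try (InputStream in = new BoundedInputStream(new ByteArrayInputStream(message), maxBytes)) {
            byte[] buf = new byte[256];
            while (in.read(buf, 0, buf.length) != -1) {
                // discard; a real consumer would hand the bytes to the message builder
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsWithin(new byte[1024], 1024));
        System.out.println(fitsWithin(new byte[1025], 1024));
    }
}
```

A byte cap bounds raw size only. Element depth and entity expansion need parser-level limits as well, which is why the table lists them as separate concerns.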

## §6 Assumptions about inputs

| Entry point | Parameter | Attacker-controllable? | Caller/operator must enforce |
| --- | --- | --- | --- |
| transport listener | message body (SOAP/XML/JSON/binary), headers, SOAPAction | **yes** | XML limits; transport/WS-Security; size caps |
| XSLT/XQuery mediator | message payload (the transform *input*) | **yes** | disable external entity/`document()` resolution |
| script mediator | message payload passed to the script | **yes** | safe handling of message data in the script |
| dynamic endpoint | endpoint address *derived from message* (if used) | **yes (if configured)** | validate/allow-list resolved addresses |
| synapse config (sequences, scripts, XSLT, endpoints, secrets) | all | **no — operator-trusted** | never sourced from a message |

## §7 Adversary model

- **Primary adversary:** an untrusted client sending messages to a Synapse listener/proxy/API. Capabilities:
craft SOAP/XML/JSON payloads (XXE, entity-expansion, oversized), drive content that influences XSLT/XQuery
resolution, supply data that a dynamic route turns into an endpoint address (SSRF), or that a script
mishandles.
- **Secondary:** a malicious backend endpoint returning hostile responses into mediation.
- **Goals:** XXE/file-read/SSRF via transforms or routing; XML/message DoS; bypass of a configured
mediation-level auth; exfiltration of secrets reachable through a transform.
- **Out of model:** the integration developer/operator; the config (scripts, XSLT bodies, endpoint
addresses); keystore/secret holders.

## §8 Security properties the project provides

*(Conditional on configuration; inferred, pending §14.)*

1. **Robust message building/parsing.** Malformed/oversized inbound messages yield a fault, not memory
corruption or unbounded resource use (subject to configured limits) *(inferred)*. *Symptom:* crash/hang/OOM
from crafted input. *Severity:* high.
2. **Safe-by-default XML transforms.** XSLT/XQuery and message builders do not resolve external entities/
`document()` against untrusted input unless explicitly enabled *(inferred — load-bearing; wave-1)*.
*Symptom:* XXE read / SSRF / file disclosure via a transform. *Severity:* critical.
3. **Mediation-level security mechanisms.** When configured, transport security and WS-Security (Rampart)
authenticate/sign/encrypt messages *(inferred)*. *Symptom:* accepted unauthenticated/forged message where
policy required otherwise. *Severity:* critical.
4. **Secret protection.** Secure-vault keeps configured secrets encrypted at rest, not in plaintext config
*(inferred)*. *Symptom:* plaintext secret exposure. *Severity:* high.
5. **Transport security support.** TLS on HTTPS listeners and outbound calls with cert validation when
configured *(inferred)*. *Symptom:* MITM where TLS expected. *Severity:* high.
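
Property 2 can be made concrete with a JAXP `TransformerFactory` that has secure processing switched on and external DTD and stylesheet access set to the empty protocol list. This is a sketch under assumptions, not the configuration `XSLTMediator` actually uses: `HardenedXslt` is a hypothetical class, and the `ACCESS_EXTERNAL_*` attributes are assumed to be supported, which holds for the JDK's built-in XSLT processor but may not hold for other processors on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration; not part of Synapse.
final class HardenedXslt {
    private HardenedXslt() {}

    /** Returns a factory with secure processing on and external access switched off. */
    static TransformerFactory newHardenedFactory() {
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("secure processing is not supported", e);
        }
        // Assumption: the processor honours the JAXP 1.5 access attributes.
        // An empty value permits no protocol at all.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    /** Applies an operator-authored stylesheet to an untrusted message payload. */
    static String transform(String stylesheet, String message) {
        try {
            Transformer transformer =
                newHardenedFactory().newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(message)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed or was refused", e);
        }
    }

    public static void main(String[] args) {
        String stylesheet =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/order/id\"/></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(stylesheet, "<order><id>42</id></order>"));
    }
}
```

Ordinary transforms keep working under these settings; what changes is that the processor is told not to fetch external DTDs or stylesheets on behalf of the message. The exact effect on `document()` depends on the processor and should be verified against the one Synapse ships with.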

## §9 Security properties the project does NOT provide

- **No security without configuration** — a proxy with no transport/WS-Security and permissive transforms is
only as protected as the integration wired it *(inferred)*.
- **No defence against the integration developer** — scripts, XSLT/XQuery bodies, and endpoint addresses are
trusted config (§3).
- **No intrinsic SSRF protection for dynamic/content-based routing** — if an endpoint is derived from message
content, validating it is the integration's job *(inferred)*.

**False friends:**

- *An XSLT/XQuery transform looks like pure data transformation but can read files / fetch URLs* via external
entities, `document()`/`doc()`, or extension functions if external resolution is left enabled.
- *A script mediator looks sandboxed but runs with the engine's privileges* — it is operator code, not a
security boundary for message data.
- *Content-based routing looks like internal plumbing but can become SSRF* when the route target is
attacker-influenced.

**Well-known attack classes to keep in view:** XXE and XML entity-expansion DoS; SSRF via XSLT `document()`/
external entities and via dynamic endpoint resolution; oversized-message / streaming DoS; injection into a
downstream system via an unsanitized transform; secret exposure through an over-broad transform; XML
signature-wrapping where WS-Security is used (see the CXF/WSS4J model).

## §10 Downstream (integrator/operator) responsibilities

- **Keep external-entity / DTD / `document()` resolution disabled** in message builders and XSLT/XQuery on
untrusted inbound paths; keep message-size/depth limits on.
- **Validate or allow-list** any endpoint address derived from message content (anti-SSRF).
- Configure transport TLS (with cert validation) and WS-Security where the integration requires
authentication/integrity.
- Treat script/XSLT/XQuery mediator bodies as code you own; don't accept them from untrusted sources.
- Protect the secure-vault keystore/password; don't commit plaintext secrets.
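
The allow-list responsibility above could look like the following check, applied to any address derived from message content before it is used as an endpoint. It is a minimal sketch: `EndpointAllowList` is a hypothetical class, the HTTPS-only rule is an assumption of the example rather than a Synapse requirement, and the host names are placeholders.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of Synapse.
final class EndpointAllowList {
    private final Set<String> allowedHosts;

    EndpointAllowList(Set<String> hosts) {
        Set<String> normalised = new HashSet<>();
        for (String host : hosts) {
            normalised.add(host.toLowerCase(Locale.ROOT));
        }
        this.allowedHosts = Collections.unmodifiableSet(normalised);
    }

    /** True only for well-formed https URIs whose host is explicitly allow-listed. */
    boolean isAllowed(String candidate) {
        if (candidate == null) {
            return false;
        }
        final URI uri;
        try {
            uri = new URI(candidate.trim());
        } catch (URISyntaxException e) {
            return false;
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false; // relative, opaque, or host-less URIs such as file:///...
        }
        if (uri.getUserInfo() != null) {
            return false; // rejects the https://trusted@evil.example/ confusion trick
        }
        if (!scheme.equalsIgnoreCase("https")) {
            return false; // assumption of this sketch: only TLS endpoints are acceptable
        }
        return allowedHosts.contains(host.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        EndpointAllowList allow =
            new EndpointAllowList(Collections.singleton("orders.internal.example"));
        System.out.println(allow.isAllowed("https://orders.internal.example:8443/svc"));
        System.out.println(allow.isAllowed("https://169.254.169.254/latest/meta-data"));
    }
}
```

Matching on an exact host name is deliberately strict. It does not address DNS rebinding or HTTP redirects issued by an allowed host, which need separate handling at connection time.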

## §11 Known misuse patterns

- Exposing a proxy with no transport/message security and assuming the ESB "is secure".
- Enabling external-entity / `document()` resolution in XSLT/XQuery over untrusted messages.
- Deriving an endpoint address from message content without validation (SSRF).
- Embedding secrets in plaintext config instead of secure-vault.
- Routing untrusted message content into a script mediator that then executes/concatenates it unsafely.

## §11a Known non-findings (recurring false positives)

*(v0 seed — the PMC will own the authoritative list — §14.)*

- **A script/XSLT/XQuery mediator "executes code"** — operator-authored config (§3/§9); not a finding unless a
*default* path lets an untrusted message reach unsafe resolution.
- **XXE/SSRF reachable only when the operator enabled external resolution** — `OUT-OF-MODEL: non-default-build`
unless the *default* resolves external entities (then `VALID` — wave-1).
- **SSRF via an endpoint address the operator configured statically** — trusted input (§6).
- **Findings in samples / documentation / tests** — out of scope (§3).
- **Use of a weak algorithm explicitly configured** in a WS-Security policy — integration choice.

## §12 Conditions that would change this model

- A change to default XML/transform external-resolution or size-limit posture.
- A new transport, mediator, or default that resolves untrusted references.
- Dynamic endpoint resolution becoming on/permissive by default.
- A change in secure-vault or WS-Security defaults.
- Any report not cleanly routable to a §13 disposition.

## §13 Triage dispositions

| Disposition | Meaning | Licensed by |
| --- | --- | --- |
| `VALID` | Violates a claimed property via an in-scope adversary/input in a default config. | §8, §6, §7 |
| `VALID-HARDENING` | No §8 property broken, but a §11 misuse warrants a safer default/guard. | §11 |
| `OUT-OF-MODEL: trusted-input` | Requires control of the synapse config (script/XSLT/endpoint/secret). | §6, §3 |
| `OUT-OF-MODEL: adversary-not-in-scope` | Requires operator/keystore capability. | §7, §3 |
| `OUT-OF-MODEL: unsupported-component` | Lands in samples/docs/tests. | §3 |
| `OUT-OF-MODEL: non-default-build` | Only when an insecure non-default transform/resolution option was enabled. | §5a |
| `BY-DESIGN: property-disclaimed` | Concerns a §9-disclaimed property (no security without config; scripts are operator code). | §9 |
| `KNOWN-NON-FINDING` | Matches a §11a entry. | §11a |
| `MODEL-GAP` | Routes to none of the above → revise the model. | §12 |
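
The disposition table can be read as a decision procedure. The sketch below encodes it that way for illustration. The table itself does not define a precedence between rows, so the order of checks here (non-findings and out-of-model reasons first, then `VALID`) is an assumption of the example, and `Triage`, `Fact`, and `classify` are hypothetical names.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical encoding of the §13 table; the check order is an assumption.
final class Triage {
    private Triage() {}

    /** Facts a triager has established about a report. */
    enum Fact {
        MATCHES_KNOWN_NON_FINDING,    // §11a
        LANDS_IN_SAMPLES_DOCS_TESTS,  // §3
        NEEDS_CONFIG_CONTROL,         // §6, §3
        NEEDS_OPERATOR_CAPABILITY,    // §7, §3
        NEEDS_NON_DEFAULT_OPTION,     // §5a
        CONCERNS_DISCLAIMED_PROPERTY, // §9
        VIOLATES_CLAIMED_PROPERTY,    // §8
        MATCHES_MISUSE_PATTERN        // §11
    }

    enum Disposition {
        VALID("VALID"),
        VALID_HARDENING("VALID-HARDENING"),
        OUT_OF_MODEL_TRUSTED_INPUT("OUT-OF-MODEL: trusted-input"),
        OUT_OF_MODEL_ADVERSARY_NOT_IN_SCOPE("OUT-OF-MODEL: adversary-not-in-scope"),
        OUT_OF_MODEL_UNSUPPORTED_COMPONENT("OUT-OF-MODEL: unsupported-component"),
        OUT_OF_MODEL_NON_DEFAULT_BUILD("OUT-OF-MODEL: non-default-build"),
        BY_DESIGN_PROPERTY_DISCLAIMED("BY-DESIGN: property-disclaimed"),
        KNOWN_NON_FINDING("KNOWN-NON-FINDING"),
        MODEL_GAP("MODEL-GAP");

        final String label;

        Disposition(String label) {
            this.label = label;
        }
    }

    /** Maps the established facts to one disposition, first match wins. */
    static Disposition classify(Set<Fact> facts) {
        if (facts.contains(Fact.MATCHES_KNOWN_NON_FINDING)) {
            return Disposition.KNOWN_NON_FINDING;
        }
        if (facts.contains(Fact.LANDS_IN_SAMPLES_DOCS_TESTS)) {
            return Disposition.OUT_OF_MODEL_UNSUPPORTED_COMPONENT;
        }
        if (facts.contains(Fact.NEEDS_CONFIG_CONTROL)) {
            return Disposition.OUT_OF_MODEL_TRUSTED_INPUT;
        }
        if (facts.contains(Fact.NEEDS_OPERATOR_CAPABILITY)) {
            return Disposition.OUT_OF_MODEL_ADVERSARY_NOT_IN_SCOPE;
        }
        if (facts.contains(Fact.NEEDS_NON_DEFAULT_OPTION)) {
            return Disposition.OUT_OF_MODEL_NON_DEFAULT_BUILD;
        }
        if (facts.contains(Fact.CONCERNS_DISCLAIMED_PROPERTY)) {
            return Disposition.BY_DESIGN_PROPERTY_DISCLAIMED;
        }
        if (facts.contains(Fact.VIOLATES_CLAIMED_PROPERTY)) {
            return Disposition.VALID;
        }
        if (facts.contains(Fact.MATCHES_MISUSE_PATTERN)) {
            return Disposition.VALID_HARDENING;
        }
        return Disposition.MODEL_GAP; // nothing matched: the model needs revising (§12)
    }

    public static void main(String[] args) {
        Set<Fact> report = EnumSet.of(Fact.NEEDS_CONFIG_CONTROL, Fact.VIOLATES_CLAIMED_PROPERTY);
        System.out.println(classify(report).label);
    }
}
```

For example, an XXE report that only works when the operator supplies a hostile XSLT body classifies as `OUT-OF-MODEL: trusted-input` even though it would otherwise violate a §8 property, and a report matching no row falls through to `MODEL-GAP`.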

## §14 Open questions for the maintainers

**Wave 1: transform/parse defaults (decide VALID-vs-misconfig; §5a/§8)**

1. By default, do the **message builders and XSLT/XQuery mediators disable DTD / external-entity / `document()`
resolution** on untrusted inbound messages, so an XXE/SSRF-via-transform report against defaults is `VALID`?
*Proposed:* external resolution off by default; enabling it is operator opt-in.
2. Are there **default message-size / element-depth / streaming limits** that bound XML/large-message DoS?
*Proposed:* configurable limits; sensible defaults.

**Wave 2: routing & scripts (§4/§9)**

3. Is **dynamic / content-based endpoint resolution** something an untrusted message can influence by default,
and is the resolved address validated/allow-listed? *Proposed:* static endpoints are the norm; dynamic
resolution is opt-in and the integration validates it (SSRF = integration responsibility).
4. Confirm **script / XSLT / XQuery mediator bodies are trusted config** (operator-authored), so "code
execution in a mediator" is `OUT-OF-MODEL: trusted-input` rather than a framework finding. *Proposed:* yes.

**Wave 3: secrets, WS-Security, §11a (§8/§11a)**

5. How does **secure-vault** protect secrets, and what does Synapse claim about secret exposure through
transforms/logging? *Proposed:* encrypted at rest; avoid logging secrets.
6. What do scanners most often (re)report here that the PMC considers a **non-finding**? (Seeds §11a.)

**Meta:**

7. Confirm this model lives as root `THREAT_MODEL.md` referenced from a new `SECURITY.md`. *Proposed:* yes.

## §15 Machine-readable companion

Deferred for v0; a `threat-model.yaml` can later encode the §6 trust table, §2/§3 scoping, §8 rows, §9 false
friends, §11a non-findings, and §13 dispositions.