feat(otel): client-side redaction of PII and/or secrets by simonvdk-mistral · Pull Request #584 · mistralai/client-python

simonvdk-mistral · 2026-07-02T12:35:37Z

What

Adds a client-side redaction layer for OpenTelemetry spans so PII and secrets never leave the machine. The core primitive is reusable by any OTEL application, and the Mistral SDK installs it automatically when it owns the exporter.

Also adds documentation and examples for the SDK's observability features.

The primitive

RedactingSpanExporter wraps any SpanExporter and redacts each span before delegating the actual export:

from mistralai.extra.observability.redaction import RedactingSpanExporter

exporter = RedactingSpanExporter(OTLPSpanExporter(...))          # default policy
provider.add_span_processor(BatchSpanProcessor(exporter))

Redaction covers the whole span surface: attributes, events, links, resource attributes, span name, and status description.
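
To make the decorator shape concrete, here is a minimal sketch that does not depend on OpenTelemetry. `FakeSpan`, `ListExporter`, `RedactingExporterSketch`, and `mask_emails` are hypothetical stand-ins, not the classes from this PR. The real wrapper works on finished OpenTelemetry spans and also covers events, links, resource attributes, and status description, which are left out here.

```python
import re
from dataclasses import dataclass, field, replace
from typing import Callable, List, Sequence

# Illustrative email pattern only; not the pattern set used by the PR.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass(frozen=True)
class FakeSpan:
    """Hypothetical stand-in for an immutable, finished span."""

    name: str
    attributes: dict = field(default_factory=dict)


class ListExporter:
    """Hypothetical inner exporter that records what it receives."""

    def __init__(self) -> None:
        self.exported: List[FakeSpan] = []

    def export(self, spans: Sequence[FakeSpan]) -> None:
        self.exported.extend(spans)


def mask_emails(text: str) -> str:
    return EMAIL.sub("[REDACTED]", text)


class RedactingExporterSketch:
    """Rebuilds each span with masked fields, then delegates the export."""

    def __init__(self, inner: ListExporter, mask: Callable[[str], str] = mask_emails) -> None:
        self._inner = inner
        self._mask = mask

    def _redact(self, span: FakeSpan) -> FakeSpan:
        # Spans are immutable, so build a redacted copy instead of mutating.
        attrs = {
            key: self._mask(value) if isinstance(value, str) else value
            for key, value in span.attributes.items()
        }
        return replace(span, name=self._mask(span.name), attributes=attrs)

    def export(self, spans: Sequence[FakeSpan]) -> None:
        # The inner exporter only ever sees the redacted copies.
        self._inner.export([self._redact(span) for span in spans])


inner = ListExporter()
wrapper = RedactingExporterSketch(inner)
wrapper.export([FakeSpan("chat", {"prompt": "mail bob@example.com", "tokens": 12})])
print(inner.exported[0].attributes)
```

Because the inner exporter is only reachable through the wrapper, nothing is handed to it without first going through `_redact`.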

The module requires the optional OpenTelemetry SDK (the telemetry extra) to run, but not to import — so it stays importable in environments without the extra.

Why an exporter wrapper (and not span-creation-time redaction)

Redaction is deliberately placed in a SpanExporter decorator, invoked at export time, rather than when spans/attributes are created. In dedicated mode the SDK wires:

BatchSpanProcessor(RedactingSpanExporter(OTLPSpanExporter(...)))

This has real, concrete benefits:

Off the request hot path. With BatchSpanProcessor (what the SDK installs), export runs on a dedicated background thread, in batches. The regex scanning and span rebuild happen there — not on the threads serving the user's chat/embeddings calls — so redaction adds no latency to application requests. Cost is amortized across a batch and absorbed by the exporter thread; under extreme load it manifests as export backpressure, never as slower API calls.
Last line of defense. Sitting at the very edge, right before bytes leave the process, it sees every span regardless of which instrumentation produced it. Nothing is exported without passing through redaction (fail-closed).
Composable and reusable. It's a plain SpanExporter decorator with no Mistral coupling, so any OpenTelemetry application can wrap its own exporter — which is exactly what users do in global/custom-provider mode.

Caveat: the "off the hot path" property depends on an asynchronous processor. BatchSpanProcessor (installed by the SDK) gives it; a SimpleSpanProcessor would export — and therefore redact — synchronously on span end.
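
The caveat can be illustrated with a toy model of the two processor styles. This is only a sketch: `SimpleProcessorSketch`, `BatchProcessorSketch`, and `RecordingExporter` are hypothetical classes written for this example, and the toy batch processor forwards spans one at a time rather than doing real batching. It only shows which thread ends up running the export, and therefore the redaction.

```python
import queue
import threading


class RecordingExporter:
    """Hypothetical exporter that records which thread ran each export."""

    def __init__(self):
        self.threads = []

    def export(self, spans):
        # Regex scanning and span rebuild would happen here.
        self.threads.append(threading.current_thread().name)


class SimpleProcessorSketch:
    """Exports synchronously when a span ends: the caller pays the cost."""

    def __init__(self, exporter):
        self._exporter = exporter

    def on_end(self, span):
        self._exporter.export([span])


class BatchProcessorSketch:
    """Hands spans to a background worker; the caller only enqueues."""

    def __init__(self, exporter):
        self._exporter = exporter
        self._queue = queue.Queue()
        self._worker = threading.Thread(
            target=self._run, name="export-worker", daemon=True
        )
        self._worker.start()

    def on_end(self, span):
        self._queue.put(span)  # cheap enqueue on the caller's thread

    def _run(self):
        while True:
            span = self._queue.get()
            if span is None:  # sentinel pushed by shutdown()
                return
            self._exporter.export([span])

    def shutdown(self):
        self._queue.put(None)
        self._worker.join()


simple_exporter, batch_exporter = RecordingExporter(), RecordingExporter()

SimpleProcessorSketch(simple_exporter).on_end("span-1")

batch = BatchProcessorSketch(batch_exporter)
batch.on_end("span-2")
batch.shutdown()  # wait for the worker to drain the queue

print("simple:", simple_exporter.threads)
print("batch:", batch_exporter.threads)
```

In the synchronous case the export runs on whichever thread ended the span; in the queued case it runs on the worker thread.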

Policies

| Policy | Strategy | Trade-off |
| --- | --- | --- |
| `RegexRedactionPolicy` (default, `redaction=True`) | Content-oriented: keeps keys and structure, redacts matched substrings (secret tokens plus PII: emails, card-like sequences, IPv4). | Redacts most sensitive data while preserving observability value; may miss free-form PII or secrets not in the pattern set. |
| `AttributeRedactionPolicy` | Key-oriented: redacts whole values for sensitive keys (explicit set, fragment match, or non-primitive value), then scans kept values for secret token patterns. | Very conservative, but erases most prompt/response content. |
| `CallbackRedactionPolicy` (`redaction=<callable>`) | Your `(key, value) -> value \| None` masker per attribute; return `None` to drop the attribute. | Full control; you own the logic. |
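
As a rough illustration of the content-oriented strategy, the sketch below keeps keys and structure and replaces only matched substrings. The three expressions and the `[REDACTED]` placeholder are assumptions chosen for this example; the PR's actual pattern set and replacement marker are not shown in this description.

```python
import re

PLACEHOLDER = "[REDACTED]"  # assumed marker, not necessarily the PR's

# Illustrative patterns only.
PATTERNS_SKETCH = (
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),   # email
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),    # IPv4-like
    re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),       # card-like digit run
)


def redact_text(text):
    """Replace every matched substring, leaving the rest of the text intact."""
    for pattern in PATTERNS_SKETCH:
        text = pattern.sub(PLACEHOLDER, text)
    return text


def redact_attributes(attributes):
    """Content-oriented: every key is kept, only string content is scanned."""
    return {
        key: redact_text(value) if isinstance(value, str) else value
        for key, value in attributes.items()
    }


print(redact_attributes({
    "prompt": "reach alice@example.com from 10.0.0.1",
    "max_tokens": 256,
}))
```

The trade-off in the table shows up directly: the surrounding prompt text survives, but anything that does not match a pattern (a name, a street address) passes through untouched.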

SDK integration

configure_telemetry gains a redaction argument:

True (default) — default (regex) policy
False — redaction disabled
a RedactionPolicy instance (e.g. AttributeRedactionPolicy())
a (key, value) -> value | None callback

Redaction only applies in dedicated provider mode, where the SDK owns the exporter. In global/custom-provider modes the application owns the export pipeline, so the argument is ignored and a warning is logged — wrap your own exporter with RedactingSpanExporter in that case.
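
For the callback form, a sketch of what a `(key, value) -> value | None` masker might look like. `mask_attribute` and `apply_callback` are hypothetical names, and the key rules are invented for illustration; `apply_callback` only models the "return `None` to drop the attribute" semantics described above and is not the SDK's code.

```python
from typing import Any, Callable, Dict, Optional


def mask_attribute(key: str, value: Any) -> Optional[Any]:
    """Hypothetical masker: drop user.* attributes, mask secret-looking keys."""
    if key.startswith("user."):
        return None  # None means "drop this attribute"
    if isinstance(value, str) and "secret" in key.lower():
        return "[REDACTED]"
    return value


def apply_callback(
    attributes: Dict[str, Any],
    callback: Callable[[str, Any], Optional[Any]],
) -> Dict[str, Any]:
    """Assumed semantics: None removes the key, any other result replaces the value."""
    result = {}
    for key, value in attributes.items():
        masked = callback(key, value)
        if masked is not None:
            result[key] = masked
    return result


print(apply_callback(
    {"user.email": "a@b.c", "client_secret": "xyz", "model": "m"},
    mask_attribute,
))
```

Per the description, such a callable would be passed as `configure_telemetry(client, redaction=mask_attribute)` in dedicated provider mode.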

Tests

New test_redaction.py covering the policies, span rebuild, and the exporter wrapper.
Extended test_telemetry.py for the redaction wiring and the ignored-argument warnings.

Fold AWS, Google, JWT, PEM and Stripe key patterns into DEFAULT_TOKEN_PATTERNS and compose DEFAULT_PII_SECRET_PATTERNS from it, with tests covering each secret and the composition invariant. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
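
A sketch of what "compose one pattern set from the other" and a composition invariant could look like. The names with a `_SKETCH` suffix and the expressions are assumptions for illustration; the actual contents of `DEFAULT_TOKEN_PATTERNS` and `DEFAULT_PII_SECRET_PATTERNS` are not shown in this thread.

```python
import re

# Illustrative token shapes only.
TOKEN_PATTERNS_SKETCH = (
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),           # AWS access key ID shape
    re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"),    # JWT-like three-part token
)

PII_PATTERNS_SKETCH = (
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),   # email
)

# Compose rather than duplicate, so a newly added token pattern is
# picked up by the combined set automatically.
PII_SECRET_PATTERNS_SKETCH = TOKEN_PATTERNS_SKETCH + PII_PATTERNS_SKETCH


def check_composition_invariant():
    """Every token pattern must also be part of the combined set."""
    return set(TOKEN_PATTERNS_SKETCH) <= set(PII_SECRET_PATTERNS_SKETCH)


def matches_any(text, patterns):
    return any(pattern.search(text) for pattern in patterns)


print(check_composition_invariant())
```

A test suite can then parametrize over one fake sample per secret type and assert both `matches_any(sample, ...)` and the invariant.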

Use a conditional base (SpanExporter under TYPE_CHECKING, object at runtime) so linters verify the export/shutdown/force_flush overrides while keeping the OpenTelemetry SDK an optional import. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
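
The conditional-base idea can be sketched as follows. `WrapperSketch` is a hypothetical class, not the PR's `RedactingSpanExporter`; the point is only that the OpenTelemetry import sits under `TYPE_CHECKING`, so the module imports and runs even when the SDK is not installed.

```python
from typing import TYPE_CHECKING, Any, Sequence

if TYPE_CHECKING:
    # Seen only by type checkers, which can then verify the overrides
    # against the real base class.
    from opentelemetry.sdk.trace.export import SpanExporter as _ExporterBase
else:
    # At runtime there is no hard dependency on the OpenTelemetry SDK.
    _ExporterBase = object


class WrapperSketch(_ExporterBase):
    """Hypothetical wrapper that delegates to any exporter-like object."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner

    def export(self, spans: Sequence[Any]) -> Any:
        return self._inner.export(spans)

    def shutdown(self) -> None:
        self._inner.shutdown()

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        return self._inner.force_flush(timeout_millis)


print(WrapperSketch.__bases__)
```

One consequence of this design is that `isinstance(wrapper, SpanExporter)` is false at runtime, so it relies on consumers duck-typing the exporter interface.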

Add a shared _redact_value helper so both policies scan the string elements of list/tuple attribute values instead of passing them through verbatim, preserving the container type and leaving numeric/bool sequences untouched. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
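
A minimal sketch of the behaviour this commit describes, assuming a single hypothetical secret shape. `redact_value_sketch` and `SECRET_SKETCH` are illustrative names; the real `_redact_value` helper is not shown in this thread.

```python
import re

# Hypothetical token shape used only for this example.
SECRET_SKETCH = re.compile(r"\bsk-[A-Za-z0-9]{8,}\b")


def scan(text):
    return SECRET_SKETCH.sub("[REDACTED]", text)


def redact_value_sketch(value):
    """Scan strings, including string elements of lists and tuples."""
    if isinstance(value, str):
        return scan(value)
    if isinstance(value, (list, tuple)):
        # Rebuild with the same container type; non-string elements
        # (numbers, booleans) are passed through unchanged.
        return type(value)(
            scan(item) if isinstance(item, str) else item for item in value
        )
    return value


print(redact_value_sketch(["token sk-xxxxxxxxxxxx", "plain"]))
print(redact_value_sketch((1, 2.5, True)))
```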

Rewrite test_redaction.py as pytest classes with fixtures and parametrization, and convert the TestTelemetryRedaction class in test_telemetry.py to a plain pytest class (using caplog for log assertions). Optional span attributes/context are narrowed with asserts instead of file-level pyright suppressions. Older telemetry tests are left as-is. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>

gitguardian-eu · 2026-07-02T15:03:43Z

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positives and are still valid, we highly recommend that you revoke them.
While these secrets were previously flagged, we no longer have a reference to the specific commits where they were detected. Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

simonvdk-mistral · 2026-07-02T15:04:51Z

⚠️ GitGuardian has uncovered 2 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secrets in your pull request

| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 289842 | Triggered | Shopify Generic App Token | e8f0122 | src/mistralai/extra/tests/test_redaction.py |
| 289840 | Triggered | PostHog Project API Key | e8f0122 | src/mistralai/extra/tests/test_redaction.py |

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.

Replace and store your secrets safely. Learn here the best practices.

Revoke and rotate these secrets.

If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials

install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

These are all fake/dummy secrets to test redaction

Switch default_redaction_policy() from the key-oriented AttributeRedactionPolicy to the content-oriented RegexRedactionPolicy, which preserves keys/structure and redacts only matched secret/PII substrings. Update docstrings, README policy table, the example, and adapt the behavioural tests accordingly. Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>

th-ch · 2026-07-03T15:05:30Z

+_PRIMITIVE_TYPES: Final[tuple[type, ...]] = (str, bool, int, float)
+
+
+class AttributeRedactionPolicy(RedactionPolicy):


nit: worth separating the implementations over multiple files?

th-ch · 2026-07-03T15:38:04Z

+from mistralai.extra.observability import AttributeRedactionPolicy, configure_telemetry
+
+
+def main() -> None:
+    api_key = os.environ["MISTRAL_API_KEY"]
+
+    with Mistral(api_key=api_key) as client:
+        configure_telemetry(client, redaction=AttributeRedactionPolicy())


In the various examples, worth showing how to extend the defaults and provide an example where an attribute will be redacted? e.g.

Suggested change

-from mistralai.extra.observability import AttributeRedactionPolicy, configure_telemetry
-
-
-def main() -> None:
-    api_key = os.environ["MISTRAL_API_KEY"]
-
-    with Mistral(api_key=api_key) as client:
-        configure_telemetry(client, redaction=AttributeRedactionPolicy())
+from mistralai.extra.observability import (
+    AttributeRedactionPolicy,
+    configure_telemetry,
+)
+from mistralai.extra.observability.redaction import DEFAULT_SENSITIVE_ATTRIBUTE_KEYS
+
+
+def main() -> None:
+    api_key = os.environ["MISTRAL_API_KEY"]
+    server_url = os.environ.get("MISTRAL_SERVER_URL")
+
+    with Mistral(api_key=api_key, server_url=server_url) as client:
+        configure_telemetry(
+            client,
+            redaction=AttributeRedactionPolicy(
+                sensitive_keys=DEFAULT_SENSITIVE_ATTRIBUTE_KEYS
+                | {"telemetry.sdk.language"}
+            ),
+        )
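
To show why extending the key set changes the outcome, here is a rough sketch of the key-oriented strategy from the policy table. The default keys and fragments below are invented for illustration and `redact_by_key` is a hypothetical function; only the primitive types tuple mirrors the `_PRIMITIVE_TYPES` line quoted in the diff above. The follow-up pass that scans kept values for token patterns is omitted.

```python
# Hypothetical defaults, not the PR's DEFAULT_SENSITIVE_ATTRIBUTE_KEYS.
SENSITIVE_KEYS_SKETCH = frozenset({"authorization", "api_key"})
SENSITIVE_FRAGMENTS_SKETCH = ("password", "secret", "token")
PRIMITIVE_TYPES = (str, bool, int, float)


def redact_by_key(attributes, sensitive_keys=SENSITIVE_KEYS_SKETCH):
    """Redact whole values for sensitive keys; keep everything else."""
    result = {}
    for key, value in attributes.items():
        lowered = key.lower()
        sensitive = (
            key in sensitive_keys                                      # explicit set
            or any(frag in lowered for frag in SENSITIVE_FRAGMENTS_SKETCH)  # fragment match
            or not isinstance(value, PRIMITIVE_TYPES)                  # non-primitive value
        )
        result[key] = "[REDACTED]" if sensitive else value
    return result


attrs = {"telemetry.sdk.language": "python", "model": "m", "db_password": "x"}

# With the defaults, the SDK language attribute is kept.
print(redact_by_key(attrs))

# Extending the set, as in the suggestion, redacts it.
extended = SENSITIVE_KEYS_SKETCH | {"telemetry.sdk.language"}
print(redact_by_key(attrs, extended))
```

Note that fragment matching is broad: in this sketch any key containing "token" is redacted, including count-style keys, which is part of why the key-oriented policy is described as very conservative.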

rbarbadillo

generally lgtm! just a small edge-case to cover

rbarbadillo · 2026-07-03T18:58:21Z

    if hook._auto_telemetry_provider is not None:
        return True


Should this early return take replace_existing into account?

Suggested change

    if hook._auto_telemetry_provider is not None:
        if not replace_existing:
            return True
        _shutdown_telemetry_provider(hook)

I think an app can reasonably do:

    configure_telemetry(client)
    # ... some beautiful code ...
    configure_telemetry(client, redaction=False)

but the second call keeps the first RedactingSpanExporter because we return here before rebuilding the provider. Maybe when replace_existing=True, we should shut down and recreate the auto provider so the new redaction setting actually applies.
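
The edge case can be reproduced with a toy model. `HookSketch` and `configure_sketch` are hypothetical stand-ins for the SDK hook and `configure_telemetry`; a plain dict stands in for the provider, and the `replace_existing=False` default is an assumption made for this example, not a statement about the SDK's signature.

```python
class HookSketch:
    """Hypothetical stand-in for the SDK hook that owns the auto provider."""

    def __init__(self):
        self._auto_telemetry_provider = None


def configure_sketch(hook, redaction=True, replace_existing=False):
    """Models the suggested control flow, not the real configure_telemetry."""
    if hook._auto_telemetry_provider is not None:
        if not replace_existing:
            return True  # keep the existing provider and its redaction setting
        # Stands in for _shutdown_telemetry_provider(hook).
        hook._auto_telemetry_provider = None
    # Stands in for building a new provider with the requested redaction.
    hook._auto_telemetry_provider = {"redaction": redaction}
    return True


hook = HookSketch()
configure_sketch(hook)                   # first call: redaction enabled
configure_sketch(hook, redaction=False)  # ignored: provider already exists
print(hook._auto_telemetry_provider)

configure_sketch(hook, redaction=False, replace_existing=True)
print(hook._auto_telemetry_provider)     # rebuilt with the new setting
```

Without the `replace_existing` branch, the second call in the scenario above would silently keep the first configuration.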

simonvdk-mistral and others added 11 commits July 1, 2026 17:59

feat: first shot

e090ffd

fix: tweaks

7e82903

fix: resolve redaction and policy changes

e0e910c

fix: tweaks

2dfee80

fix: additional sensitive keys

5adab3b

docs: add observability docs

5588e01

feat: more secret patterns

e8f0122

simonvdk-mistral requested a review from a team July 2, 2026 17:23

simonvdk-mistral marked this pull request as ready for review July 2, 2026 17:24

th-ch reviewed Jul 3, 2026

rbarbadillo reviewed Jul 3, 2026

simonvdk-mistral wants to merge 12 commits into main from svdk/feat/client_side_masking

3 participants
