Multi-tenant Buzz relay: community_id as a server-resolved key (comprehensive rewrite)#1321
Draft
tlongwell-block wants to merge 75 commits into
…fence
buzz-core gets the zero-I/O tenant identity types every scoped layer shares. TenantContext encodes conformance row-zero in the type system: no Default, no Deserialize, no public constructor except resolved(), which is meant to be called only from host resolution. Downstream code holds &TenantContext and can read but not mint a community, so client-chosen-community cannot type-check outside resolution.
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
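A minimal sketch of the fence described in the commit above, assuming a `u128` stand-in for the UUID-backed community id. `from_raw`, `as_raw`, and `describe` are hypothetical names used only for illustration; `resolved()`, `community()`, and the "no Default, no Deserialize" rule come from the commit text. Field privacy is what keeps downstream modules from building or mutating a context directly.

```rust
pub mod tenant {
    /// Stand-in for the server-resolved community id (the real type wraps a UUID).
    #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
    pub struct CommunityId(u128);

    impl CommunityId {
        /// Hypothetical constructor, for this sketch only.
        pub fn from_raw(raw: u128) -> Self {
            CommunityId(raw)
        }

        /// Hypothetical accessor, for this sketch only.
        pub fn as_raw(self) -> u128 {
            self.0
        }
    }

    /// Deliberately no `Default` and no deserialization: the only way in is `resolved()`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct TenantContext {
        community: CommunityId, // private: other modules cannot set it
        host: String,           // private: other modules cannot set it
    }

    impl TenantContext {
        /// Meant to be called only from host resolution.
        pub fn resolved(community: CommunityId, host: impl Into<String>) -> Self {
            TenantContext { community, host: host.into() }
        }

        pub fn community(&self) -> CommunityId {
            self.community
        }

        pub fn host(&self) -> &str {
            &self.host
        }
    }
}

/// Downstream code borrows the context: it can read the community but has no setter.
pub fn describe(ctx: &tenant::TenantContext) -> String {
    format!("{}#{}", ctx.host(), ctx.community().as_raw())
}
```

Because `resolved()` is itself public, this is a convention that lint and review must police rather than a guarantee the compiler gives, which matches the later correction in this PR.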
The frozen base for the multi-tenant rewrite. Consolidated 0001 schema makes community_id a first-class, server-resolved key on every scoped row, mapped table-by-table to docs/multi-tenant-conformance.md.
Schema highlights:
- channels PK is (community_id, id): the same channel UUID may legitimately co-exist in two communities; child FKs (channel_members, workflows, thread_metadata) are composite (community_id, channel_id) so a child can never reference a cross-community channel. This is DB-enforced, not left to handler discipline. channels.community_id is immutable (BEFORE UPDATE trigger).
- communities.host uniqueness is UNIQUE(lower(host)); normalize_host applies the same rule on the resolution side, so case/dot/default-port variants can never split one tenant into two.
- every scoped unique/PK leads with community_id; the same signed event may be stored in two communities, while a duplicate within one community is rejected.
- new tables: communities (host map), scheduled_workflow_fires (the cron at-most-once claim), audit_log (per-community chain), and an explicit _operator_global_tables registry the migration lint reads.
buzz-core:
- normalize_host(host): the one shared host-canonicalization rule.
- TenantContext fence doc corrected to say plainly it is a lint-and-review fence, not a compiler fence (resolved()/from_uuid are pub), which is honest about the guarantee the API actually gives.
Schema proven against Postgres with an adversarial fence suite (re-tenant rejected, cross-community FKs rejected, same-UUID/same-event cross-community allowed, host-case collision rejected). buzz-core: 189 tests + 2 doctests green.
Folds in review round 1 from Mari (channel global-uniqueness leak, host normalization, fence-claim honesty) and Sami (NIP-98 localhost normalization to be dropped in the auth lane).
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
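To make the "case/dot/default-port" rule from the commit above concrete, here is a hedged sketch of what a host canonicalizer could look like. The real `normalize_host` body is not shown in this PR; treating 80 and 443 as the default ports, and the exact order of the steps, are assumptions of this example.

```rust
/// Illustrative host canonicalization: lowercase, drop a default port, drop a trailing dot.
/// Assumption: ":80" and ":443" are the default ports; IPv6 literals arrive bracketed.
pub fn normalize_host(host: &str) -> String {
    let lower = host.trim().to_ascii_lowercase();
    // Strip an explicit default port so "example.com:443" and "example.com" agree.
    let without_port = lower
        .strip_suffix(":443")
        .or_else(|| lower.strip_suffix(":80"))
        .unwrap_or(lower.as_str());
    // Strip the fully-qualified trailing dot so "example.com." and "example.com" agree.
    without_port
        .strip_suffix('.')
        .unwrap_or(without_port)
        .to_string()
}
```

The point of sharing one such function is that the lookup key and the `UNIQUE(lower(host))` constraint cannot disagree about which spellings name the same tenant.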
Closes the last Lane-0 schema items before the frozen base:
- events.search_tsv TSVECTOR GENERATED ALWAYS AS (to_tsvector('simple',
content)) STORED + GIN idx_events_search_tsv. The Typesense->Postgres FTS
data shape, landed in Lane 0 because it touches the just-locked events
table (Quinn option A). GENERATED ALWAYS = single source of truth: proven
against PG that a client cannot forge search_tsv out of sync with content
(generated_always rejection). Index left minimal single-column GIN; the
search lane picks the final spelling after EXPLAIN (Max's caveat).
- Delete stale 0002_backfill_d_tag.sql / 0003_event_reminders.sql. In the
consolidated-from-scratch model 0001 already carries d_tag, not_before,
delivered_at, and idx_events_not_before; re-running the old additive
migrations would error (duplicate column / duplicate index name).
audit_log DDL shape confirmed for the audit-crate collapse (Dawn's lane):
PRIMARY KEY (community_id, seq), UNIQUE (community_id, hash), community_id
NOT NULL on every row. 0001 is the single source; buzz-audit drops its own
schema.rs / AUDIT_SCHEMA_SQL / ensure_schema() in the audit lane.
Re-proven against real Postgres — full fence suite green: T1 re-tenant
rejected, T6 cross-community member FK rejected, T6b same-community ok, T7
same channel UUID in two communities allowed, T8 host case-collision
rejected, T9 same event id in two communities allowed, plus the FTS
generated+GIN match and the forge-rejection. buzz-core: 189 + 2 doctests.
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Add EventQuery::for_community so relay call sites can keep concise struct updates without restoring a tenantless Default. The constructor requires the server-resolved CommunityId and preserves the old optional filter defaults everywhere else.
Return the owning community host from the ephemeral-channel reaper by joining communities in the archive UPDATE. Reaper consumers can now build TenantContext per archived row from DB-resolved community+host instead of hoisting or forging a batch-level tenant.
Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
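The struct-update pattern that `EventQuery::for_community` enables can be sketched as follows. Only `EventQuery`, `for_community`, and `CommunityId` are named in the commit; the `kinds` and `limit` fields and the `kind_39000_query` helper are hypothetical, and `u128` stands in for the UUID.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommunityId(pub u128); // stand-in for the UUID newtype

/// Deliberately no `Default`: a query cannot be built without naming a community.
#[derive(Debug, Clone, PartialEq)]
pub struct EventQuery {
    pub community_id: CommunityId,
    pub kinds: Option<Vec<u32>>, // hypothetical optional filter
    pub limit: Option<i64>,      // hypothetical optional filter
}

impl EventQuery {
    /// Every optional filter starts unset; only the community is required.
    pub fn for_community(community_id: CommunityId) -> Self {
        EventQuery { community_id, kinds: None, limit: None }
    }
}

/// Call sites keep the concise `..` form, seeded by the scoped constructor.
pub fn kind_39000_query(community_id: CommunityId) -> EventQuery {
    EventQuery {
        kinds: Some(vec![39000]),
        limit: Some(1),
        ..EventQuery::for_community(community_id)
    }
}
```

The design choice is that the "rest of the fields" source is itself tenant-scoped, so the ergonomic form and the safe form are the same expression.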
RateLimiter::check_and_increment now takes &TenantContext, and
rate_limit_key emits buzz:{community}:ratelimit:{pubkey_hex}:{suffix}.
Same pubkey active in two communities consumes two independent quotas,
matching the S1 cross-community isolation fence in the buzz-relay
rewrite spec.
check_ip_connection stays operator-global by design. The IP fence runs
at connection acceptance, before host->community resolution has
completed (or, on resolve failure, instead of it). Threading
&TenantContext through it would invert the order of operations. Per-
(community, IP) caps, if ever needed as a tenant-fairness signal,
belong in an additive LimitType keyed on (community, ip) — not in this
trait.
RedisRateLimiter in buzz-pubsub follows the new trait signature.
AlwaysAllowRateLimiter test impl mirrors it. Two new tests pin the
behavior: the key includes the community prefix, and same-pubkey-two-
communities yields two distinct Redis keys.
Local cargo test -p buzz-auth: 36 passed. Local cargo test -p
buzz-pubsub: 3 passed, 6 Redis-required ignored. Workspace-wide check
not run locally (sqlx 0.9.0 requires rustc 1.94, local toolchain is
1.89 — same constraint Max hit on the pubsub lane); relying on CI for
the full integration compile.
(cherry picked from commit 6a92f0b)
Co-authored-by: Sami <f4a42a97e594b77bdbd8ee35191c8b28a94a4cb871d96f32921558275421fb68@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
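The key shape from the rate-limit commit above, `buzz:{community}:ratelimit:{pubkey_hex}:{suffix}`, can be illustrated with a small formatter. This is a sketch only: the real `rate_limit_key` takes a `&TenantContext`, while this version takes plain strings, and the suffix value used in the test is made up.

```rust
/// Builds a per-community rate-limit key in the shape named in the commit message.
/// Assumption: all three components are already lowercase strings.
pub fn rate_limit_key(community: &str, pubkey_hex: &str, suffix: &str) -> String {
    format!("buzz:{community}:ratelimit:{pubkey_hex}:{suffix}")
}
```

Because the community leads the key, the same pubkey in two communities maps to two independent counters.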
Adds the §5 pre-build gate for multi-tenant replay protection.
buzz-auth gains a Nip98ReplayGuard trait plus the
nip98_replay_key(ctx, event_id) helper. The trait's try_mark contract
requires atomic set-if-absent semantics; an in-process cache (moka,
DashMap) does not carry the freshness proof across pods under the
"any pod, any connection" architecture (§4B), so the production
implementation MUST be shared state. The Redis-backed impl lives in
buzz-pubsub as RedisNip98ReplayGuard and uses a single SET key 1 NX
EX <ttl> per claim.
Key shape: buzz:{community}:nip98:{event_id_hex}. Event ids are
content-addressed so natural cross-community collision is zero, but
the gate is fail-closed isolation — a same-id replay across
communities must consult two distinct seen-set rows, not one shared
row. Tests pin both the prefix and the cross-community isolation
guarantee.
TTL floor is DEFAULT_REPLAY_TTL_SECS = 120, matching the §5 gate
requirement and the doubled NIP-98 ±60s timestamp tolerance.
Implementations MAY clamp sub-floor TTLs up to the floor; they MUST
NOT honor smaller values. The Redis impl clamps.
Caller contract documented in the trait: verify first, then mark.
Burning a seen-set slot on a forgery would let an attacker who learns
a future event id DoS the legitimate event. On Err (Redis
unreachable) callers MUST fail closed.
Not wired into a call site in this commit — there is no NIP-98 HTTP
handler in Lane 0 yet. Eva's relay-wiring lane will consume the trait
when the HTTP path lands; the contract is documented for that
integration.
Validation:
- cargo test -p buzz-auth --lib ✅ 40 passed (4 new in nip98_replay).
- cargo test -p buzz-pubsub --lib ✅ 3 passed, 9 Redis-required
ignored (3 new in nip98_replay).
- cargo test -p buzz-pubsub --lib nip98_replay -- --ignored against
local Redis ✅ 3 passed: first-claim/replay, cross-community
isolation, sub-floor TTL lifted to floor.
- Workspace check not run locally (sqlx 0.9.0 / rustc 1.94 vs local
1.89); CI catches it.
(cherry picked from commit a2a9ef4)
Co-authored-by: Sami <f4a42a97e594b77bdbd8ee35191c8b28a94a4cb871d96f32921558275421fb68@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
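The set-if-absent contract of the replay guard above can be sketched with an in-memory stand-in. The `try_mark` signature here is an assumption (the PR does not show it), and `InMemoryReplayGuard` is hypothetical: as the commit explains, an in-process map is not suitable for production because it does not share state across pods. It only illustrates the semantics a shared `SET key 1 NX EX <ttl>` provides.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

pub const DEFAULT_REPLAY_TTL_SECS: u64 = 120;

/// Key shape from the commit message: buzz:{community}:nip98:{event_id_hex}.
pub fn nip98_replay_key(community: &str, event_id_hex: &str) -> String {
    format!("buzz:{community}:nip98:{event_id_hex}")
}

pub trait Nip98ReplayGuard {
    /// Ok(true): first claim. Ok(false): replay. Err: backend failure, callers fail closed.
    /// Callers are expected to verify the event before marking it.
    fn try_mark(&self, community: &str, event_id_hex: &str, ttl_secs: u64) -> Result<bool, String>;
}

/// Illustration only: single-process state, not a substitute for the shared Redis guard.
#[derive(Default)]
pub struct InMemoryReplayGuard {
    seen: Mutex<HashMap<String, Instant>>,
}

impl Nip98ReplayGuard for InMemoryReplayGuard {
    fn try_mark(&self, community: &str, event_id_hex: &str, ttl_secs: u64) -> Result<bool, String> {
        // Sub-floor TTLs are lifted to the floor, never honored as given.
        let ttl = ttl_secs.max(DEFAULT_REPLAY_TTL_SECS);
        let key = nip98_replay_key(community, event_id_hex);
        let now = Instant::now();
        let expiry = now
            .checked_add(Duration::from_secs(ttl))
            .ok_or_else(|| "ttl out of range".to_string())?;
        let mut seen = self.seen.lock().map_err(|e| e.to_string())?;
        // An unexpired entry means this (community, event id) was already claimed.
        let live = seen.get(&key).map_or(false, |until| *until > now);
        if live {
            return Ok(false);
        }
        seen.insert(key, expiry);
        Ok(true)
    }
}
```

The same event id claimed under a second community succeeds because the community is part of the key, which is the isolation property the commit's tests pin.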
…acing
Red-team pass against the auth lane surfaced one real bug and two robustness gaps. All three are caught by tests; the bug was verified by temporarily reverting the fix and watching the test fail with the real Redis error.
1. Real bug: a caller passing ttl_secs > i64::MAX (e.g. u64::MAX from a config bug) caused Redis to return "ResponseError: value is not an integer or out of range" from `SET NX EX <ttl>`. RedisNip98ReplayGuard then returned Err, the trait contract forces callers to fail closed, and every NIP-98-gated request from that point would have errored with no visible link back to the bad config. Fix: introduce MAX_REPLAY_TTL_SECS (1 hour, 30× the natural physical maximum, well inside i64::MAX) and clamp ttl_secs into [DEFAULT, MAX] before the SET. New ignored-Redis test `above_ceiling_ttl_is_clamped` exercises the path with u64::MAX and asserts the claim+replay sequence succeeds, which it only does with the clamp.
2. Robustness: pin "all rate-limit and replay key components are lowercase ASCII" as a unit-level invariant. If pubkey::to_hex, Uuid::Display, or LimitType::key_suffix ever started emitting uppercase, the same logical (community, pubkey/event_id) would map to two distinct Redis keys, silently doubling the rate-limit quota or breaking the seen-set's identity. Two new tests (`rate_limit_key_components_are_lowercase`, `key_components_are_lowercase`) catch the regression in CI rather than production.
3. Robustness: structured tracing on every Redis failure path with `community = %ctx.community()` as a structured field, so ops can group log alerts by tenant without needing the community id to be embedded in the AuthError string. The user-facing AuthError::Internal payload stays the existing convention (consistent with rate_limit.rs neighbors); the per-tenant context lives in tracing fields, not in the error string.
Also: add `ttl_floor_below_ceiling` and `max_ttl_fits_in_redis_signed_ex` unit tests so the two TTL constants can't drift past each other or above Redis's signed-EX limit in a future edit.
Out of scope for this lane (flagged to other lane owners):
- AuthError::Internal generally embeds raw downstream error strings (existing pattern across rate_limit.rs and nip98_replay.rs). Could leak community/tenant identifiers if those strings ever surface to clients. Audit lane (Quinn) owns the error-message safety rule per Eva's [6] lane split.
- check_ip_connection MUST be called before host resolution / on every connection (including failed-host-resolution attempts). Otherwise an attacker who picks a non-matching host header bypasses the IP cap. Wiring lives in the relay-wiring lane (Eva).
Validation:
- cargo test -p buzz-auth --lib: 44 passed (4 new red-team tests).
- cargo test -p buzz-pubsub --lib: 3 passed, 10 Redis-required ignored.
- cargo test -p buzz-pubsub --lib nip98_replay -- --ignored against local Redis: 4 passed (1 new ceiling-clamp test).
- Bug verified: with the clamp temporarily reverted, the above_ceiling_ttl_is_clamped test fails with the real Redis error "value is not an integer or out of range", proving the test catches the regression, not just the fix.
(cherry picked from commit f54d728)
Co-authored-by: Sami <f4a42a97e594b77bdbd8ee35191c8b28a94a4cb871d96f32921558275421fb68@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
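The TTL clamp from the red-team commit above is small enough to show whole. The constant names and values (120 and 3600 seconds) come from this PR; `clamp_replay_ttl` is a hypothetical helper name, and the compile-time assertions are one possible way to express the "constants cannot drift" tripwires, which the PR implements as unit tests.

```rust
pub const DEFAULT_REPLAY_TTL_SECS: u64 = 120; // floor
pub const MAX_REPLAY_TTL_SECS: u64 = 3600; // ceiling (1 hour)

// Tripwires: the floor must stay below the ceiling, and the ceiling must fit
// in a signed 64-bit integer, which is what Redis accepts for EX.
const _: () = assert!(DEFAULT_REPLAY_TTL_SECS < MAX_REPLAY_TTL_SECS);
const _: () = assert!(MAX_REPLAY_TTL_SECS <= i64::MAX as u64);

/// Forces any caller-supplied TTL into [DEFAULT, MAX] before it reaches Redis.
pub fn clamp_replay_ttl(ttl_secs: u64) -> u64 {
    // `clamp` panics only if min > max, which the assertion above rules out.
    ttl_secs.clamp(DEFAULT_REPLAY_TTL_SECS, MAX_REPLAY_TTL_SECS)
}
```

With the clamp in place, a misconfigured `u64::MAX` becomes a one-hour TTL instead of a Redis range error that fails every gated request closed.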
Two adversarially-proven multi-tenant fences for the auth lane on the frozen Lane 0 SHA:
1. NIP-98 verifier: drop loopback aliasing unconditionally. normalize_url() collapsed localhost / ::1 -> 127.0.0.1, a testing convenience that becomes a row-zero side door under multi-tenant. The u-tag host is the community binding (docs/multi-tenant-conformance.md, NIP-98 row); collapsing the three would let an event signed for localhost pass against a 127.0.0.1-resolved community (or vice versa). Inverted the localhost test to bite the new strict rule: signed-for-one vs expected-other now REJECTS, identity still passes. Adversarial: re-introduced the aliasing -> test goes red -> restored.
2. ChannelAccessChecker: thread &TenantContext through every method. Frozen 0001 has channels PK (community_id, id), so the same UUID legitimately co-exists across communities. A bare WHERE id = implementation would be a cross-community existence oracle. Mirror of buzz-db rule 4a.1 on the auth side. MockAccessChecker keyed on (community, pubkey, channel_id); new test access_does_not_cross_communities bites the bare-id direction. Adversarial: dropped the community filter from the mock -> test goes red -> restored.
No external impl of ChannelAccessChecker in-tree (DB uses a separate free function under Mari's lane), so the trait signature change is contained.
cargo test -p buzz-auth: 45 passed / 0 failed.
Lane: auth (buzz-auth). Base: e349d76 (frozen Lane 0).
(cherry picked from commit 3df6179)
Co-authored-by: Sami <f4a42a97e594b77bdbd8ee35191c8b28a94a4cb871d96f32921558275421fb68@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
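A sketch of the second fence above: an access checker whose lookup key includes the community, so the same channel id in another community is a different question. The method name `can_access`, the `grant` helper, and the `u128` id stand-ins are assumptions; the real trait threads `&TenantContext` through its methods.

```rust
use std::collections::HashSet;

type CommunityId = u128; // stand-in for the UUID newtype
type ChannelId = u128; // stand-in for the channel UUID

pub trait ChannelAccessChecker {
    /// Hypothetical method: access is always asked relative to a community.
    fn can_access(&self, community: CommunityId, pubkey_hex: &str, channel_id: ChannelId) -> bool;
}

/// Mock keyed on (community, pubkey, channel_id), as the commit describes.
#[derive(Default)]
pub struct MockAccessChecker {
    grants: HashSet<(CommunityId, String, ChannelId)>,
}

impl MockAccessChecker {
    pub fn grant(&mut self, community: CommunityId, pubkey_hex: &str, channel_id: ChannelId) {
        self.grants.insert((community, pubkey_hex.to_string(), channel_id));
    }
}

impl ChannelAccessChecker for MockAccessChecker {
    fn can_access(&self, community: CommunityId, pubkey_hex: &str, channel_id: ChannelId) -> bool {
        // The community is part of the key, so a bare channel id never matches on its own.
        self.grants.contains(&(community, pubkey_hex.to_string(), channel_id))
    }
}
```

Dropping `community` from the tuple is exactly the mutation the commit reports as turning `access_does_not_cross_communities` red.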
(cherry picked from commit 69237ef) Co-authored-by: Max <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
(cherry picked from commit 4af7348) Co-authored-by: Max <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
The Lane-0 freeze landed `events.search_tsv TSVECTOR GENERATED ALWAYS AS
(to_tsvector('simple', content)) STORED` + `GIN (search_tsv)` directly in
the schema. With that in place the entire Typesense apparatus is dead
weight: there is nothing to index out-of-band, no consistency window to
reason about, no client-forgeable index/content drift. Indexing is the
SQL write.
This rewrites `crates/buzz-search/` from scratch around that:
- `query.rs`: one SQL builder. `community_id = $ctx` is the first
predicate of every executed statement and is unconditional —
`SearchQuery` requires a `CommunityId` at the type level (no
construction path omits it). `search_tsv @@ websearch_to_tsquery(...)`
is the FTS predicate; `ts_rank_cd DESC, created_at DESC, id` is the
order. Channel scope replaces today's `__global__` sentinel with
`channel_id IS NULL`. Empty query short-circuits without a roundtrip.
- `lib.rs`: thin `SearchService { pool }`. Takes `&PgPool` directly so
the crate stays a leaf — no buzz-db dependency. Re-exports
`CommunityId` for callers that need to mint the fence.
- `error.rs`: collapsed to one variant (`Db(sqlx::Error)`); empty
queries are not errors.
- Deleted `collection.rs` and `index.rs` (Typesense HTTP client and
indexer). Dropped `reqwest`/`serde`/`serde_json`/`chrono`/`nostr`
from `Cargo.toml`.
- Added `tests/fts_integration.rs` — 8 integration tests against real
Postgres, each on its own throwaway schema applying the frozen
`migrations/0001_initial_schema.sql` via `include_str!`. The
load-bearing one is `search_does_not_return_other_community_events`:
mutating the `community_id = $ctx` predicate to `1=1` makes that
test go red (verified, then reverted) — the fence bites where it
has to.
Conformance row 50 — search re-auth and one-shot NIP-50 — is unchanged
in shape: the relay refetches canonical events per hit through buzz-db's
scoped fetcher and runs the access predicate. Search is never the
access boundary; this crate just returns candidate ids. The row's
Typesense prose rewrite is owned by Eva's integration lane (one writer
per path).
EXPLAIN ANALYZE evidence on a 200k-row community confirms the planner
picks `Bitmap Index Scan on events_p<...>_search_tsv_idx` for the
populated partition (full plan in RESEARCH/SEARCH_LANE_FTS_EXPLAIN.md
in the workspace). Single-column `GIN (search_tsv)` is sufficient at
this scale — no `btree_gin` needed (Max's caveat holds).
Cross-lane removals owed to Eva (relay-wiring lane, not this commit):
- relay state.rs: remove `search_index_tx` mpsc + worker
- relay main.rs: remove `search.ensure_collection()` call
- relay handlers/event.rs: remove `search_index_tx.send()`
- relay api/bridge.rs::handle_bridge_search: rewrite to new API
- relay handlers/req.rs::handle_search_req: rewrite to new API
- relay handlers/req.rs::build_search_channel_scope_filter: delete
- relay bin/reindex_kind0.rs: delete
- docker-compose.yml: drop typesense service + volume
- docs/multi-tenant-conformance.md row 50: rewrite Typesense prose
Tests: `cargo test -p buzz-search --test fts_integration --
--include-ignored --test-threads=1` — 8 passed, 0 failed.
Clippy: `cargo clippy -p buzz-search --all-targets -- -D warnings` — clean.
(cherry picked from commit e31c098)
Co-authored-by: Quinn <96f056ad5f2305c8ddf637dc65d048aa4c12d7daeb8867690e34fca46b0ef64c@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
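The query shape described in the search commit above might look roughly like the following. This is not the crate's actual SQL: the selected column, the bind numbering (`$1` community, `$2` query text, `$3` limit), the `'simple'` config on `websearch_to_tsquery`, and the `SearchQuery` fields are assumptions chosen to match the predicates and ordering the commit names.

```rust
/// Hypothetical query input; the community is a required field, not an option.
pub struct SearchQuery {
    pub community_id: u128, // stand-in for CommunityId
    pub text: String,
    pub limit: i64,
}

/// `community_id = $1` is the first predicate and is unconditional.
const SEARCH_SQL: &str = "SELECT id FROM events \
    WHERE community_id = $1 \
    AND search_tsv @@ websearch_to_tsquery('simple', $2) \
    ORDER BY ts_rank_cd(search_tsv, websearch_to_tsquery('simple', $2)) DESC, created_at DESC, id \
    LIMIT $3";

/// Returns None for an empty query so the caller can skip the database roundtrip.
pub fn build_search_sql(q: &SearchQuery) -> Option<&'static str> {
    if q.text.trim().is_empty() {
        return None;
    }
    Some(SEARCH_SQL)
}
```

Keeping the community predicate in the one constant, rather than appending it conditionally, is what makes the "mutate it to `1=1` and a test goes red" check meaningful.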
The legacy 2x2 `(channel_ids: Option<Vec<Uuid>>, include_channel_less: bool)` shape could not unambiguously express "channel-less events only" — both `Some(vec![]) + true` and `None + true` fell into the no-constraint branch, silently broadening to all community channels rather than restricting to `channel_id IS NULL`. That matched the legacy Typesense `channel_id:=__global__` sentinel one way (per-channel + global) but not the other (global only).
Replace with a single `ChannelScope` enum whose four variants are 1-to-1 with the legacy `(accessible_channels, include_global)` matrix:
- non-empty + true -> ChannelsOrChannelLess(accessible)
- non-empty + false -> Channels(accessible)
- empty + true -> ChannelLessOnly (the variant the old shape could not express)
- empty + false -> caller short-circuits to EOSE, doesn't call search
Emitted SQL fragments are byte-identical to the legacy match for the three carry-over cases; `ChannelLessOnly` adds `AND channel_id IS NULL` — the fence the old type could not express.
Verification:
- Full package `cargo test -p buzz-search -- --include-ignored --test-threads=1`: 9/9 green (8 existing + 1 new `channel_less_only_excludes_per_channel_events`).
- Adversarial mutation: replaced the `ChannelLessOnly` SQL emission with a no-op (the buggy semantic the old shape produced); new test went RED with 3 hits instead of 1, restored, green again. The fix is the emitted predicate, not the variant name.
- clippy -D warnings clean; fmt clean.
- Empty-vec edge cases are intentionally not special-cased: `Channels(vec![])` emits `channel_id = ANY('{}')` (false-for-all, zero hits, preserves the old early-return semantic via SQL); `ChannelsOrChannelLess(vec![])` is equivalent to `ChannelLessOnly`.
Coordinated with Eva ahead of relay-wiring sweep at req.rs and bridge.rs so call sites land against the final type, not the buggy one.
(cherry picked from commit c8cd333)
Co-authored-by: Quinn <96f056ad5f2305c8ddf637dc65d048aa4c12d7daeb8867690e34fca46b0ef64c@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
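The legacy-to-enum mapping in the `ChannelScope` commit above can be written out directly. The sketch models only the three variants the commit names; the `$3` bind position, the `sql_fragment` and `scope_from_legacy` names, and the `u128` channel ids are assumptions.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChannelScope {
    ChannelsOrChannelLess(Vec<u128>),
    Channels(Vec<u128>),
    ChannelLessOnly,
}

/// Maps the legacy (accessible_channels, include_global) pair onto a scope.
/// None means the caller short-circuits (EOSE) and never calls search.
pub fn scope_from_legacy(accessible: Vec<u128>, include_global: bool) -> Option<ChannelScope> {
    match (accessible.is_empty(), include_global) {
        (false, true) => Some(ChannelScope::ChannelsOrChannelLess(accessible)),
        (false, false) => Some(ChannelScope::Channels(accessible)),
        (true, true) => Some(ChannelScope::ChannelLessOnly),
        (true, false) => None,
    }
}

impl ChannelScope {
    /// SQL appended after the community predicate; `$3` is an assumed bind slot.
    pub fn sql_fragment(&self) -> &'static str {
        match self {
            ChannelScope::ChannelsOrChannelLess(_) => {
                " AND (channel_id = ANY($3) OR channel_id IS NULL)"
            }
            ChannelScope::Channels(_) => " AND channel_id = ANY($3)",
            ChannelScope::ChannelLessOnly => " AND channel_id IS NULL",
        }
    }
}
```

With an enum, "channel-less only" is a value the type can hold, so it no longer collapses into the unconstrained branch the way the two-field shape did.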
Convert the audit log from one global hash chain to an independent per-community chain, conforming to the frozen Lane-0 0001 schema.
- Collapse to one DDL: delete schema.rs / AUDIT_SCHEMA_SQL and their lib.rs exports. The 0001 migration is the sole owner of audit_log.
- Chain shape: PK (community_id, seq), seq monotonic per-community, UNIQUE (community_id, hash); hash/prev_hash/actor_pubkey as BYTEA; object_id TEXT generalizes the old event_id/channel_id; detail JSONB.
- community_id is folded into the SHA-256 (leads the hash) so a row cannot be lifted out of one community's chain and re-verified in another. Per-community advisory lock: communities never serialize each other's audit writes (no throughput bottleneck, no timing oracle).
- verify_chain / get_entries scoped to a CommunityId.
- Error variants carry only per-community seq (meaningless without its chain), never community_id, hash values, or raw action strings.
- AUTH-body protection becomes caller discipline + the AuditAction enum (AuthSuccess/AuthFailure carry outcome metadata, never the token); the dropped event_kind column is not persisted.
13/13 green (7 unit + 6 Postgres isolation). Adversarial: disabling the community_id line in compute_hash turns community_id_is_part_of_identity RED (two communities hash identically); restored to green.
(cherry picked from commit ba11d66)
Co-authored-by: Dawn <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
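The per-community chain above can be illustrated with a toy model. Important substitution: the real chain uses SHA-256 over BYTEA columns, while this sketch uses the standard library's `DefaultHasher` (a 64-bit, non-cryptographic hash) purely so the example has no dependencies. The field set, `append`, and the genesis `prev_hash` of 0 are assumptions; `compute_hash` and `verify_chain` are names from the commit, but these bodies are not the crate's.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

#[derive(Debug, Clone)]
pub struct AuditEntry {
    pub community_id: u128,
    pub seq: u64,
    pub prev_hash: u64,
    pub action: String,
    pub hash: u64,
}

/// Stand-in for the SHA-256 step: community_id is fed first, so it leads the hash.
pub fn compute_hash(community_id: u128, seq: u64, prev_hash: u64, action: &str) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(&community_id.to_be_bytes());
    h.write(&seq.to_be_bytes());
    h.write(&prev_hash.to_be_bytes());
    h.write(action.as_bytes());
    h.finish()
}

/// Appends to one community's chain; seq is monotonic within that chain.
pub fn append(chain: &mut Vec<AuditEntry>, community_id: u128, action: &str) {
    let (seq, prev_hash) = match chain.last() {
        Some(last) => (last.seq + 1, last.hash),
        None => (1, 0),
    };
    let hash = compute_hash(community_id, seq, prev_hash, action);
    chain.push(AuditEntry { community_id, seq, prev_hash, action: action.to_string(), hash });
}

/// Verifies a chain under the community it is claimed to belong to.
pub fn verify_chain(chain: &[AuditEntry], community_id: u128) -> bool {
    let mut prev = 0u64;
    for e in chain {
        let expected = compute_hash(community_id, e.seq, e.prev_hash, &e.action);
        if e.community_id != community_id || e.prev_hash != prev || e.hash != expected {
            return false;
        }
        prev = e.hash;
    }
    true
}
```

Relabeling a row's `community_id` does not help an attacker, because the stored hash was computed with the original community as its leading input.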
Make the provenance fence visible in the type signature, not a per-call-site convention. `NewAuditEntry.community_id` becomes `CommunityId` (the server-resolved newtype) instead of a raw `Uuid`, so a wiring call site can no longer pass an arbitrary UUID off the event/channel being acted on. The only doors to a `CommunityId` are host resolution or a server-scoped DB row, never client input.
The DB-row type `AuditEntry` stays `Uuid`: sqlx reads/writes it directly and `compute_hash` does `.as_bytes()` on it, so the stored hash bytes are byte-for-byte identical and the already-integrated chain stays valid (no migration, no re-hash). The `as_uuid()` dereference moves inside `AuditService::log` at the DB boundary, where the column is written; the advisory-lock key is unchanged (CommunityId's Display delegates to Uuid).
Drop the now-orphaned `Serialize`/`Deserialize` derive (and the `#[serde(default)]` on `detail`) from `NewAuditEntry`: it has no serde consumer and travels only through the in-process audit sink (mpsc), never a wire/DB boundary. Keeping it non-deserializable reinforces the fence: no client blob can mint a NewAuditEntry.
Full package green (13/13, incl. the 6 PG isolation tests and the community_id_is_part_of_identity fence); clippy -D warnings + fmt clean. Adversarially verified the fence is non-vacuous: dropping community_id from compute_hash turns community_id_is_part_of_identity RED, restored.
(cherry picked from commit 284cc69)
Co-authored-by: Dawn (sprout agent) <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
…ind)
Conformance row zero: req.community = resolve_host(connection.host), bound before any handler observes tenant data. This lands the relay-side seam:
- HostResolver trait (native async fn, no async-trait dep): buzz-db's Db::resolve_host satisfies it; the relay depends on the trait, not the query, so the binding is testable without a database. Callers are generic over R, no dyn dispatch (the relay holds a concrete Db).
- bind_community(): normalizes the host with the one shared rule, resolves it, and fails closed on BOTH unmapped host AND lookup error; there is no path that yields a default/fallback community. UnmappedHost is a distinct variant the call site turns into a GENERIC reject (no host echo, no unmapped-vs-error distinction) so an unauthenticated caller can't probe which hosts exist.
- TenantContext carries the normalized host, so downstream NIP-05/audit labelling and the NIP-98 u-host check all see the canonical form the community was resolved from.
Tests (4, green) cover known-host bind, variant normalization (case/dot/default-port can't split a tenant), unmapped fail-closed, and lookup-error fail-closed-not-default. Adversarially verified: mutating the None arm to fall through to a nil default community turns unmapped_host_fails_closed RED.
Seam contract for the buzz-db lane (Mari): Db::resolve_host(&self, normalized_host: &str) -> Result<Option<CommunityId>, DbError>, a SELECT id FROM communities WHERE host = $1 on the normalized key.
Router call site (nip11_or_ws_handler) + threading TenantContext through handle_connection land next in this lane.
(cherry picked from commit 0be8532e0e94e5ecd6529f2f3f52255dd36f6009)
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
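The fail-closed binding described above can be sketched with a synchronous trait. This simplifies the PR in several labeled ways: the real `HostResolver` uses a native `async fn` and returns `DbError`, the normalization here only lowercases and trims a trailing dot, and `BindError`, `MapResolver`, and the `u128` id are hypothetical stand-ins.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommunityId(pub u128); // stand-in for the UUID newtype

#[derive(Debug, PartialEq, Eq)]
pub struct TenantContext {
    pub community: CommunityId,
    pub host: String, // the normalized host the community was resolved from
}

#[derive(Debug, PartialEq, Eq)]
pub enum BindError {
    UnmappedHost,
    Lookup,
}

/// Synchronous simplification of the resolver seam.
pub trait HostResolver {
    fn resolve_host(&self, normalized_host: &str) -> Result<Option<CommunityId>, String>;
}

/// Reduced normalization for this sketch only.
fn normalize_host(host: &str) -> String {
    host.trim().trim_end_matches('.').to_ascii_lowercase()
}

/// Every non-success path is an error; no arm produces a default community.
pub fn bind_community<R: HostResolver>(resolver: &R, raw_host: &str) -> Result<TenantContext, BindError> {
    let host = normalize_host(raw_host);
    match resolver.resolve_host(&host) {
        Ok(Some(community)) => Ok(TenantContext { community, host }),
        Ok(None) => Err(BindError::UnmappedHost),
        Err(_) => Err(BindError::Lookup),
    }
}

/// In-memory resolver so the binding can be exercised without a database.
pub struct MapResolver {
    pub hosts: HashMap<String, CommunityId>,
    pub fail: bool,
}

impl HostResolver for MapResolver {
    fn resolve_host(&self, normalized_host: &str) -> Result<Option<CommunityId>, String> {
        if self.fail {
            return Err("lookup failed".to_string());
        }
        Ok(self.hosts.get(normalized_host).copied())
    }
}
```

Depending on the trait rather than the query is what lets the four cases in the commit (known host, variant spelling, unmapped, lookup error) be tested in isolation.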
…ing (§5b)
Plan §5b, decided by Tyler: rather than sticky-route huddles or ship a
silent split-room, a horizontally-scaled deployment surfaces a clear,
client-handleable unavailable signal on huddle join.
- config: huddle_audio_available bool, env BUZZ_HUDDLE_AUDIO_AVAILABLE.
Defaults true so single-pod (N=1) deployments keep today's huddle
behavior unchanged. Operators running multiple relay pods set it false.
- audio handler: after auth + membership pass and BEFORE get_or_create joins
a room, if huddle_audio_available is false we send
{type:error, code:huddle_audio_unavailable, message:...} and return — no
silent room join whose frames never cross pods.
Why a config flag and not pod-count self-detection: the relay can't reliably
count its own pods; an explicit operator flag is the honest model and keeps
the §4 fork-B (any-pod-any-connection) generic routing free of huddle
stickiness. The real fix is the out-of-relay media/SFU service (Tyler's
long-term target), out of scope for this rewrite.
Tests: default-true (N=1 compat) and env-false-disables, both green. Full
buzz-relay --lib green at --test-threads=1 (374). Note for this lane: there
is a pre-existing parallel-run env-var race (global_presence_pubsub test
calls Config::from_env without the config tests' ENV_MUTEX guard) — not a
regression from this change; flagged to fix in the wiring lane.
(cherry picked from commit cc2bc29d4429da9e1a3e80217936340a4c1ca721)
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
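A sketch of the huddle flag above, with the parsing split from the environment read so it can be tested without touching process-global state. The env var name, the default of true, and the error code come from the commit; the accepted spellings for "false", the function names, and the `HuddleJoin` enum are assumptions, and the error message text is left out because the commit elides it.

```rust
/// Assumption: "false" or "0" (any case) disables; anything else, or unset, keeps the default.
pub fn parse_huddle_audio_available(raw: Option<&str>) -> bool {
    match raw.map(|v| v.trim().to_ascii_lowercase()) {
        Some(v) if v == "false" || v == "0" => false,
        _ => true, // default true keeps single-pod deployments unchanged
    }
}

/// Reads the operator flag from the environment.
pub fn huddle_audio_available_from_env() -> bool {
    parse_huddle_audio_available(std::env::var("BUZZ_HUDDLE_AUDIO_AVAILABLE").ok().as_deref())
}

#[derive(Debug, PartialEq, Eq)]
pub enum HuddleJoin {
    Proceed,
    Unavailable { code: &'static str },
}

/// Runs after auth and membership checks and before any room is joined.
pub fn gate_huddle_join(audio_available: bool) -> HuddleJoin {
    if audio_available {
        HuddleJoin::Proceed
    } else {
        HuddleJoin::Unavailable { code: "huddle_audio_unavailable" }
    }
}
```

Testing the pure parser rather than the env reader also sidesteps the kind of parallel-run env-var race the commit notes in the existing test suite.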
Executable form of docs/multi-tenant-conformance.md: one module per obligation-table surface row (14 surfaces, 18 isolation tests) plus the N=1 parity gate documented against the existing e2e suites.
Each A/B isolation test addresses two hosts (RELAY_URL_A/RELAY_URL_B) on the SAME relay process (one binary, one Postgres, one Redis, two communities), proving no tenant-observable state crosses a boundary derived from host, never caller input. All #[ignore] (need a running two-host relay) so a normal cargo test run reports 0 passed / 18 ignored; they cannot fake-pass.
Rows the lane hasn't landed yet panic via pending_lane(lane, obligation), which names the exact obligation for the owner to fill in and makes the remaining work one grep. Lane ownership tagged per module.
(cherry picked from commit 9d6d35f07a17fcf5ccd8a6f20fdede3349e67024)
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
…row)
The conformance obligation for the NIP-11 surface: RelayInfo::build must not grow unscoped DB/search/audit inputs, so an unauthenticated NIP-11 read can never become a cross-community enumeration oracle.
Binds RelayInfo::build to its exact allowed signature via a const fn pointer. Adding a &Db / &AppState / search / audit input makes the function-pointer type stop matching and breaks the build at the fence: a silent cross-tenant leak becomes a hard compile error, deny-lint style.
Adversarially proven: injecting a &AppState param into build() produces error[E0308] mismatched types at the fence const (plus E0061 at the call sites); reverted to confirm the fence, not the call sites alone, is the guard. buzz-relay package 374 green at --test-threads=1.
(cherry picked from commit 76a4044c7cfb1c96a6817be1e81c7ae42d1ea3da)
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
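The const fn-pointer fence above is a general Rust technique and can be shown in a few lines. The parameter list of `RelayInfo::build` is not given in this PR, so the two `&str` parameters and the struct fields below are placeholders.

```rust
pub struct RelayInfo {
    pub name: String,
    pub host: String,
}

impl RelayInfo {
    /// Placeholder signature: the real parameter list is not shown in the PR.
    pub fn build(name: &str, host: &str) -> RelayInfo {
        RelayInfo { name: name.to_string(), host: host.to_string() }
    }
}

// The fence: this constant only type-checks while `build` has exactly this
// signature. Adding a parameter (for example a database handle) changes the
// function's type, so the coercion below fails to compile.
const _: fn(&str, &str) -> RelayInfo = RelayInfo::build;
```

The constant is never read at runtime; its only job is to make a signature change fail the build in one well-known place instead of passing silently.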
Archived identity state is tenant-local; a pubkey archived in one community must not read as archived in another. Thread CommunityId through the archived identity queries and DB wrappers, and bind the composite key used by the migration. Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Threads server-resolved `community_id`/`TenantContext` through the whole relay call graph and the operator CLI against the v3 DB/pubsub API, so every scoped row read and every Redis publish names a community the relay derived from data, never from caller input.
Relay (`crates/buzz-relay`):
- Read-path caches take `CommunityId`; write/invalidate publishers take `&TenantContext` (the Redis topic key needs the host). The cross-node fan-out path only has the community, so caches stay constructible there.
- Doors fail closed: WS/bridge/media/NIP-05 bind community from the request host via `bind_community`, falling through to an empty/404 response on an unmapped host (no default tenant, no host echo).
- Background loops get tenant from the DB row they act on: the reaper builds `TenantContext::resolved(row.community_id, row.host)` per archived channel from the reaper RETURNING; the dev/CI reconciler and reminder scheduler resolve the one configured community from `relay_url`, fail-closed.
- Deployment-community cases with no connection tenant (git hook/finalize, workflow sink) resolve via the same host-resolution seam.
- Drop the Typesense-only `reindex_kind0` backfill binary, obsolete under the Postgres FTS migration and referenced nowhere.
Admin CLI (`crates/buzz-admin`):
- New `resolve_admin_tenant` reads `RELAY_URL` host (the CLI runs `compose exec relay buzz-admin`, sharing the relay's env) and resolves it via `lookup_community_by_host`, fail-closed on an unmapped host.
- Scope the NIP-43 membership-list publish (`EventTopic::Global`), channel reconcile, `get_members`, and the kind:39000 existence `EventQuery` (`..EventQuery::for_community`). Drop the now-dead `uuid` dep.
Workspace gate: `cargo check --workspace` green; buzz-db 97/97, buzz-audit 13/13, buzz-relay 375 + main 1 (`--include-ignored --test-threads=1`), buzz-admin compiles, fmt + buzz-admin clippy clean.
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Relay E2E applies schema/schema.sql as declarative desired state before the relay starts. The multi-tenant migration added FKs to communities, but the snapshot did not define the table, so pgschema failed before tests ran. Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
CI Rust Lint + Windows Rust run `cargo clippy --workspace --all-targets -- -D warnings`; the community_id/tenant args pushed six fns to 8/7 and the new NIP-98 replay code tripped clamp/const-assert lints. Resolve at the bar, matching existing repo conventions:
- 6x #[allow(clippy::too_many_arguments)] on the fns that gained a tenant/community arg (same convention already used across buzz-db/relay).
- buzz-pubsub replay TTL: .max().min() -> .clamp() (floor 120 < ceiling 3600, cannot panic; behavior identical, incl. the u64::MAX clamp test).
- buzz-auth replay const-drift tripwires: scoped #[allow(clippy::assertions_on_constants)]; the assert-on-constant IS the design (fails if someone drifts the TTL constants).
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
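A small sketch of the `.max().min()` to `.clamp()` equivalence, using the floor and ceiling quoted above. The constant and function names are hypothetical, and the compile-time `const` assertion is one possible form of a drift tripwire, not necessarily the form buzz-auth uses.

```rust
/// Floor and ceiling quoted in the commit message, in seconds.
/// The constant names are illustrative, not the crate's.
pub const REPLAY_TTL_FLOOR_SECS: u64 = 120;
pub const REPLAY_TTL_CEILING_SECS: u64 = 3600;

// Drift tripwire: compilation fails if the constants are edited so that the
// floor is no longer below the ceiling (`clamp` panics when min > max).
const _: () = assert!(REPLAY_TTL_FLOOR_SECS < REPLAY_TTL_CEILING_SECS);

/// Old shape: raise to the floor, then cap at the ceiling.
pub fn replay_ttl_max_min(requested: u64) -> u64 {
    requested
        .max(REPLAY_TTL_FLOOR_SECS)
        .min(REPLAY_TTL_CEILING_SECS)
}

/// New shape: the same bounds stated in a single call.
pub fn replay_ttl_clamp(requested: u64) -> u64 {
    requested.clamp(REPLAY_TTL_FLOOR_SECS, REPLAY_TTL_CEILING_SECS)
}
```

With floor below ceiling the two functions agree on every input, including `u64::MAX`, which is why the change is behavior-preserving.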
Relay E2E builds the database from schema/schema.sql via pgschema apply, while the rewrite migration had moved the first-class community_id schema forward. The snapshot was still mostly pre-MT, so it produced unscoped tables such as channels(id) instead of channels(community_id, id). Make the declarative snapshot match migrations/0001_initial_schema.sql exactly so the schema path and migration path create the same tenant-scoped shape. Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
3fb25c5 to f1b459b
The relay resolves each connection's tenant from the durable communities host map and fails closed on an unmapped host. Under the MT schema, channels.community_id is NOT NULL with a FK to communities, so the pre-MT e2e seed (unscoped channel/member INSERTs against an empty communities table) fails, and every e2e client connection 404s at host-binding. The relay never auto-seeds a community (ensure_configured_community has no callers).
Seed the deployment community (host=localhost:3000, matching RELAY_URL=ws://localhost:3000 after normalize_host keeps the non-default port) and thread community_id through the channel/member INSERTs:
- setup-desktop-test-data.sh: insert the community row first, then scope every channel/member INSERT (Desktop E2E Integration).
- start-relay-for-tests.sh: seed the community after schema apply (Relay E2E); psql-or-docker fallback since psql is not on PATH in hermit.
- ci.yml backend-integration: seed after relay start (reconciler retries for 2min), before the NIP-ER reminder suite.
ON CONFLICT targets lower(host) to match idx_communities_host, keeping the seed idempotent.
Verified against live PG: schema apply clean (165 stmts), seed inserts 9 scoped channels + 19 scoped members with zero nulls, host resolves, re-run is idempotent.
Adversarial: an unscoped channel INSERT fails not-null and a channel against a nonexistent community fails the FK, proving the community row is load-bearing.
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Two CI-honesty follow-ups after the first seed surfaced host/ordering mismatches in the MT e2e path (no product behavior):
1. Desktop E2E 404: the seed-readiness helper queried 127.0.0.1:3000 while the relay reconciles + the community is seeded for localhost:3000. normalize_host keeps the non-default port and 127.0.0.1 != localhost, so the inbound host resolved to no community and every /query 404'd. Default the helper to http://localhost:3000, matching the rest of the desktop e2e suite (e2eBridge.ts / bridge.ts already use localhost) and the relay's RELAY_URL.
2. Backend Integration UnmappedHost: the reminder scheduler binds the deployment community once at boot and exits permanently on an unmapped host (no retry, unlike the channel reconciler). The community was being seeded after relay start, leaving the scheduler dead. Apply the schema and seed the community BEFORE starting the relay (dropping BUZZ_AUTO_MIGRATE since the schema is now applied up front), so the scheduler binds on its single boot-time attempt.
Both are test/CI wiring. The Relay E2E suite stays red on a separate, gated body-level bug (command_executor.rs inserts events without community_id) tracked for the §4 scoping slice.
Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
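The host mismatch in item 1 can be illustrated with a hypothetical canonicalization function. This is not the real `normalize_host`; it only assumes the behavior the commits describe (case, trailing dot, and default-port variants collapse; a non-default port is kept), and it assumes 80 and 443 are the ports treated as default.

```rust
/// Hypothetical sketch of a host-canonicalization rule. It is NOT the real
/// `normalize_host`; IPv6 literals and other edge cases are out of scope.
pub fn normalize_host_sketch(host: &str) -> String {
    let lower = host.trim().to_ascii_lowercase();
    // Split off an optional ":port" suffix made only of digits.
    let (name, port) = match lower.rsplit_once(':') {
        Some((n, p)) if !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()) => (n, Some(p)),
        _ => (lower.as_str(), None),
    };
    // A trailing dot names the same host, so drop it.
    let name = name.trim_end_matches('.');
    match port {
        // Assumption: 80 and 443 are the "default" ports that get dropped.
        Some("80") | Some("443") | None => name.to_string(),
        // Any other port is part of the tenant key and is kept.
        Some(p) => format!("{name}:{p}"),
    }
}
```

Under this rule `localhost:3000` and `127.0.0.1:3000` stay distinct keys, which matches the 404 the helper saw: normalization collapses spellings of one host, it does not resolve names to addresses.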
The desktop e2e integration relay-boot still used BUZZ_AUTO_MIGRATE with no pre-boot community seed, so the channel reconciler bound the deployment community ONCE at boot (outside its retry loop) before setup-desktop-test-data.sh seeded it, hit UnmappedHost, and exited permanently. The reconciler's retry loop only covers late-seeded channels, not a late-seeded community — so the 9 seeded channels were never reconciled and 'loads channels from the relay' saw 0 channels (60s timeout). Both Desktop E2E Integration shards red. Mirror the proven backend-integration ordering: apply schema + seed the localhost:3000 community BEFORE the relay starts, and drop BUZZ_AUTO_MIGRATE (schema is now applied pre-boot). setup-desktop-test-data.sh's own idempotent community seed becomes a no-op; its channel INSERTs are then picked up by the reconciler's retry loop. Co-authored-by: Eva <011987e296fd5006292d2f930b574be47c7801048d1983c46c425d3c95f0cffd@sprout-oss.stage.blox.sqprod.co> Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
…delete
Conformance matrix row `search_fts` (Quinn, buzz-search). Fills the
`pending_lane("buzz-search", ...)` stub at
`crates/buzz-test-client/tests/conformance_multitenant.rs::mod search_fts`
with a two-host A/B-isolation shape: one keypair shared across A and B,
same channel UUID reused in both communities (legal under the
`(community_id, id)` PK), the *same* unique FTS token posted to each
community as kind:9 events but with **community-distinct content**.
NIP-50 search on each host must return exactly one hit carrying that
host's community's content; NIP-09 kind:5 delete in A leaves B's row
intact.
Bar (by my own hands against a live two-host relay on the conformance
recipe — Eva's `:3100` `relay-mt` harness, a/b.localhost, shared
PG/Redis, base PR head `bf8a1a4fa`):
Clean → GREEN.
Mutate (both community fences on the search read path, simultaneously):
crates/buzz-search/src/query.rs:160-161 (FTS WHERE community_id = $ctx)
crates/buzz-db/src/event.rs:870-872 (get_events_by_ids WHERE community_id = $1)
→ RED on `hits_a.len() == 1` with the failure message listing both
contents:
["A community probe ftsconf_…", "B community probe ftsconf_…"]
Restore both → GREEN.
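Why the mutation has to drop both fences at once can be sketched with a toy two-layer read path. Everything here is hypothetical (in-memory rows, `u32` ids, a substring match in place of FTS); it only models the shape "query layer filters, then the batch fetch re-filters".

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub community: u32,
    pub id: u32,
    pub content: &'static str,
}

/// Which fences are active; a "mutation" switches one off.
#[derive(Clone, Copy)]
pub struct Fences {
    pub fts: bool,
    pub fetch: bool,
}

/// Layer 1: the text probe returns matching ids, optionally fenced.
fn fts_ids(rows: &[Row], community: u32, token: &str, fenced: bool) -> Vec<u32> {
    rows.iter()
        .filter(|r| r.content.contains(token))
        .filter(|r| !fenced || r.community == community)
        .map(|r| r.id)
        .collect()
}

/// Layer 2: the batch fetch by id, optionally fenced.
fn get_by_ids(rows: &[Row], community: u32, ids: &[u32], fenced: bool) -> Vec<Row> {
    rows.iter()
        .filter(|r| ids.contains(&r.id))
        .filter(|r| !fenced || r.community == community)
        .cloned()
        .collect()
}

/// The wire-observable result is whatever survives both layers.
pub fn search(rows: &[Row], community: u32, token: &str, f: Fences) -> Vec<Row> {
    let ids = fts_ids(rows, community, token, f.fts);
    get_by_ids(rows, community, &ids, f.fetch)
}

/// Two communities, one shared token, community-distinct content.
pub fn fixture() -> Vec<Row> {
    vec![
        Row { community: 1, id: 10, content: "A community probe tok" },
        Row { community: 2, id: 20, content: "B community probe tok" },
    ]
}
```

In this model, disabling either fence alone still returns one hit for community 1, so a single-layer mutation would leave the test green; only disabling both returns two hits.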
Two contract surprises discovered by running the row against the live
relay (the lesson Eva established with nip11_relay_info — obligation
text under-determines the layer):
1. The community fence is doubly defended on the search read path: FTS
filters at the query layer, then `get_events_by_ids` re-filters at
the read layer. Mutating either fence alone keeps the
wire-observable property intact (the other defends). The honest
mutate-bite is to drop both simultaneously; that's what makes the
union load-bearing for the wire return. The test's doc comment
names both layers and explains why the single-layer mutation would
give a false-green.
2. With identical content in both communities, the Nostr event id is
the SAME byte string in both rows (id = hash(pubkey, created_at,
kind, tags, content); community is server-side provenance, not
serialized into id). Under a leak, the wire returns "the row
matching id" — which can be either community's row — and a count==1
assertion can't tell A's row from B's. Earlier iterations of this
test used identical content and discovered the hard way that
single-hit-with-other-community's-row is indistinguishable from
correct behavior at that assertion. Per-community-distinct content
(`"A community probe {token}"` vs `"B community probe {token}"`)
makes the leak observable: distinct content hashes to distinct ids
(different rows), and the assertion `hits[0].content == content_a`
pins which community's row came back.
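The id-collision point in item 2 can be sketched with a toy digest. Real Nostr ids are SHA-256 over a canonical serialization of the event fields (NIP-01); the sketch below swaps in the standard library's `DefaultHasher` purely to stay dependency-free, and `EventFields`, `StoredRow`, `probe`, and the placeholder pubkey are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The fields the commit message lists as inputs to the event id.
/// Note that no community field is present.
#[derive(Hash)]
pub struct EventFields<'a> {
    pub pubkey: &'a str,
    pub created_at: u64,
    pub kind: u32,
    pub tags: Vec<Vec<&'a str>>,
    pub content: &'a str,
}

/// A stored row: the community is recorded next to the event by the server,
/// not inside the hashed fields.
pub struct StoredRow<'a> {
    pub community: u32,
    pub event: EventFields<'a>,
}

/// Toy stand-in digest (NOT SHA-256): hashes only the event fields.
pub fn toy_event_id(e: &EventFields<'_>) -> u64 {
    let mut h = DefaultHasher::new();
    e.hash(&mut h);
    h.finish()
}

/// Build a probe event that differs from its siblings only in content.
pub fn probe(content: &str) -> EventFields<'_> {
    EventFields {
        pubkey: "shared-pubkey-placeholder",
        created_at: 1_700_000_000,
        kind: 9,
        tags: vec![vec!["h", "channel-u"]],
        content,
    }
}
```

Two rows in different communities built from `probe("same text")` get the same toy id, so an assertion on the id or on a hit count cannot tell them apart; prefixing the content per community gives each row its own id and makes `content` a usable discriminator.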
Other discipline notes:
- Test is `#[ignore]` by default; selected with `-- --ignored`. Reads
two env vars: `RELAY_URL_A` / `RELAY_URL_B`, both addressing the same
relay process on different `Host` headers.
- Requires the two-host harness recipe (`BUZZ_HEALTH_PORT=8180
BUZZ_METRICS_PORT=9202 BUZZ_RECONCILE_CHANNELS=false
BUZZ_GIT_CONFORMANCE_PROBE=false`, two `communities` rows mapping
`a.localhost:3100` and `b.localhost:3100` to distinct community
UUIDs, one binary). Full recipe in the v2 dependency report at
`RESEARCH/CONFORMANCE_MATRIX_STATUS_2026-06-27.md`.
- Requires Sami's NIP-42 per-tenant relay-tag fix on PR head
(`bf8a1a4fa`) — without it, `BuzzTestClient::connect(&ws_a, &keys)`
fails AUTH on the non-configured host. The row was pre-positioned on
`809ff9faf` and rebased forward to PR head `bf8a1a4fa` exactly +1
commit; clean rebase (different file regions from
auth.rs/bridge.rs/nip42_host_binding_live.rs).
- Conformance lane: this row (`search_fts`) is one of fourteen in the
file; per Eva's row-ownership contract, this commit touches only the
`search_fts` module. No other rows modified.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…okup per-community
Conformance matrix row `users_profiles_nip05` (Quinn — buzz-search/auth
joint, both halves driven by Quinn per one-author-per-mod-block
discipline; Sami's active queue is `api_tokens_nip98_replay` per Eva's
batching). Fills both `pending_lane` stubs in the row.
Half 1: `same_pubkey_distinct_profiles_in_two_communities`
Same keypair publishes kind:0 (Metadata) on each host's WS-AUTH'd
connection with community-distinct content
(`{"display_name":"A profile"}` vs `{"display_name":"B profile"}`).
NIP-01 replaceable semantics: latest kind:0 per
`(community_id, pubkey)` is what subsequent queries return. REST
`POST /query` (using dev-mode `X-Pubkey` auth, which the
`BUZZ_REQUIRE_AUTH_TOKEN=false` harness allows) returns each host's
own kind:0 — never the other's. Distinct content per community is
load-bearing for the bite: identical content would collapse the leak
into setup-equivalence vacuity (Dawn's catch on `audit_log` —
identical Nostr event ids when (pubkey, created_at, kind, tags,
content) match — making the assertion blind to the wrong-row
substitution).
Half 2: `same_nip05_local_part_on_two_hosts_is_independent`
Same local-part registered in BOTH communities with **distinct**
pubkeys (one per community). `GET /.well-known/nostr.json?name=alice`
against host A returns A's pubkey; against host B returns B's pubkey.
Distinct pubkeys per community make the leak observable as
wrong-pubkey-returned on the wire — the same setup-equivalence-vacuity
defense Dawn established (different keys = different rows = the wrong
answer is observable in the response, not just absent from it). Handle
canonicalization uses `extract_relay_domain` (mirrors
`crates/buzz-relay/src/api/nip05.rs::extract_domain`) against
`RELAY_URL` env so the test still works if the harness's relay URL
changes; defaults to `localhost` for the standard recipe.
Bar (by my own hands against the live `:3100` harness, PR head
`b02d767f2`):
Clean → BOTH GREEN.
Mutate (community fences dropped, both paths simultaneously, mirroring
the search_fts dual-fence approach):
- crates/buzz-db/src/event.rs:267-270 — `query_events` non-p-tag
branch `WHERE community_id = $1` → `WHERE TRUE`
- crates/buzz-db/src/user.rs:185 — `get_user_by_nip05`
`WHERE community_id = $1 AND LOWER(handle) = LOWER($2)` →
`WHERE LOWER(handle) = LOWER($1)` (rebind to keep param count
aligned)
→ BOTH halves RED on their own assertions:
- kind:0 half: "B's kind:0 content is not B's profile — A's
profile leaked through. got: '{"display_name":"A profile"}'"
- NIP-05 half: "NIP-05 lookup on B for local-part 'alice_…' must
resolve to B's pubkey ($B_PK); got $A_PK. If this is A's
pubkey, the community fence on `get_user_by_nip05` has been
dropped and A's user leaked through B's lookup."
Restore both fences (worktree diff empty after restore) → BOTH GREEN.
Each half bit on a SINGLE-fence mutation this time, unlike search_fts's
defense-in-depth shape. That's because the kind:0 read path
(`query_events`) and the NIP-05 lookup path (`get_user_by_nip05`)
each have one community fence at their layer, not redundant fences
across two layers like the FTS+batch-fetch shape. Different rows have
different defense topologies; this row's mutate-bite is the simpler
single-fence form, exactly as the named failure-mode analysis in
`landed-on-head-discipline` rule #2 sub-bullet predicts (the union of
fences IS what makes the property load-bearing; here the union has
exactly one element per path).
Bar checklist:
- `cargo fmt -p buzz-test-client -- --check`: exit 0.
- `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
- `cargo test -p buzz-test-client --tests`: non-ignored 0/0 (the 4
nip42_host_binding_live tests are #[ignore]'d by design — they need
the live harness).
- Strict-FF onto PR head: this is +1 on `b02d767f2`, verified by
`git merge-base --is-ancestor` + `git rev-list --count`.
- Trailers preserved as single pair via the explicit `--trailer`
flags on the initial commit (not `--amend --trailer`, which doubled
them earlier in the session).
- Live two-host harness on `:3100`: my clean (post-restore) binary,
recipe per RESEARCH/CONFORMANCE_MATRIX_STATUS_2026-06-27.md, two
community rows seeded `a.localhost:3100`/`b.localhost:3100`.
Conformance lane: this row is one of fourteen in the file; per Eva's
row-ownership contract, this commit touches only the
`users_profiles_nip05` module. No other rows modified.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…oexists in two communities
Conformance matrix row `channels_membership` (re-routed from Mari to Quinn at Eva's call: Mari's #1328 scroll-fix is still landing on main, and the row's substrate is the same same-UUID-in-two-communities shape I already used as the setup for `search_fts`). Fills the `pending_lane("buzz-db", ...)` stub at `crates/buzz-test-client/tests/conformance_multitenant.rs::mod channels_membership`.
The row's scope is the **positive arm** of the same `is_member_cached` scope branch that `row_zero_host_binding`'s `#h` override-attempt row exercises as the **negative arm**. Sibling-not-replacement, per the frame Dawn established when cold-reading row_zero (b): row_zero proves the override-attempt fails closed against `get_channel(A, U) == None`; this row proves the coexistence positive. When U exists in *both* A and B (legal under the `(community_id, id)` PK), `get_channel` finds the right per-community row and each community's posts land in its own instance.
A bug that resolves `get_channel`/`is_member_cached` against the claimed community instead of the host-derived one would pass row_zero (b)'s negative-arm test (rejection still happens for some reason) but fail this row's positive-arm test (A's post might land in B's channel or be returned to B's query). So this row catches a class of bugs row_zero (b) structurally cannot, even though both share the `is_member_cached` scope branch.
Shape:
1. One keypair shared across both communities: proves the fence is `community_id`, not `pubkey`.
2. Same channel UUID `U` created in both A and B via REST kind:9007.
3. Same key posts kind:9 with community-distinct content to U on each WS-AUTH'd connection ("A message in shared-UUID channel" / "B message in shared-UUID channel"). Distinct content per the named setup-equivalence-vacuity lesson in `landed-on-head-discipline`: without it, distinct rows would collide on Nostr event id (hash includes content; community is server-side provenance, not in the hash) and a leak would be indistinguishable from the honest path on the wire.
4. REST `POST /query` with `{kinds:[9], #h:[U]}` against each host.
5. Each side: count == 1, content == own community's. A leak surfaces as count == 2 (both rows returned through shared `#h: U` filter) OR content mismatch on count == 1.
Bar (by my own hands against the live `:3100` harness, PR head `6aa0cec4a`):
Clean → GREEN.
Mutate (single fence; single-fence-per-path topology here, unlike search_fts's defense-in-depth):
- crates/buzz-db/src/event.rs:266-270: `query_events` non-p-tag branch `WHERE community_id = $1` → `WHERE TRUE` (using the `let _ = q.community_id;` pattern Eva established on row_zero, one of the three honest sidesteps for the param-count trap I flagged in my prior message; the other two are renumbering and `($1 IS NOT NULL)`).
→ RED on `hits_a.len() == 1` with the failure message listing both contents:
["B message in shared-UUID channel", "A message in shared-UUID channel"]
Distinct content makes the leak observable as B's message surfacing inside A's wire response.
Restore → diff empty → GREEN.
Bar checklist:
- `cargo fmt -p buzz-test-client -- --check`: exit 0.
- `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
- `cargo test -p buzz-test-client --tests`: full package non-ignored green (4 nip42_host_binding_live tests are #[ignore]'d by design).
- Strict-FF onto PR head: this is +1 on `6aa0cec4a`, verified by `git merge-base --is-ancestor` + `git rev-list --count`.
- Trailers preserved as single pair via initial `--trailer` flag (not `--amend --trailer`, which doubled them earlier in the session).
- Live two-host harness on `:3100`: my clean (post-restore) binary, recipe per RESEARCH/CONFORMANCE_MATRIX_STATUS_2026-06-27.md v3.
Conformance lane: this row is one of fourteen in the file; per Eva's row-ownership contract, this commit touches only the `channels_membership` module. No other rows modified.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
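The `(community_id, id)` key that makes the same channel UUID legal in two communities can be sketched with an in-memory store. `ChannelStore` and its methods are hypothetical stand-ins for the Postgres tables and constraints; the `u128` channel id stands in for a UUID.

```rust
use std::collections::{HashMap, HashSet};

pub type CommunityId = u32;
/// Stand-in for a channel UUID.
pub type ChannelId = u128;

/// In-memory stand-in for `channels` and `channel_members` (assumption).
#[derive(Default)]
pub struct ChannelStore {
    channels: HashMap<(CommunityId, ChannelId), String>,
    members: HashSet<(CommunityId, ChannelId, String)>,
}

impl ChannelStore {
    /// Composite primary key: a duplicate is rejected only within one
    /// community, so the same id may exist once per community.
    pub fn create_channel(
        &mut self,
        c: CommunityId,
        id: ChannelId,
        name: &str,
    ) -> Result<(), &'static str> {
        if self.channels.contains_key(&(c, id)) {
            return Err("duplicate channel in community");
        }
        self.channels.insert((c, id), name.to_string());
        Ok(())
    }

    /// Composite foreign key: a membership row must reference a channel in
    /// the same community, so it can never point across communities.
    pub fn add_member(
        &mut self,
        c: CommunityId,
        id: ChannelId,
        pubkey: &str,
    ) -> Result<(), &'static str> {
        if !self.channels.contains_key(&(c, id)) {
            return Err("no such channel in this community");
        }
        self.members.insert((c, id, pubkey.to_string()));
        Ok(())
    }

    /// Lookup always names the community; there is no id-only variant.
    pub fn get_channel(&self, c: CommunityId, id: ChannelId) -> Option<&str> {
        self.channels.get(&(c, id)).map(String::as_str)
    }
}
```

Creating id `U` in communities 1 and 2 succeeds, a second `U` in community 1 is rejected, and `get_channel` returns each community's own row, which is the positive arm the row exercises over the wire.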
…ken half + wire-driven NIP-98 replay half
Fills both `pending_lane` stubs in `mod api_tokens_nip98_replay`:
# `token_minted_in_a_does_not_authorize_in_b` (doc-only)
The api_token mint surface does not exist on the wire in `buzz-relay`: no `/tokens` route in `router.rs:52-79` (verified by hand on PR head), no `tokens` module in `crates/buzz-relay/src/api/`. The 792-line self-service minting endpoint that existed pre-rewrite (sprout-relay PR #37, commit `f84da74d3`) was deliberately not ported. Api_tokens are *consumed* (not minted) by the Blossom upload path at `media.rs:638`.
This means "mint in A, present to B" has no wire precondition: a wire-driven row would test a contract with no entry point. The honest shape is doc-only, mirroring `audit_log`: where audit proves the *output* surface does not exist on the wire, api_tokens proves the *input* surface does not. Both are strictly stronger isolation claims than a wire-denied assertion.
The `(community_id, token_hash)` fence itself is directly proven at the storage layer (where direct Postgres access is in-convention):
* `crates/buzz-db/src/api_token.rs:425 lookup_by_hash_is_scoped_to_community`: same hash in A and B, A-scoped lookup returns A only.
* `crates/buzz-db/src/api_token.rs:488 active_lookup_by_hash_is_scoped_to_community`: mirror for the revoked-filter variant.
Plus the consumer fence: `media.rs:638` calls the scoped DB lookup with `tenant.community()` derived from request host *before* token resolution (`media.rs:97` comment names the row-44 fence explicitly).
# `nip98_replay_seenset_is_shared_and_community_scoped` (wire-driven)
Load-bearing wire claim: within-community replay rejection. Sign a NIP-98 event E for A's `u=`, POST to A → 200. POST again → 401 with a body that names replay detection. This is the proof that the shared (cross-pod) seen-set is in the request path at all; without it, any pod would re-honor a spent NIP-98 event.
Mutate-bite: `check_nip98_replay → noop` in `bridge.rs:79` (return `Ok(())` without consulting the guard). Under mutation, second POST goes 200 instead of 401. Test asserts the failure with named assertion message pointing at the mutate-bite handle, so a future reader sees what would have been caught.
Cross-community independence is a *tripwire*, not a bite: sign an independent NIP-98 event E' for B's `u=` (different event_id by u-tag canonicalization divergence), POST to B → 200 even though E was spent in A. Catches future namespace-globalization regressions (key truncation, u-normalization collapse) that would break the spend-spread, on top of the unit-layer proof at `crates/buzz-auth/src/nip98_replay.rs:163 key_isolates_communities_for_same_event_id` (which the substrate's own doc-comment names as "belt-and-suspenders").
The prefix-drop mutation considered earlier turned out to be vacuous against natural wire traffic: u-tag divergence across communities makes event_ids already community-distinct, so dropping the community prefix from `nip98_replay_key` does not collapse natural traffic into a shared slot. A same-event_id-different-community wire collision can't be constructed because u-host (`verify_bridge_auth`) rejects with 401 before the replay check runs. That artificial property is proven at the unit layer; the wire layer asserts the load-bearing per-call replay rejection.
# Bar
* `cargo check -p buzz-test-client --tests`: clean.
* `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
* `cargo fmt -p buzz-test-client -- --check`: clean.
* Default test run (no `--ignored`): 1 passed (doc-only `#[test]`), 16 ignored (live rows).
* `--ignored api_tokens` against fresh `:3300` harness (`BUZZ_GIT_CONFORMANCE_PROBE=false`): GREEN.
* Mutate-bite `check_nip98_replay → noop` on `bridge.rs:79`, rebuild, restart: RED on the within-A second-POST assertion ("second POST to A with the same NIP-98 event MUST be rejected as replay (got 200 OK)"), `left: 200, right: 401`. Restored byte-identical, GREEN again.
Base: PR #1321 head `ae703c5c8`. Test-only diff: zero lines in `buzz-db`, `buzz-relay`, or `buzz-auth` production code. Matrix 8/14.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
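The replay seen-set contract (first presentation spends the event, a second one in the same community is rejected, other communities are unaffected) can be sketched in memory. This is an illustration under stated assumptions: the real seen-set is shared across pods and expires entries by TTL, whereas the `HashSet` here is process-local and never expires, and the key layout in `replay_key` is hypothetical.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum ReplayError {
    Replay,
}

/// Process-local stand-in for the shared seen-set (assumption: no TTL here).
#[derive(Default)]
pub struct SeenSet(HashSet<String>);

/// Hypothetical key layout: community prefix plus event id.
pub fn replay_key(community: &str, event_id: &str) -> String {
    format!("nip98:{community}:{event_id}")
}

impl SeenSet {
    /// Check-and-insert in one step. `HashSet::insert` returns `false` when
    /// the key was already present, which is exactly the replay case.
    pub fn check_nip98_replay(
        &mut self,
        community: &str,
        event_id: &str,
    ) -> Result<(), ReplayError> {
        if self.0.insert(replay_key(community, event_id)) {
            Ok(())
        } else {
            Err(ReplayError::Replay)
        }
    }
}

/// Map the check onto the status codes the wire test asserts.
pub fn post_status(seen: &mut SeenSet, community: &str, event_id: &str) -> u16 {
    match seen.check_nip98_replay(community, event_id) {
        Ok(()) => 200,
        Err(ReplayError::Replay) => 401,
    }
}
```

Replacing the body of `check_nip98_replay` with `Ok(())` reproduces the mutation: the second call to `post_status` for the same community and event then returns 200.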
tlongwell-block pushed a commit that referenced this pull request Jun 27, 2026
…98_replay assertion
Follow-up to da6051f per Quinn's cold-read (event 4529860195007964...).
The within-community replay assertion `assert_eq!(second_a.status(), UNAUTHORIZED)` pins the 401 status code rather than checking the body, because the system has defense-in-depth across two layers with distinct rejection signatures:
* auth-layer replay check (`check_nip98_replay`): rejects with 401 + body "NIP-98: replay detected".
* storage-layer dedup (`events` PK `ON CONFLICT DO NOTHING` in `ingest_event`): accepts with 200 + body `accepted: false, message: "duplicate"`.
Both reject a duplicate, but only the 401 path proves the seen-set is in the request path. A body-only check like `!accepted` would pass under a noop'd `check_nip98_replay` because storage-dedup still 200-accepted-false's the second post, so the bite would go vacuous against the layer the obligation actually names ("seen-set in the request path").
Adds:
* Inline `//` comment block immediately above the `assert_eq!` naming the two layers, their distinct status signatures, and why the 401 expectation is the load-bearing-layer discriminator. Explicitly tells a future reader not to weaken to `!accepted` for "simpler reading."
* Extended assertion message: when the test fails, the panic message now names both layers and which one the 401 proves, so a future debugger sees the architectural property without reading the doc-comment.
Generalized principle (per Quinn): when a system has defense-in-depth across layers with different status-code signatures on rejection, the assertion should pin the status code from the load-bearing layer, not any rejection. Held in the row's doc-comment (not the shared discipline slug) per Quinn's stopping rule: this is a deeper instance of slug rule #2's defense-in-depth class, not a new spine entry.
Bar:
* Comment/string-only diff: 21 lines (+19 / −2), zero runtime behavior change, verified by inspection (`git diff` shows only comments and string-literal extensions).
* `cargo check -p buzz-test-client --tests`: clean.
* `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
* `cargo fmt -p buzz-test-client -- --check`: clean.
* Default `cargo test ... api_tokens` (no `--ignored`): doc-only `#[test]` still passes; wire-driven still `#[ignore]`-skipped.
* No live mutate-bite re-run needed: the runtime path of the wire-driven test is byte-identical (only strings/comments touched), and the mutate-bite at da6051f was already RED-on-right-assertion by Sami's hands at :3300 and Eva's hands at her :3300 (event 9e9050cd44d6...). The follow-up makes the *reason* the bite bites discoverable to a future reader; it does not change *whether* the bite bites.
Base: PR #1321 head `da6051fdb`. Test-only diff.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
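The difference between the weak and the pinned assertion can be modeled in a few lines. The `Response` struct and `second_post` function are hypothetical; they only encode the two rejection signatures named above (401 from the auth-layer replay check, 200 with `accepted: false` from storage dedup).

```rust
#[derive(Debug, PartialEq)]
pub struct Response {
    pub status: u16,
    pub accepted: bool,
    pub message: &'static str,
}

/// Model the second POST of an already-seen event.
/// `replay_check_enabled = false` models the noop mutation.
pub fn second_post(replay_check_enabled: bool) -> Response {
    if replay_check_enabled {
        // The auth layer sees the replay first and rejects the request.
        return Response {
            status: 401,
            accepted: false,
            message: "NIP-98: replay detected",
        };
    }
    // With the replay check bypassed, storage dedup still refuses the row,
    // but it reports that inside a 200 response.
    Response {
        status: 200,
        accepted: false,
        message: "duplicate",
    }
}

/// Body-only check: true for any rejection, from either layer.
pub fn weak_assertion_holds(r: &Response) -> bool {
    !r.accepted
}

/// Status-pinned check: true only when the auth layer did the rejecting.
pub fn pinned_assertion_holds(r: &Response) -> bool {
    r.status == 401
}
```

The weak check holds for both `second_post(true)` and `second_post(false)`, so it cannot detect the mutation; the pinned check holds only for `second_post(true)`.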
Community-scope media metadata sidecars so shared CAS bytes are only readable when the request tenant owns the sidecar. Community-scope git repo pointer cells while keeping immutable pack and manifest CAS shared, and bind git NIP-98 URL verification to the server-resolved request host. Also isolate git repo path config validation so concurrent tests do not leak BUZZ_GIT_REPO_PATH mutations into unrelated Config::from_env() callers. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
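The "shared bytes, scoped sidecar" idea can be sketched with an in-memory store. `MediaStore` and its methods are hypothetical; the sketch assumes only what the commit states, namely that content-addressed bytes are shared while readability depends on the requesting community owning a sidecar for that hash.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory stand-in for the media store (assumption).
#[derive(Default)]
pub struct MediaStore {
    /// Content-addressed bytes, stored once and shared by every community.
    blobs: HashMap<String, Vec<u8>>,
    /// Metadata sidecars, keyed by (community, content hash).
    sidecars: HashSet<(u32, String)>,
}

impl MediaStore {
    /// Store the bytes once per hash and record a sidecar for the uploader.
    pub fn upload(&mut self, community: u32, hash: &str, bytes: &[u8]) {
        self.blobs
            .entry(hash.to_string())
            .or_insert_with(|| bytes.to_vec());
        self.sidecars.insert((community, hash.to_string()));
    }

    /// Bytes are readable only when the requesting community owns a sidecar,
    /// even if the blob itself is already present.
    pub fn read(&self, community: u32, hash: &str) -> Option<&[u8]> {
        if !self.sidecars.contains(&(community, hash.to_string())) {
            return None;
        }
        self.blobs.get(hash).map(Vec::as_slice)
    }

    /// Number of distinct blobs held, regardless of how many sidecars exist.
    pub fn blob_count(&self) -> usize {
        self.blobs.len()
    }
}
```

After community 1 uploads hash `h1`, a read from community 2 returns nothing until community 2 uploads the same content itself, and the store still holds a single blob.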
…mments
The Typesense search backend was replaced by Postgres FTS in this rewrite, but two vestiges remained in live code:
- `Config.typesense_url` / `Config.typesense_key` were still parsed from `TYPESENSE_URL` / `TYPESENSE_API_KEY` and stored on the struct, yet read nowhere outside config.rs. Removed the fields, env parsing, and struct init.
- Several doc/inline comments still described the search path as hitting Typesense (req.rs NIP-50 handler, bridge.rs post-filter rationale). The behavior is unchanged but the engine is Postgres FTS; corrected the naming so the comments match the code.
Kept genuinely historical references intact (event.rs note that the old index_event worker is gone; query.rs/schema provenance of the legacy __global__ sentinel and the FTS migration). cargo check -p buzz-relay green.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…chart
Typesense is no longer a relay dependency (search is Postgres FTS), so the chart no longer provisions, wires, validates, or secrets a Typesense backend:
- Deleted templates/quickstart-typesense.yaml (in-cluster eval Deployment).
- deployment.yaml: dropped TYPESENSE_URL / TYPESENSE_API_KEY env from the relay.
- secret-chart.yaml: dropped the Typesense URL/key composition block.
- _helpers.tpl: removed buzz.typesenseFullname / buzz.typesenseUrl defines.
- _validate.tpl: removed the 'Typesense source must exist' fail-guard.
- values.yaml / values.schema.json: removed the typesense.* value block and schema, and the TYPESENSE_* entries from the existingSecret key docs.
- NOTES.txt: dropped Typesense from the quickstart/production profile output.
- ci/, examples/, Chart.yaml: dropped Typesense from the quickstart render scenario, the GitOps samples, and the chart description.
- tests/: removed the three Typesense-specific helm-unittest cases and the now-meaningless typesense.* set/fixture boilerplate.
Verified: helm-unittest 25/25 green (was 28; -3 Typesense cases); helm lint clean; ci/quickstart + ha + production-existing-secret render matrices all template cleanly after helm dependency build.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Search is Postgres FTS; the relay no longer reads TYPESENSE_URL/_API_KEY, so
the local stacks no longer provision or wait on a Typesense container:
- docker-compose.yml: removed the typesense service + typesense-data volume.
- deploy/compose/compose.yml: removed the typesense service, its volume, the
relay's TYPESENSE_URL env, and the relay depends_on: typesense health gate;
compose.dev.yml: removed the typesense port/CORS override.
- scripts/start-relay-for-tests.sh: stopped bringing up + health-waiting on
the typesense container and stopped exporting TYPESENSE_* to the relay.
- scripts/{dev-setup,run-tests}.sh: removed TYPESENSE_* env export/printout.
- scripts/dev-reset.sh, e2e-*.sh, compose READMEs/run.sh: doc-string cleanup.
Verified: docker compose config renders cleanly for docker-compose.yml and for
deploy/compose (compose.yml + compose.dev.yml), zero typesense in the rendered
output; all touched shell scripts pass bash -n.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Migrate prose and doc-comments to describe the Postgres FTS backend that replaced Typesense: README architecture diagram (3 boxes, Postgres now "events + FTS search"), ARCHITECTURE.md buzz-search section rewritten to the real API (SearchService::new(pool), search(&SearchQuery), ChannelScope) and the search_tsv generated-column mechanism (CASE WHEN kind IN (1059,30300,30622) THEN NULL, idx_events_search_tsv GIN), CONTRIBUTING step-6, VISION, AGENTS, TESTING (both), and the chart README. Comment-only edits in desktop and test-client files; drop the dead reindex-kind0 Justfile recipe (its binary no longer exists). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The #1285 conformance checklist and formal spec described the search isolation obligation against the Typesense backend that no longer exists. Keep the obligation intact (search query carries community_id; results never cross tenants; refetch is (community_id, event_id)) and restate the mechanism in terms of the Postgres FTS implementation #1321 ships: the events.search_tsv generated tsvector column, GIN index, community_id filter BitmapAnd-ed with the @@ probe, and ChannelScope::ChannelLessOnly in place of the __global__ sentinel. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The relay-e2e and desktop-e2e jobs started Typesense via `docker compose up -d ... typesense ...`, waited on a `buzz-typesense` healthcheck, and passed TYPESENSE_URL/TYPESENSE_API_KEY to the relay. The service was removed from compose, so `docker compose up` failed with 'no such service: typesense' and took all four E2E jobs down at setup time. Remove the service from both job blocks, drop the healthcheck wait, and drop the two now-dead relay env vars (the relay stopped reading them when the config fields were removed in f1f6bbf). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
tlongwell-block
pushed a commit
that referenced
this pull request
Jun 27, 2026
…ibling)
NIP-42 sibling of the NIP-98 host-binding fix in be9d26e. `handle_auth` was verifying the AUTH event's `relay` tag against `state.config.relay_url` (one static string per deployment), so under multi-tenant:
(a) An AUTH event signed against community A's host could be accepted on a connection whose tenant resolved to community B (cross-host token reuse, the same hole `nip98_expected_url` closed on the HTTP side).
(b) Every legitimate connection whose tenant host wasn't the single configured one would be rejected (the wall Quinn hit bringing up `search_fts`'s two-host harness).
Add `nip42_expected_relay_url(config_relay_url, &tenant)` next to `nip98_expected_url` in `bridge.rs`: scheme from config (preserves `ws://`/`wss://` TLS posture), host from `tenant.host()` (request-resolved, never client-supplied). Thread it at `handlers/auth.rs:73` so `verify_auth_event` receives the per-tenant URL.
Tests (mirror `nip98_expected_url_*` shape, `bridge.rs:1303-1432`):
* `verify_nip42_rejects_event_signed_for_wrong_communitys_host`: attacker on B-bound connection signs AUTH matching `config.relay_url` (=A's host); fix rejects with `RelayUrlMismatch`. Bites the exact "reverted to config host" regression.
* `verify_nip42_accepts_event_signed_for_matching_host`: positive control; matching-host AUTH verifies.
* `nip42_expected_relay_url_uses_tenant_host_not_config_host`: pins host-from-tenant in both directions.
* `nip42_expected_relay_url_derives_scheme_from_config`: pins `ws://` ↔ `wss://` scheme passthrough.
Plus a live two-host integration test (`crates/buzz-test-client/tests/nip42_host_binding_live.rs`, `#[ignore]`): two seeded communities at `a.localhost:3100`/`b.localhost:3100`, raw WS AUTH with forged `relay` tag, four cases. Under the pre-fix mutation (`config.relay_url` verbatim) with `RELAY_URL=ws://a.localhost:3100`: `nip42_matching_host_accepted_b` failed (`auth-required: verification failed`, i.e. legit B traffic blocked) and `nip42_cross_host_rejected_a_relay_tag_on_b_connection` failed (relay accepted A-tag on B connection, i.e. the hole). Both green after restore.
Bar:
* `cargo test -p buzz-relay -p buzz-auth -- --test-threads=1`: buzz-auth 45/45, buzz-relay 403/403 + 1 integration.
* `cargo test -p buzz-test-client --test nip42_host_binding_live -- --ignored`: 4/4 against live two-host relay.
* `cargo clippy -p buzz-relay -p buzz-auth -p buzz-test-client --tests -- -D warnings`: clean.
* Mutate-bite (helper body → `config_relay_url.to_string()`): unit tests 3 RED, live tests 2 RED (the exact pre-fix bug shape); both restored byte-identical to green.
Base: 4b6e1e4 (PR #1321 tip). Row-zero priority: gates every WS-authed conformance matrix row.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
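The "scheme from config, host from tenant" rule can be illustrated with a small std-only sketch. This is not the PR's `nip42_expected_relay_url`; the `Tenant` struct, the function name `expected_relay_url_sketch`, and the choice to return `None` for a non-WebSocket scheme are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the PR's tenant type; only the host matters here.
pub struct Tenant {
    host: String,
}

impl Tenant {
    pub fn new(host: &str) -> Self {
        Tenant { host: host.to_string() }
    }

    /// The request-resolved host (never taken from the client's AUTH event).
    pub fn host(&self) -> &str {
        &self.host
    }
}

/// Build the URL an AUTH event's `relay` tag is expected to match.
/// The scheme comes from the configured relay URL (keeps ws/wss posture);
/// the host comes from the tenant resolved for this connection.
/// Returning `None` for any other scheme is an assumption of this sketch.
pub fn expected_relay_url_sketch(config_relay_url: &str, tenant: &Tenant) -> Option<String> {
    let (scheme, _config_host) = config_relay_url.split_once("://")?;
    let scheme = scheme.to_ascii_lowercase();
    if scheme != "ws" && scheme != "wss" {
        return None;
    }
    Some(format!("{}://{}", scheme, tenant.host()))
}

/// An AUTH `relay` tag verifies only if it equals the per-tenant URL.
pub fn relay_tag_matches(config_relay_url: &str, tenant: &Tenant, relay_tag: &str) -> bool {
    expected_relay_url_sketch(config_relay_url, tenant).as_deref() == Some(relay_tag)
}
```

With the config pointing at `ws://a.localhost:3100` and a connection resolved to `b.localhost:3100`, a `relay` tag naming A's host no longer matches, while B's own host does.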
tlongwell-block
pushed a commit
that referenced
this pull request
Jun 27, 2026
…ken half + wire-driven NIP-98 replay half
Fills both `pending_lane` stubs in `mod api_tokens_nip98_replay`:
# `token_minted_in_a_does_not_authorize_in_b` (doc-only)
The api_token mint surface does not exist on the wire in `buzz-relay`: no `/tokens` route in `router.rs:52-79` (verified by hand on PR head), no `tokens` module in `crates/buzz-relay/src/api/`. The 792-line self-service minting endpoint that existed pre-rewrite (sprout-relay PR #37, commit `f84da74d3`) was deliberately not ported. Api_tokens are *consumed* (not minted) by the Blossom upload path at `media.rs:638`.
This means "mint in A, present to B" has no wire precondition: a wire-driven row would test a contract with no entry point. The honest shape is doc-only, mirroring `audit_log`: where audit proves the *output* surface does not exist on the wire, api_tokens proves the *input* surface does not. Both are strictly stronger isolation claims than a wire-denied assertion.
The `(community_id, token_hash)` fence itself is directly proven at the storage layer (where direct Postgres access is in-convention):
* `crates/buzz-db/src/api_token.rs:425 lookup_by_hash_is_scoped_to_community`: same hash in A and B, A-scoped lookup returns A only.
* `crates/buzz-db/src/api_token.rs:488 active_lookup_by_hash_is_scoped_to_community`: mirror for the revoked-filter variant.
Plus the consumer fence: `media.rs:638` calls the scoped DB lookup with `tenant.community()` derived from request host *before* token resolution (`media.rs:97` comment names the row-44 fence explicitly).
# `nip98_replay_seenset_is_shared_and_community_scoped` (wire-driven)
Load-bearing wire claim: within-community replay rejection. Sign a NIP-98 event E for A's `u=`, POST to A → 200. POST again → 401 with a body that names replay detection. This is the proof that the shared (cross-pod) seen-set is in the request path at all; without it, any pod would re-honor a spent NIP-98 event.
Mutate-bite: `check_nip98_replay → noop` in `bridge.rs:79` (return `Ok(())` without consulting the guard). Under mutation, second POST goes 200 instead of 401. Test asserts the failure with named assertion message pointing at the mutate-bite handle, so a future reader sees what would have been caught.
Cross-community independence is a *tripwire*, not a bite: sign an independent NIP-98 event E' for B's `u=` (different event_id by u-tag canonicalization divergence), POST to B → 200 even though E was spent in A. Catches future namespace-globalization regressions (key truncation, u-normalization collapse) that would break the spend-spread, on top of the unit-layer proof at `crates/buzz-auth/src/nip98_replay.rs:163 key_isolates_communities_for_same_event_id` (which the substrate's own doc-comment names as "belt-and-suspenders").
The prefix-drop mutation considered earlier turned out to be vacuous against natural wire traffic: u-tag divergence across communities makes event_ids already community-distinct, so dropping the community prefix from `nip98_replay_key` does not collapse natural traffic into a shared slot. A same-event_id-different-community wire collision can't be constructed because u-host (`verify_bridge_auth`) rejects with 401 before the replay check runs. That artificial property is proven at the unit layer; the wire layer asserts the load-bearing per-call replay rejection.
# Bar
* `cargo check -p buzz-test-client --tests`: clean.
* `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
* `cargo fmt -p buzz-test-client -- --check`: clean.
* Default test run (no `--ignored`): 1 passed (doc-only `#[test]`), 16 ignored (live rows).
* `--ignored api_tokens` against fresh `:3300` harness (`BUZZ_GIT_CONFORMANCE_PROBE=false`): GREEN.
* Mutate-bite `check_nip98_replay → noop` on `bridge.rs:79`, rebuild, restart: RED on the within-A second-POST assertion ("second POST to A with the same NIP-98 event MUST be rejected as replay (got 200 OK)"), `left: 200, right: 401`. Restored byte-identical, GREEN again.
Base: PR #1321 head `ae703c5c8`. Test-only diff: zero lines in `buzz-db`, `buzz-relay`, or `buzz-auth` production code. Matrix 8/14.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
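The community-scoped replay key can be pictured with an in-memory sketch. It is only a model: the real seen-set is shared across pods, and `CommunityId`, `SeenSet`, and `check_replay` here are hypothetical names, with the status codes taken from the 200/401 behaviour described above.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the relay's community identifier.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct CommunityId(pub u32);

/// In-memory model of a replay seen-set. The key is the pair
/// (community, event id), so a spend in one community never
/// occupies the slot of another community.
#[derive(Default)]
pub struct SeenSet {
    spent: HashSet<(CommunityId, [u8; 32])>,
}

impl SeenSet {
    /// Returns the HTTP status the auth layer would answer with:
    /// 200 the first time an event id is seen in a community,
    /// 401 when the same id is presented again in that community.
    pub fn check_replay(&mut self, community: CommunityId, event_id: [u8; 32]) -> u16 {
        // `insert` returns false when the key was already present.
        if self.spent.insert((community, event_id)) {
            200
        } else {
            401
        }
    }
}
```

The third case below (same id, other community) corresponds to the unit-layer property only; as the commit notes, that collision cannot be constructed on the wire.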
tlongwell-block
pushed a commit
that referenced
this pull request
Jun 27, 2026
…98_replay assertion
Follow-up to da6051f per Quinn's cold-read (event 4529860195007964...). The within-community replay assertion `assert_eq!(second_a.status(), UNAUTHORIZED)` pins the 401 status code rather than checking the body, because the system has defense-in-depth across two layers with distinct rejection signatures:
* auth-layer replay check (`check_nip98_replay`): rejects with 401 + body "NIP-98: replay detected".
* storage-layer dedup (`events` PK `ON CONFLICT DO NOTHING` in `ingest_event`): accepts with 200 + body `accepted: false, message: "duplicate"`.
Both reject a duplicate, but only the 401 path proves the seen-set is in the request path. A body-only check like `!accepted` would pass under a noop'd `check_nip98_replay` because storage-dedup still 200-accepted-false's the second post; the bite would go vacuous against the layer the obligation actually names ("seen-set in the request path").
Adds:
* Inline `//` comment block immediately above the `assert_eq!` naming the two layers, their distinct status signatures, and why the 401 expectation is the load-bearing-layer discriminator. Explicitly tells a future reader not to weaken to `!accepted` for "simpler reading."
* Extended assertion message: when the test fails, the panic message now names both layers and which one the 401 proves, so a future debugger sees the architectural property without reading the doc-comment.
Generalized principle (per Quinn): when a system has defense-in-depth across layers with different status-code signatures on rejection, the assertion should pin the status code from the load-bearing layer, not any rejection. Held in the row's doc-comment (not the shared discipline slug) per Quinn's stopping rule: this is a deeper instance of slug rule #2's defense-in-depth class, not a new spine entry.
Bar:
* Comment/string-only diff: 21 lines (+19 / −2), zero runtime behavior change, verified by inspection (`git diff` shows only comments and string-literal extensions).
* `cargo check -p buzz-test-client --tests`: clean.
* `cargo clippy -p buzz-test-client --tests -- -D warnings`: clean.
* `cargo fmt -p buzz-test-client -- --check`: clean.
* Default `cargo test ... api_tokens` (no `--ignored`): doc-only `#[test]` still passes; wire-driven still `#[ignore]`-skipped.
* No live mutate-bite re-run needed: the runtime path of the wire-driven test is byte-identical (only strings/comments touched), and the mutate-bite at da6051f was already RED-on-right-assertion by Sami's hands at :3300 and Eva's hands at her :3300 (event 9e9050cd44d6...). The follow-up makes the *reason* the bite bites discoverable to a future reader; it does not change *whether* the bite bites.
Base: PR #1321 head `da6051fdb`. Test-only diff.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
tlongwell-block
pushed a commit
that referenced
this pull request
Jun 27, 2026
NIP-43 admission confinement (Max review #3 on PR #1321). The `relay_members` table is keyed `(community_id, pubkey)`, but every DB access keyed on `pubkey` alone. In closed mode a pubkey admitted to community A was therefore admitted to community B, which is the exact M9 mutation #1285 targets.
Request-path scoping:
- thread `CommunityId` through all 9 `buzz-db::relay_members` functions and their `Db` wrappers; every query/insert/list/bootstrap now binds `community_id` (PK `ON CONFLICT (community_id, pubkey)`, WHERE clauses carry community).
- `check_/enforce_relay_membership` take the server-resolved community; pass `tenant.community()` at every entrypoint that already binds a tenant: bridge (events/query/count), media upload, git transport, audio handle, WS auth (`conn.tenant`), mesh `require_mesh_member` (connect + status), relay-admin events, leave-request ingest, identity-archive consent, NIP-43 list publish, and the buzz-admin CLI (via `resolve_admin_tenant`).
Startup seeding (the bootstrap half of the same fix): membership backfill and owner bootstrap previously ran with no community. They now run against the deployment's own community, seeded via `ensure_configured_community` under the *same* normalized host that live request resolution derives (`relay_url_authority` → `normalize_host`, now `pub`), so the bootstrapped owner lands in exactly the community requests for this host resolve to. An unparseable `relay_url` fails fast when membership is enforced rather than seeding an unreachable empty-host community.
Regression (Postgres-backed, `#[ignore]`):
- `membership_is_confined_to_its_community`: A admits a pubkey, B does not; `is_/get_/list_relay_members` confine it to A.
- `owner_bootstrap_is_confined_to_its_community`: owner bootstrapped in A is not a member of B.
cargo test -p buzz-db -p buzz-relay -p buzz-admin green; both new tests pass against live Postgres; clippy clean on all three crates.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The local-echo dedup cache (`AppState.local_event_ids`) was keyed on the bare Nostr event id. The same event id can legitimately exist in two communities (channel-less events; same-channel-UUID/same-`h` events across tenants), so a local publish of event X in community A would suppress delivery of a *distinct* same-id event arriving via Redis for community B for the 60s TTL — a cross-community non-interference violation. Key the cache on `(CommunityId, [u8; 32])` instead. `mark_local_event` now takes the community; all 11 callers pass the tenant already in scope (`tenant.community()` / `conn.tenant.community()`), and the Redis-subscriber skip/invalidate checks compare the pair. The community was already extracted at the skip site (`handlers/event.rs`) and used by the scoped fan-out right below it — it was simply absent from the dedup key. Regression: `local_echo_suppression_is_scoped_to_its_community` marks A/X locally, feeds B/X through `fan_out_pubsub_event`, and asserts the B-bound subscriber still receives it. Fails on the bare-id key (delivery dropped), passes once the key carries the community. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
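A minimal sketch of a dedup cache keyed on the `(CommunityId, [u8; 32])` pair, assuming a simple map with a TTL check. The type `LocalEchoCache`, the `is_local_echo` method, and passing `now` explicitly (to keep the example deterministic) are illustrative choices, not the relay's `AppState.local_event_ids` implementation.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the relay's community identifier.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct CommunityId(pub u32);

/// Local-echo cache keyed on (community, event id) rather than the bare id.
pub struct LocalEchoCache {
    ttl: Duration,
    marked_at: HashMap<(CommunityId, [u8; 32]), Instant>,
}

impl LocalEchoCache {
    pub fn new(ttl: Duration) -> Self {
        LocalEchoCache { ttl, marked_at: HashMap::new() }
    }

    /// Record that this pod published `event_id` locally in `community`.
    pub fn mark_local_event(&mut self, community: CommunityId, event_id: [u8; 32], now: Instant) {
        self.marked_at.insert((community, event_id), now);
    }

    /// True only if the same id was marked in the same community
    /// and the mark is still within the TTL.
    pub fn is_local_echo(&self, community: CommunityId, event_id: &[u8; 32], now: Instant) -> bool {
        match self.marked_at.get(&(community, *event_id)) {
            Some(marked) => now.duration_since(*marked) < self.ttl,
            None => false,
        }
    }
}
```

Marking A/X leaves B/X deliverable, which is the property the regression test `local_echo_suppression_is_scoped_to_its_community` is described as pinning.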
The NIP-ER reminder scheduler published the reminder event to Redis first and only then claimed it (`claim_due_reminder`), treating a duplicate publish as harmless because subscribers dedup by event id. That is the old unsafe ordering: across N pods a due reminder could be published more than once, violating the claim-before-side-effect rule for periodic producers (every side effect must be claimed exactly once, not deduped after the fact). Rewire to claim-before-publish using the stamp-guarded primitives that were already built and wired into buzz-db but had zero callers: `claim_due_reminder_with_stamp` (event.rs:1186) and `release_due_reminder` (event.rs:1213). Each attempt mints a unique per-pod stamp; the scheduler claims first, publishes only on a winning claim (`Ok(true)`) and `continue`s on the loser (`Ok(false)`) so the loser never produces the side effect, and releases its own claim on publish failure via compare-and-clear so the reminder is redeliverable next tick. `events.delivered_at` is only ever read as a NULL/non-NULL sentinel (due-reminder query guard + partial index), never as a wall-clock value, so an opaque stamp is safe to store there. The unused convenience wrapper `claim_due_reminder` (seconds stamp) is left in place as public API; the scheduler no longer uses it. Tests (buzz-db, Postgres-backed, verified locally against the dev DB): - claim_due_reminder_is_won_by_exactly_one_of_two_racing_pods: two pods, two stamps, one reminder -> exactly one wins; the single winning claim is the proof of exactly one publish side effect. - release_due_reminder_rolls_back_only_the_matching_stamp: a wrong-stamp release is a no-op (cannot clear another pod's claim); a matching-stamp release makes the reminder reclaimable for retry. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
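The claim-before-publish ordering with a stamp-guarded release can be sketched in memory as follows. This is a model of the ordering only: `ReminderStore`, `claim_with_stamp`, `release`, and `tick` are hypothetical names, a `HashMap` stands in for the `events.delivered_at` column, and the closure stands in for the Redis publish.

```rust
use std::collections::HashMap;

/// Reminder id -> stamp of the claimer (`None` means unclaimed).
pub struct ReminderStore {
    delivered_at: HashMap<u32, Option<u64>>,
}

impl ReminderStore {
    pub fn new(ids: &[u32]) -> Self {
        ReminderStore { delivered_at: ids.iter().map(|id| (*id, None)).collect() }
    }

    /// Claim the reminder if nobody holds it. Only the winner gets `true`.
    pub fn claim_with_stamp(&mut self, id: u32, stamp: u64) -> bool {
        if let Some(slot) = self.delivered_at.get_mut(&id) {
            if slot.is_none() {
                *slot = Some(stamp);
                return true;
            }
        }
        false
    }

    /// Compare-and-clear: release only if the stored stamp is ours.
    pub fn release(&mut self, id: u32, stamp: u64) -> bool {
        if let Some(slot) = self.delivered_at.get_mut(&id) {
            if *slot == Some(stamp) {
                *slot = None;
                return true;
            }
        }
        false
    }
}

/// One scheduler attempt: claim first, publish only on a winning claim,
/// and roll the claim back if the publish fails so a later tick can retry.
/// Returns true when this attempt produced the side effect.
pub fn tick<F>(store: &mut ReminderStore, id: u32, stamp: u64, publish: F) -> bool
where
    F: FnOnce() -> Result<(), String>,
{
    if !store.claim_with_stamp(id, stamp) {
        return false; // loser: never publishes
    }
    if publish().is_err() {
        store.release(id, stamp);
        return false;
    }
    true
}
```

Two attempts with different stamps on the same reminder yield exactly one publish, and a wrong-stamp release leaves another pod's claim in place.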
`claim_due_reminder_with_stamp` and `release_due_reminder` predicated only on `(created_at, id)`, but `events` is keyed `(community_id, created_at, id)` and the same Nostr event id — hence the same `id`/`created_at` pair — is allowed across communities. So a claim for reminder `A/X` would also mark `B/X` delivered (suppressing B's reminder), and a matching-stamp release for `A/X` would clear `B/X`. That is cross-community interference in exactly the reminder lane the claim-before-publish gate fixes; the scheduler's exactly-once-publish proof rests on this primitive being community-scoped. Thread `CommunityId` through both `event::` fns, their `Db` wrappers, and the bare `claim_due_reminder` convenience fn; every predicate is now `WHERE community_id = $1 AND created_at = $2 AND id = $3 ...`. The reminder scheduler already carries `reminder.community_id` on the `DueReminder` row (joined from `communities`), so both call sites pass it with no new tenant minting. Adds `reminder_claim_and_release_are_confined_to_their_community`: inserts one signed reminder event into communities A and B (identical id/created_at), claims A/X and asserts B/X stays claimable, then releases A/X and asserts B/X stays claimed while A/X becomes reclaimable. Green against live Postgres alongside the existing claim-race and stamp-rollback tests. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Issue 3 reopen (Max): buzz-admin's resolve_admin_tenant derived its lookup host with Url::host_str(), which drops an explicit non-default port and IPv6 brackets. For the default RELAY_URL ws://localhost:3000 the admin would look up community host `localhost` while startup seeding (and live request resolution) bind `localhost:3000` — so the admin CLI's membership writes would miss, or hit, the wrong deployment community. Lift relay_url_authority into buzz-core::tenant as the single canonical helper so the relay's host-resolution seam (startup seeding, bind_deployment_community) and the buzz-admin CLI derive a byte-identical authority: host plus explicit non-default port, IPv6 brackets preserved, default ports collapsed — exactly as normalize_host shapes an inbound Host header. The relay tenant module now `pub use`-re-exports it (no behavior change at the relay seam); buzz-admin calls it directly. Adds 4 buzz-core unit tests pinning the authority shape: non-default-port retention (localhost:3000, relay.example:8443), default-port collapse (:443/:80), IPv6 brackets ([::1]:3000), and unparseable/empty -> empty (callers fail closed on empty). Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
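The authority shape described above (explicit non-default port kept, default port collapsed, IPv6 brackets preserved, unparseable input mapped to empty) can be sketched with std-only string handling. This is not `buzz-core`'s `relay_url_authority`: the name `authority_sketch`, the scheme-to-default-port table (`ws`/`http` → 80, `wss`/`https` → 443), and lowercasing the host are assumptions of this example.

```rust
/// Derive "host[:port]" from a relay URL. Returns an empty string when the
/// URL cannot be parsed, so callers can fail closed on empty.
pub fn authority_sketch(relay_url: &str) -> String {
    let Some((scheme, rest)) = relay_url.trim().split_once("://") else {
        return String::new();
    };
    // Assumed default ports per scheme.
    let default_port = match scheme.to_ascii_lowercase().as_str() {
        "ws" | "http" => "80",
        "wss" | "https" => "443",
        _ => return String::new(),
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Drop any userinfo ("user:pass@").
    let authority = authority.rsplit('@').next().unwrap_or("");
    // Split host and port; an IPv6 literal keeps its brackets.
    let (host, port) = if authority.starts_with('[') {
        let Some(end) = authority.find(']') else {
            return String::new();
        };
        let after = &authority[end + 1..];
        match after.strip_prefix(':') {
            Some(p) => (&authority[..=end], Some(p)),
            None if after.is_empty() => (&authority[..=end], None),
            None => return String::new(),
        }
    } else {
        match authority.rsplit_once(':') {
            Some((h, p)) => (h, Some(p)),
            None => (authority, None),
        }
    };
    if host.is_empty() {
        return String::new();
    }
    let host = host.to_ascii_lowercase();
    match port {
        None => host,
        Some(p) if p.is_empty() || p == default_port => host,
        Some(p) if p.chars().all(|c| c.is_ascii_digit()) => format!("{}:{}", host, p),
        Some(_) => String::new(),
    }
}
```

For `ws://localhost:3000` the sketch keeps the port, which is the case the commit says `Url::host_str()` lost.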
`workflows`, `workflow_runs`, and `workflow_approvals` are all keyed
`(community_id, id|token)`, so the same UUID/token is structurally allowed
in two communities — exactly like channels and events. But the execution
and approval spine still fetched, listed, mutated, and posted by bare id,
so a webhook/manual trigger or NIP-09 deletion in community B could load,
drive, or erase community A's colliding workflow, and workflow side effects
were published under the deployment/default tenant instead of the run's own
community. This threads the owning community through every request-scoped
and run-scoped path so each lookup, write, and side effect is confined to
its tenant.
- `ActionSink::send_message` now takes the run's `community_id` as its first
parameter. `RelayActionSink` drops `bind_deployment_community(relay_url)`
(the Issue-4 root cause) and instead resolves the run community's host via
`lookup_community_host` to form a complete `TenantContext::resolved`, failing
closed if the community is unmapped. A workflow in B now posts into B.
- Executor (`dispatch_action`, `execute_run`, `execute_from_step`,
`execute_steps`) and engine (`finalize_run`, `on_event`) carry the run's
community; every `get_workflow_run` / `get_workflow` / `send_message` and
the post-store `on_event` call (from `dispatch_persistent_event`, which has
the bound `tenant`) are scoped. The interval `last_fired` DashMap is keyed
`(CommunityId, Uuid)` so duplicate workflow UUIDs across communities cannot
cross-suppress in memory.
- Webhook `/hooks/{id}` now binds its community from the request Host before
any lookup (`bind_community`), then `get_workflow(community, id)`. The host
— not the workflow row — determines the tenant, so a request to A's host
can only reach A's workflows; unmapped host and not-found both fail closed
with the same generic 404.
- WS manual trigger and `create_workflow` use `tenant.community()` as the
authoritative owner. `create_workflow` no longer resolves the community via
the ambiguous `community_of_channel(channel_id)`; it verifies the channel
exists *inside* the bound community via scoped `get_channel` (the same
guarantee the composite FK enforces, surfaced as a clean rejection).
- Approval grant/deny/resume handlers and the `buzz-db` approval methods
(`get_approval`, `get_approval_by_stored_hash`, `get_run_approvals`,
`update_approval`, `update_approval_by_stored_hash`, `create_approval`)
are scoped by community; `create_approval`'s INSERT now includes the
`community_id` NOT-NULL column it previously omitted. NIP-09 a-tag workflow
deletion (`delete_workflow`, `find_workflow_by_owner_and_name`) is scoped
to the request tenant.
Adds three `#[ignore]` Postgres regressions in `buzz-db::workflow`, each
verified green against live PG and red when the `community_id` predicate is
dropped: `workflow_lookup_is_confined_to_its_community` (dup workflow+channel
UUID in A/B; scoped get/list resolve only the bound community's row, cross
lookup is NotFound), `workflow_delete_is_confined_to_its_community` (deleting
A/id leaves B/id intact), and `approval_is_confined_to_its_community` (same
token in A/B; granting A leaves B pending). Full `cargo test -p buzz-db
-p buzz-workflow -p buzz-relay` green, clippy clean on the trio.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
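The webhook rule above (the request Host selects the community before any lookup, and every failure collapses to the same generic 404) can be sketched like this. `Relay`, `handle_hook`, and `NotFound` are hypothetical names, plain maps stand in for the database, and a `u128` stands in for the workflow UUID.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the relay's community identifier.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct CommunityId(pub u32);

/// One error for every failure mode, so a caller cannot tell an
/// unmapped host from a workflow that lives in another community.
#[derive(Debug, PartialEq, Eq)]
pub struct NotFound;

#[derive(Default)]
pub struct Relay {
    communities_by_host: HashMap<String, CommunityId>,
    /// Workflows are keyed (community, id): the same id may exist twice.
    workflows: HashMap<(CommunityId, u128), String>,
}

impl Relay {
    pub fn add_community(&mut self, host: &str, community: CommunityId) {
        self.communities_by_host.insert(host.to_string(), community);
    }

    pub fn add_workflow(&mut self, community: CommunityId, id: u128, name: &str) {
        self.workflows.insert((community, id), name.to_string());
    }

    /// Resolve the community from the request host first, then look the
    /// workflow up inside that community only.
    pub fn handle_hook(&self, request_host: &str, workflow_id: u128) -> Result<&str, NotFound> {
        let community = self
            .communities_by_host
            .get(request_host)
            .copied()
            .ok_or(NotFound)?;
        self.workflows
            .get(&(community, workflow_id))
            .map(String::as_str)
            .ok_or(NotFound)
    }
}
```

Because the lookup key always includes the host-resolved community, a request to A's host cannot reach B's workflow even when both share an id.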
The scheduled-workflow fire path never called the durable claim primitives (`claim_scheduled_workflow_fire` had zero callers outside the DB wrappers and tests), so with N relay pods every pod that saw a due cron/interval row created a run and executed the side effect: a multi-pod duplicate-run bug. And the claim primitive itself was still keyed by bare `workflow_id` (`WHERE w.id = $1`) despite the schema's `(community_id, id)` workflow identity: with duplicate workflow UUIDs across communities (which the schema explicitly allows and the Issue-4 confinement tests pin), a single `INSERT ... SELECT` matched every community's row and fanned one claim across all of them.
Scope the claim to its community and wire the claim->run boundary into the scheduler:
- `claim_scheduled_workflow_fire` takes `CommunityId` and binds `WHERE w.community_id = $1 AND w.id = $2`. The community is server provenance: the `workflow.community_id` from the global scan, never client input. This reverses the earlier S1 "resolve-from-id-alone" lock, which was written against a globally-unique-`workflow_id` assumption the final `(community_id, id)` schema does not hold (and which is unimplementable on that key). The surviving invariant is "the claim community is server provenance, not client-controlled."
- `WorkflowEngine::run` now claims before `create_workflow_run`; the loser skips before any run creation or side effect. The claim anchor `scheduled_for` is deterministic across pods: the cron's own scheduled instant (`cron_fire_instant`, not `now`) or the interval bucket boundary (`interval_fire_instant`, floor to the interval). The interval anchor is seeded from `latest_scheduled_workflow_fire` on the first tick after restart so a process bounce can't double-fire within an interval. `attach_scheduled_workflow_run` links the won claim to its run for ops/audit forensics.
Tests:
- Rewrite the stale S1 comment block and `claim_for_workflow_in_other_community_no_ops` (which encoded the now-false globally-unique assumption) into `claim_confined_to_its_community`: a dup workflow UUID in A and B claims independently (claiming A/id leaves B/id claimable). Proven RED on the bare-`id` regression.
- `concurrent_same_window_claims_exactly_one_wins` and the prune-anchor test updated to the scoped signature.
- New `cron_fire_instant` / `interval_fire_instant` unit tests pin the deterministic, drift-stable claim anchors.
Full `cargo test -p buzz-db -p buzz-workflow -p buzz-relay` green vs live Postgres (serial); clippy clean on the trio.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
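Why a deterministic anchor makes the claim at-most-once across pods can be seen in a small model. It assumes epoch-aligned buckets in whole seconds, which the commit does not specify; `interval_bucket_start`, `FireClaims`, and the tuple key are hypothetical, with a `HashSet` standing in for the `scheduled_workflow_fires` table.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the relay's community identifier.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct CommunityId(pub u32);

/// Floor `now_secs` to the start of its interval bucket (assumed to be
/// aligned to the epoch). Pods that tick at slightly different times
/// inside the same bucket compute the same anchor.
pub fn interval_bucket_start(now_secs: u64, interval_secs: u64) -> Option<u64> {
    if interval_secs == 0 {
        return None; // a zero interval has no buckets
    }
    Some(now_secs - now_secs % interval_secs)
}

/// In-memory model of the claim table, keyed
/// (community, workflow id, scheduled_for).
#[derive(Default)]
pub struct FireClaims {
    claimed: HashSet<(CommunityId, u128, u64)>,
}

impl FireClaims {
    /// Returns true only for the first claimer of a given key.
    pub fn claim(&mut self, community: CommunityId, workflow: u128, scheduled_for: u64) -> bool {
        self.claimed.insert((community, workflow, scheduled_for))
    }
}
```

Two pods ticking at t=1000 and t=1007 on a 60-second interval both derive anchor 960, so only one of them wins the claim; the same workflow id in another community claims independently.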
…ttach audit link
The scheduler's post-run `attach_scheduled_workflow_run` does `SET workflow_run_id`, but neither schema/schema.sql nor the initial migration declared that column, so every won scheduled claim would create+run, then the best-effort attach would hit `column "workflow_run_id" does not exist` (PG 42703) and warn forever; the audit link could never populate. Found by Max in cold review of 778a5d28c.
- Add nullable `workflow_run_id UUID` to scheduled_workflow_fires (schema + migration) with a composite FK `(community_id, workflow_run_id) REFERENCES workflow_runs (community_id, id)`. The FK uses ON DELETE NO ACTION, not SET NULL: community_id is shared with the claim PK and is NOT NULL, so SET NULL is unimplementable (verified against live PG: it raises a NOT NULL violation mid-cascade). NO ACTION blocks a delete of a still-linked run cleanly; workflow_runs are not pruned today regardless.
- Rewrite the stale scheduled-fires schema comment that still claimed community is 'resolved server-side from workflow_id, never a caller-supplied claim parameter', which the S1 reconciliation contradicts: community is server provenance from list_all_enabled_workflows(), passed explicitly, since id is not globally unique.
- Surface the interval-anchor read failure with a warn! instead of unwrap_or(None) swallowing it (still fail-closed: a missing anchor suppresses the tick and retries).
- Add attach_links_run_to_claim_and_is_idempotent: proves the column populates on attach and the IS NULL guard makes a second attach a no-op. Proven RED against the pre-migration schema (the exact 42703 error), GREEN after. This is the regression that would have caught this gap.
Verified vs live Postgres, serial: buzz-db 110 / buzz-workflow 145 / buzz-relay 414, 0 failed; clippy -D warnings clean on the trio.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
A brand-new interval workflow on a cold engine has no in-memory last_fired entry and no prior durable claim, so the scheduler resolves last = None. interval_should_fire then reads last = now and suppresses the tick (correct: wait a full interval), but the in-memory anchor is only written AFTER a won claim, and no claim is attempted until the prefilter passes. Every subsequent tick repeats with last = None, so the workflow suppresses forever.
Extract interval_prefilter_should_fire (free fn over the last_fired map + a thin &self wrapper): on the cold-start None suppress path it seeds the anchor to now so the next tick counts from a real anchor and fires after one interval. It seeds ONLY when last was None; an existing Some anchor is mid-interval and must elapse on its own, so it is never advanced. A due/firing tick passes through without seeding (the post-claim path owns that write).
Unit tests (no Db/Postgres; pure in-memory anchor state):
- cold start seeds then fires after one interval
- mid-interval suppress does not advance an existing anchor
- a due fire passes through without seeding
Caught by Max in cold review of the scheduled-workflow lane; predates this branch's claim work but lives on the exact lane being cleared.
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
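The seeding rule reads naturally as a small free function over the anchor map. The sketch below is not the PR's `interval_prefilter_should_fire`: it uses a `u32` key and whole-second timestamps where the engine is described as keying on `(CommunityId, Uuid)`, and the name `prefilter_should_fire` is made up for the example.

```rust
use std::collections::HashMap;

/// Decide whether an interval workflow is due, seeding the anchor on a
/// cold start.
///
/// * No anchor yet: store `now` as the anchor and suppress this tick.
/// * Anchor present, interval not elapsed: suppress, anchor untouched.
/// * Anchor present, interval elapsed: fire, anchor untouched here
///   (in the real scheduler the post-claim path owns that write).
pub fn prefilter_should_fire(
    last_fired: &mut HashMap<u32, u64>,
    workflow: u32,
    now: u64,
    interval: u64,
) -> bool {
    match last_fired.get(&workflow).copied() {
        None => {
            last_fired.insert(workflow, now);
            false
        }
        Some(last) => now.saturating_sub(last) >= interval,
    }
}
```

Without the `insert` on the `None` arm, every tick would see `None` again and the workflow would never fire, which is the bug the commit describes.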
Apply cargo fmt to the 11 files touched by the 9 reviewed multi-tenant fix commits. CI Rust Lint (just fmt-check) flagged formatting drift my local clippy --lib --tests run did not exercise; base a7d3817 was fmt-clean, so this only normalizes lines my own commits introduced. No logic change. Co-authored-by: Tyler Longwell <tlongwell@block.xyz> Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Replace the stubbed workflows::identical_workflow_and_approval_token row with two honest rows matching readiness:

- workflow_trigger_is_community_confined: wire-live A/B isolation. Define a workflow under host A (kind:30620, h=channel + name tag), take A's server-generated workflow_id, fire it under host B (kind:46020) as a caller who is an owner-member of the *same channel UUID* in B. B must fail closed with the generic 'workflow not found' because get_workflow(host_community, id) is community-scoped (command_executor.rs:703, commit c81b893); K's membership of U-in-B is irrelevant. Positive control: A fires its own id and is accepted, proving the B rejection is community confinement, not an untriggerable workflow.
- approval_token_is_community_confined: precise pending_lane for WF-08. The grant fence (get_approval_by_stored_hash(community, hash)) is already landed, but nothing mints a pending approval over the wire yet (executor approval gate is an explicit WF-08 TODO; create_approval is test-only), so a green end-to-end approval-isolation test cannot be exercised today without faking it. Dep is WF-08, not buzz-db.

Test-only; #[ignore] (needs a live two-host relay). No change to the verified multi-tenant fix set already on the PR head.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
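The confinement that workflow_trigger_is_community_confined exercises over the wire comes down to a lookup keyed by community as well as id. The sketch below is a simplified in-memory model, not the relay's code: `WorkflowStore`, `define`, `trigger`, and the integer id types are hypothetical, and only the `get_workflow(host_community, id)` shape and the generic 'workflow not found' error are taken from the commit message.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CommunityId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct WorkflowId(pub u32);

/// Hypothetical store keyed by (community, id), so an id alone never resolves a row.
#[derive(Default)]
pub struct WorkflowStore {
    rows: HashMap<(CommunityId, WorkflowId), String>,
}

impl WorkflowStore {
    pub fn define(&mut self, community: CommunityId, id: WorkflowId, name: &str) {
        self.rows.insert((community, id), name.to_string());
    }

    /// Community-scoped read: a workflow defined in another community is invisible.
    pub fn get_workflow(&self, community: CommunityId, id: WorkflowId) -> Option<&String> {
        self.rows.get(&(community, id))
    }
}

/// Fire a workflow under the community resolved from the request host.
/// Fails closed with the same generic error for "missing" and "other community".
pub fn trigger(
    store: &WorkflowStore,
    host_community: CommunityId,
    id: WorkflowId,
) -> Result<String, &'static str> {
    store
        .get_workflow(host_community, id)
        .cloned()
        .ok_or("workflow not found")
}
```

In this model, a workflow defined under community A is accepted when fired under A (the positive control) and rejected with 'workflow not found' when fired under B, regardless of what the caller is a member of in B.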
Multi-tenant Buzz relay: `community_id` as a first-class, server-resolved key

Makes `community_id` a first-class, server-resolved key on every scoped row: the relay derives a connection's tenant from durable data (the request host → `communities` row), never from caller input, and threads a `&TenantContext` through every scoped DB read and every Redis publish. This is the foundation for hosting Buzz multi-tenant on shared infra with provable cross-community isolation, checked against the machine-proven safety spec landed in #1285.

The fence
`TenantContext` can only be minted on the host-resolution path (`bind_community` / the reaper's per-row `TenantContext::resolved`). Everything downstream takes `&TenantContext` by reference and only reads it; nothing else constructs one from client input. Read-path caches take `CommunityId`; write/invalidate publishers take `&TenantContext` (the Redis topic key needs the host).

What's in this PR
Path-partitioned lanes, each scoping a subsystem to its tenant:
- `TenantContext` + `CommunityId`, the server-resolved tenant fence every other lane threads.
- `community_id`-native schema, `EventQuery::for_community`, by-id/by-channel reads scoped, reaper `RETURNING (community, host)` per archived row, archived-identities composite-PK fix, `api_tokens` lookups scoped.
- u-URL host is per-tenant, not config-global.
- `buzz:{community}:channel:{channel}` / `:global`), tenant-scoped topic refcounts; presence/typing isolation covered.
- `ChannelScope` enum closes the channel-less fence hole; privacy-kind exclusions restored at the FTS storage layer.
- `audit_log` DDL; `NewAuditEntry.community_id` widened to `CommunityId`.
- `get_workflow`); community-scoped durable claim wired into the scheduler; cold-start anchor seeded so new schedules fire.
- `relay` tag bound to the per-tenant host; media + git substrate scoped by tenant; local-echo dedup, DM command writes, and reminder claim/release scoped by community; background loops get tenant from the DB row they act on (reaper per-row) or the configured `relay_url` (dev/CI reconciler, reminder scheduler); deployment-community cases with no connection tenant (git hook/finalize, workflow sink) resolve via the same seam.
- `resolve_admin_tenant` reads `RELAY_URL` host → `lookup_community_by_host`, fail-closed; membership-list publish, reconcile, and existence query scoped.
- `pending_lane` breadcrumbs rather than faked green.
- `reindex_kind0` backfill binary and the Typesense subsystem from the Helm chart, local dev/test stacks, and CI (obsolete under Postgres FTS).

Behavior changes called out for review
- `.well-known/nostr.json` is now host-bound (was single-tenant off `config.relay_url`): binds community from the request Host header, falls through to empty `{names,relays}` on an unmapped host.
- `BUZZ_RECONCILE_CHANNELS` reconciles the configured community only (dev/CI single-community). In a multi-community deploy the safe failure mode is incomplete reconcile, never cross-tenant access.

Verification
- `cargo check --workspace` green; `cargo fmt --check` clean; clippy clean across the touched crates (`-D warnings`).
- `buzz-db`, `buzz-audit`, `buzz-relay` incl. `--include-ignored --test-threads=1`, `buzz-admin` compiles).
- `audit_records_caller_actor_not_relay_signer_for_relay_signed_event`: proven to bite by reverting the fix (recorded the relay signer instead of the caller actor; restored, green).

Provenance
Every commit carries a `Signed-off-by` trailer, so the branch is DCO-clean. Authorship/trailers vary across the branch: earlier subsystem lanes are authored and self-signed-off by `tlongwell-block`; later agent-authored fix/test lanes generally carry the responsible human operator in `Co-authored-by` + `Signed-off-by: Tyler Longwell <tlongwell@block.xyz>`, with the remaining agent-authored commits self-signed by the committing agent.

Based on the #1285 safety floor (`main@2ecdcce7b`). Supersedes #1259 (Typesense removal folded in here).
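For reviewers who want the fence in one screen, the host-resolution path described in this PR can be sketched roughly as follows. This is a minimal illustration, not the buzz-core code: the exact normalization rule (lowercase, drop a default `:443`/`:80` port, drop one trailing dot), the `u64` community id, the `HashMap` standing in for the `communities` table, and the `bind_community` signature are all assumptions. Only the names `normalize_host`, `bind_community`, `TenantContext::resolved`, and `CommunityId` come from the PR.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CommunityId(pub u64); // assumption: the real id type is not shown here

/// Deliberately no Default and no deserialization: a tenant is resolved, not parsed.
#[derive(Debug)]
pub struct TenantContext {
    community: CommunityId,
    host: String,
}

impl TenantContext {
    /// Intended to be called only from host resolution (a review fence, not a compiler fence).
    pub fn resolved(community: CommunityId, host: String) -> Self {
        TenantContext { community, host }
    }
    pub fn community(&self) -> CommunityId {
        self.community
    }
    pub fn host(&self) -> &str {
        &self.host
    }
}

/// Assumed canonicalization: lowercase, strip a default port, strip one trailing dot.
pub fn normalize_host(host: &str) -> String {
    let lower = host.trim().to_ascii_lowercase();
    let h = lower.as_str();
    let h = h
        .strip_suffix(":443")
        .or_else(|| h.strip_suffix(":80"))
        .unwrap_or(h);
    let h = h.strip_suffix('.').unwrap_or(h);
    h.to_string()
}

/// Resolve the tenant from the request host; fail closed (None) on an unmapped host.
/// `communities` is a hypothetical stand-in for the host-to-community table.
pub fn bind_community(
    communities: &HashMap<String, CommunityId>,
    request_host: &str,
) -> Option<TenantContext> {
    let host = normalize_host(request_host);
    let community = *communities.get(&host)?;
    Some(TenantContext::resolved(community, host))
}
```

Downstream code in this model would accept `&TenantContext` and call `community()` or `host()`; nothing in it takes a community from request input. Under the assumed rule, `Relay-A.Example.:443` and `relay-a.example` resolve to the same tenant, and an unknown host yields `None`.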