private links: model links as rows with controller-observed status #3063

Draft
jshearer wants to merge 1 commit into master from jshearer/privatelinks_fast_follow
@jshearer jshearer commented Jun 19, 2026

Reworks private networking so each configured private link is a first-class row with a system-assigned id and a data-plane-controller-owned status, replacing the flat data_planes.private_links JSON array paired with the separate *_link_endpoints output columns. This is the flow-repo and database side; the est-dry-dock change that turns on real failed status is a tracked follow-up, and this PR stands on its own without it.

This is the base of a stacked pair: #3070 sits on top and cuts the controller over to reading the table directly.

Background

The original private networking API exposed the raw primitives we already had: the private_links column as editable desired config, and the three *_link_endpoints columns as the controller's output. Building a useful UI on top surfaced two problems.

  • There is no reliable way to tie a provisioned endpoint back to the link that produced it. The only handle is matching on service_name/service_attachment, a value that lives in an opaque blob whose shape est-dry-dock owns. The controller knows the exact input-to-output correspondence at apply time but discards it before anything reaches Supabase, so every consumer reconstructs downstream a correspondence that was unambiguous at the source. If that shape ever drifts, the match silently breaks and a healthy link reads as unprovisioned.
  • We want to show real per-link status (pending, provisioned with details, failed with a reason), but we never persisted it. The only signal was "did an endpoint show up", and a single failing link tends to wedge the whole converge rather than reporting per-link.

The fix is to give each link a stable identity and persist its observed status alongside its desired config, owned by the controller, so the API reads status rather than reverse-engineering it. The desired config stays user-owned; the observed status is controller-owned, the usual reconcile split.

What changed

  • DB (new migration): data_plane_private_links table. Each row has a system id, the parent data_plane_id, provider, the polymorphic config, a generated service_identity, and controller-owned status / details / error / observed_at, unique on (data_plane_id, service_identity). RLS and grants mirror data_planes. One trigger wakes the parent's controller task on any change via internal.send_to_task (prompt convergence the column path never had) and, during the transition, projects rows back into the private_links column so the not-yet-cut-over controller keeps working. The migration backfills one row per existing array element.
  • data-plane-controller: after a converge, records per-row status by matching endpoint outputs on (provider, service_identity): a match means provisioned, with the endpoint stored as details; no match means pending. This matching is a temporary bridge until est-dry-dock echoes the id. The controller keeps reading desired links from the projected private_links column, so it has no deploy-ordering dependency on the agent-api cutover; moving its read onto the table is the stacked cutover, #3070 ("private-networking: controller reads links from the table, retire the projection").
  • control-plane-api: privateLinks now returns a row object { id, provider, config, status, details, error, observedAt } read from the table. The wholesale updateDataPlanePrivateLinks mutation is replaced with per-link addDataPlanePrivateLink / updateDataPlanePrivateLink / removeDataPlanePrivateLink, authorized by ModifyDataPlanePrivateNetworking on the owning data plane, with the duplicate-identity case reported clearly. The config union is renamed PrivateLinkConfig in the schema; the new row type takes the PrivateLink name.
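The duplicate-identity case can be illustrated with a minimal in-memory sketch. It is not the PR's code: `LinkTable`, `AddError`, and the string keys are hypothetical stand-ins for the `data_plane_private_links` table and its unique constraint on (data_plane_id, service_identity), and the real guard presumably lives in the database constraint plus the API's error mapping.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the `data_plane_private_links` table.
/// The map key models the unique constraint on (data_plane_id, service_identity).
#[derive(Default)]
struct LinkTable {
    next_id: u64,
    rows: HashMap<(String, String), u64>,
}

/// Hypothetical error type; the real API's error shape is not shown in this PR text.
#[derive(Debug, PartialEq)]
enum AddError {
    DuplicateIdentity { existing_id: u64 },
}

impl LinkTable {
    /// Adds a link and returns its system-assigned id, or reports the row
    /// that already owns this identity on the same data plane.
    fn add(&mut self, data_plane_id: &str, service_identity: &str) -> Result<u64, AddError> {
        let key = (data_plane_id.to_string(), service_identity.to_string());
        if let Some(&existing_id) = self.rows.get(&key) {
            return Err(AddError::DuplicateIdentity { existing_id });
        }
        self.next_id += 1;
        self.rows.insert(key, self.next_id);
        Ok(self.next_id)
    }
}
```

Under these assumptions, adding the same identity twice to one data plane returns `DuplicateIdentity` carrying the existing row's id, while the same identity on a different data plane is accepted, because uniqueness is scoped to the parent data plane.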

Rollout

Stacked so each step is independently safe to roll back, which matters given the prior private-links rollback caused by deploy ordering.

  1. This PR (additive): the table, trigger, backfill, the per-link CRUD API, and the controller writing per-row status. The controller still reads desired links from the projected private_links column, so the API and controller binaries have no ordering dependency on each other.
  2. #3070, "private-networking: controller reads links from the table, retire the projection" (cutover): the controller reads links from the table and the projection trigger becomes wake-only. Deploy the controller binary before its migration.
  3. Follow-up cleanup: drop the legacy private_links and *_link_endpoints columns, recreate the data_planes_overview view once, and remove the now-redundant endpoint resolvers.

Status semantics

Today the controller records pending and provisioned by matching existing endpoint outputs. failed with a reason needs est-dry-dock to emit a per-link result addressed by the id and to handle per-link errors so one bad link does not wedge the converge; the status columns are already shaped for it. That is a separate est-dry-dock change.
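The matching described above can be sketched as follows. This is an illustration under assumptions, not the controller's actual code: `Status`, `LinkRow`, `EndpointOutput`, and `observe_statuses` are hypothetical names, and `details` is reduced to an opaque string. The `Failed` variant is present only because the status columns are shaped for it; nothing in this matching produces it.

```rust
use std::collections::HashMap;

/// Controller-observed status of one link row (hypothetical representation).
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending,
    Provisioned { details: String },
    /// Never produced by the matching below; it awaits per-link results from est-dry-dock.
    Failed { error: String },
}

/// A desired link row, reduced to the fields the matching needs.
struct LinkRow {
    id: u64,
    provider: String,
    service_identity: String,
}

/// One endpoint output from a converge, reduced to its matching key and an opaque blob.
struct EndpointOutput {
    provider: String,
    service_identity: String,
    details: String,
}

/// Computes a status per row id: a matching output means provisioned, none means pending.
fn observe_statuses(rows: &[LinkRow], outputs: &[EndpointOutput]) -> HashMap<u64, Status> {
    // Index outputs by (provider, service_identity) for constant-time lookup.
    let by_key: HashMap<(&str, &str), &str> = outputs
        .iter()
        .map(|o| ((o.provider.as_str(), o.service_identity.as_str()), o.details.as_str()))
        .collect();

    rows.iter()
        .map(|row| {
            let key = (row.provider.as_str(), row.service_identity.as_str());
            let status = match by_key.get(&key) {
                Some(details) => Status::Provisioned { details: (*details).to_string() },
                None => Status::Pending,
            };
            (row.id, status)
        })
        .collect()
}
```

Because the key includes the provider, two rows with the same identity string under different providers resolve independently. Once results are addressed by id, the lookup key would presumably become the row id and the `Failed` arm would become reachable.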

Testing

  • models: unit tests for service_identity() and provider().
  • control-plane-api: a snapshot of the table-backed privateLinks (AWS provisioned with details, Azure and GCP pending), plus add/update/remove and authorization tests including the duplicate-identity guard. Full suite green.
  • data-plane-controller: new queries validated against the live schema; existing suite green.
  • Regenerated the flow-client GraphQL SDL and the .sqlx offline cache.

Follow-ups (not in this PR)

  • est-dry-dock emits a per-link result addressed by the id; the controller then switches from (provider, service_identity) matching to id-addressed and populates real failed / error.
  • A controller integration test that runs a converge and asserts the per-row status write, plus pgTAP coverage for the wake trigger.

@jshearer jshearer force-pushed the jshearer/privatelinks_fast_follow branch from 21cd0a4 to 2a55fa9 Compare June 22, 2026 15:37
@jshearer jshearer changed the title private-networking: model links as rows with controller-observed status private links: model links as rows with controller-observed status Jun 22, 2026
@jshearer jshearer self-assigned this Jun 23, 2026
@jshearer jshearer force-pushed the jshearer/privatelinks_fast_follow branch from 2a55fa9 to be57daf Compare June 23, 2026 00:39
The flat `private_links` array plus separate `*_link_endpoints` output columns gave no reliable way to tie an endpoint back to its link and persisted no per-link status. Promote each link to a `data_plane_private_links` row with a stable id and controller-owned status, so the API reads status instead of reconstructing it from an opaque, externally-owned shape.

* New table + migration: per-link `config`, generated `service_identity`, controller-owned `status`/`details`/`error`, unique on `(data_plane_id, service_identity)`; a trigger wakes the controller and projects rows back into `private_links` during the transition.
* dpc records `pending`/`provisioned` by matching provisioned endpoints to links on `(provider, service_identity)`, and keeps reading desired links from the projected `private_links` column so it has no deploy-ordering dependency on the agent-api cutover.
* agent-api exposes `privateLinks` as rows and replaces the wholesale mutation with per-link add/update/remove; `failed` awaits a follow-up est-dry-dock change.
@jshearer jshearer force-pushed the jshearer/privatelinks_fast_follow branch from be57daf to eb61a36 Compare June 29, 2026 20:50