fix(routers/ovn): leftover uplink ports #2093

Open
skrobul wants to merge 4 commits into main from ovn-fix-multiple-uplinks

skrobul (Collaborator) commented Jun 23, 2026

Fix OVN uplink LSP leak on router subnet detachment

When the last subnet is detached from a router (openstack router remove subnet), the ML2 driver is supposed to clean up the uplink it created: remove the trunk subport, delete the OVN localnet LSP, and delete the shared Neutron port. In practice, the shared Neutron port and its OVN LSP (named by the port UUID, <port-uuid>) were left behind.

Root causes

Two independent bugs combined to cause the leak:

1. Cleanup anchored to a segment that no longer exists

_do_uplink_cleanup started by looking up the dynamic VLAN segment for the network. On subnet detachment, Neutron releases that segment before the ROUTER_INTERFACE AFTER_DELETE event fires. The lookup returned None and the entire cleanup was skipped, including the shared port deletion.

Fix: anchor cleanup on the shared Neutron port (uplink-<segment_id>) instead — it persists until we delete it. The segment ID is derived from the port's name, eliminating the dependency on a live segment.
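
The name-based anchoring described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Port` dataclass and the helpers `segment_id_from_port_name` and `find_uplink_port` are hypothetical names, and the sketch assumes the shared port is named exactly `uplink-<segment_id>` with a UUID segment ID.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, Optional

UPLINK_PREFIX = "uplink-"  # naming convention taken from the PR description


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for a Neutron port object."""
    id: str
    name: str
    network_id: str


def segment_id_from_port_name(name: str) -> Optional[str]:
    """Return the segment ID embedded in an uplink port name, else None."""
    if not name.startswith(UPLINK_PREFIX):
        return None
    candidate = name[len(UPLINK_PREFIX):]
    try:
        # Assumption: segment IDs are UUIDs; anything else is rejected.
        return str(uuid.UUID(candidate))
    except ValueError:
        return None


def find_uplink_port(ports: Iterable[Port], network_id: str) -> Optional[Port]:
    """Scan ports on a network for the first one that follows the uplink naming."""
    for port in ports:
        if port.network_id == network_id and segment_id_from_port_name(port.name):
            return port
    return None
```

Because the segment ID comes from the port name, this lookup still works after Neutron has released the dynamic segment.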

2. OVO .delete() leaves the OVN LSP behind

shared_port.delete() is an oslo.versionedobjects (OVO) call that only removes the database row. It bypasses the ML2 mechanism manager, so the OVN mechanism driver's delete_port_postcommit never runs and the port's LSP (named by port UUID) is never removed.

Fix: add delete_shared_port_lsp(), which explicitly calls delete_lswitch_port(port_id, ...) on the OVN NB IDL before the OVO delete, mirroring the existing delete_uplink_port pattern for the localnet LSP.

Both fixes apply to both deletion paths: subnet detachment (ROUTER_INTERFACE AFTER_DELETE) and router deletion (PORT PRECOMMIT_DELETE).
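
The ordering in fix 2 (OVN LSP first, then the database row) can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `NbApi`, `SharedPort`, and `cleanup_shared_port` are hypothetical stand-ins, the stub `delete_lswitch_port` is assumed to apply the deletion immediately (the real NB API may instead return a command that has to be executed in a transaction), and the `neutron-<network_id>` logical switch name is an assumption based on the OVN mechanism driver's naming convention.

```python
from typing import Protocol


class NbApi(Protocol):
    """Hypothetical slice of the OVN NB API used by this sketch."""

    def delete_lswitch_port(self, lport_name: str, lswitch_name: str,
                            if_exists: bool = True) -> None: ...


class SharedPort(Protocol):
    """Hypothetical stand-in for the shared port OVO."""
    id: str
    network_id: str

    def delete(self) -> None: ...


def delete_shared_port_lsp(nb: NbApi, port: SharedPort) -> None:
    """Remove the regular port LSP, which is named by the port UUID."""
    # Assumption: the logical switch is named "neutron-<network_id>".
    nb.delete_lswitch_port(port.id, f"neutron-{port.network_id}", if_exists=True)


def cleanup_shared_port(nb: NbApi, port: SharedPort) -> None:
    """Delete the OVN LSP explicitly, then remove the database row."""
    delete_shared_port_lsp(nb, port)  # OVN side first; the OVO delete will not do it
    port.delete()                     # OVO delete: removes only the DB row
```

Passing `if_exists=True` keeps the cleanup idempotent if the LSP has already been removed by another path.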

Also included

  • Documentation of the router interface lifecycle in neutron-networking.md, covering the uplink port concept, the two OVN LSPs it produces, the creation flow, and why two event handlers are needed with different port-count thresholds.
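
As a rough illustration of why the two handlers might need different "last port" thresholds: the PR text does not state the actual values, so the numbers below are an assumption. They rest on the idea that a precommit handler still sees the port being deleted, while an after-delete handler does not. `DeletionPath`, `LAST_PORT_THRESHOLD`, and `is_last_router_port` are hypothetical names.

```python
from enum import Enum


class DeletionPath(Enum):
    ROUTER_DELETE = "PORT PRECOMMIT_DELETE"
    SUBNET_DETACH = "ROUTER_INTERFACE AFTER_DELETE"


# Assumed thresholds, not taken from the PR:
LAST_PORT_THRESHOLD = {
    # Precommit: the port being deleted is still counted.
    DeletionPath.ROUTER_DELETE: 1,
    # After delete: the port is already gone from the count.
    DeletionPath.SUBNET_DETACH: 0,
}


def is_last_router_port(path: DeletionPath, router_ports_on_network: int) -> bool:
    """Return True when the remaining port count means the uplink can be cleaned up."""
    return router_ports_on_network <= LAST_PORT_THRESHOLD[path]
```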

skrobul added 4 commits June 23, 2026 18:46
Prior to this change, `uplink-*` ports were not cleaned up in all
cases. For example, they were not removed when the subnet was detached
from the router, although they were cleaned up when the router was
deleted altogether.

Explains the event-driven machinery behind router interface
attach/detach: what the uplink port is and why its VLAN tag is
leaf-local, the creation flow, why two deletion handlers are needed
(PORT PRECOMMIT_DELETE for router delete, ROUTER_INTERFACE AFTER_DELETE
for subnet detach), and why the "last port" count threshold differs
between them.

Two bugs prevented full cleanup when the last router subnet was detached:

1. _do_uplink_cleanup depended on fetch_router_network_segment, but
   Neutron releases the dynamic VLAN segment before ROUTER_INTERFACE
   AFTER_DELETE fires. The lookup returned None and the entire cleanup
   was skipped. Fix: anchor cleanup to the shared Neutron port
   (uplink-<seg_id>) by scanning ports on the network for the uplink-
   prefix — that port survives until we delete it. Remove the now-dead
   fetch_router_network_segment and its misleading error log.

2. shared_port.delete() (OVO) only removed the DB row; the OVN LSP
   was never deleted because OVO bypasses the ML2 mechanism manager.
   Fix: add delete_shared_port_lsp() that calls delete_lswitch_port
   by port UUID, mirroring the existing delete_uplink_port pattern.

Both fixes apply to the subnet-detachment path (ROUTER_INTERFACE
AFTER_DELETE) and the router-deletion path (PORT PRECOMMIT_DELETE).

The uplink Neutron port has two OVN LSPs, not one: the localnet LSP
(uplink-<segment_id>, created explicitly by understack) and a regular
port LSP (named by port UUID, created automatically by the OVN mech
driver). Document both and add the delete_shared_port_lsp step to the
_do_uplink_cleanup flowchart, which was missing from the diagram.
skrobul force-pushed the ovn-fix-multiple-uplinks branch from c079c96 to fffb872 June 23, 2026 17:47
skrobul marked this pull request as ready for review June 23, 2026 18:03
skrobul requested a review from a team June 23, 2026 18:41

stevekeay (Contributor) left a comment

LGTM but Milan should look at this because he is way more familiar with that code
