fix: atomic subsystem namespace slot check via FDB transaction by boddumanohar · Pull Request #1065 · simplyblock/sbcli

boddumanohar · 2026-05-26T22:03:51Z

Problem

Parallel clone() and add_lvol_ha() requests with namespaced=True share a subsystem (NQN). All threads read the same namespace count from DB, each decides the slot is free, and each writes an lvol with the same NQN — resulting in more lvols than max_namespace_per_subsys in one subsystem.

This is a classic TOCTOU race: the check (get_next_available_subsystem_on_node) and the act (write_to_db) were not atomic.
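The interleaving can be reproduced deterministically in a few lines. This is a toy sketch, not sbcli code: the dict stands in for the DB, and `count_live`, `MAX_NS_PER_SUBSYS` and the record layout are invented for illustration.

```python
# Toy reproduction of the check-then-act race (hypothetical names, not sbcli code).
MAX_NS_PER_SUBSYS = 1   # assumed limit, kept small so the overflow is easy to see
db = {}                 # stands in for the lvol table


def count_live(db, nqn):
    """Count lvol records that currently occupy a namespace slot on this NQN."""
    return sum(1 for rec in db.values() if rec["nqn"] == nqn)


# Step 1: both requests run the check before either has written anything.
slot_free_a = count_live(db, "nqn-x") < MAX_NS_PER_SUBSYS
slot_free_b = count_live(db, "nqn-x") < MAX_NS_PER_SUBSYS

# Step 2: both requests act on their (now stale) decision.
if slot_free_a:
    db["lvol-a"] = {"nqn": "nqn-x"}
if slot_free_b:
    db["lvol-b"] = {"nqn": "nqn-x"}

over_allocated = count_live(db, "nqn-x") > MAX_NS_PER_SUBSYS
print(count_live(db, "nqn-x"), over_allocated)
```

Both checks pass because neither write has happened yet, so the subsystem ends up with two lvols against a limit of one.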

Fix

Replace the bare lvol.write_to_db() with a new db_controller.write_lvol_with_ns_check(lvol) that wraps both the count check and the write in a single FDB transaction:

def _write_lvol_with_ns_check_tx(self, tr, node_id, nqn, max_ns, lvol_key, lvol_data):
    live = 0
    for _, v in tr.get_range_startswith(b"object/LVol/"):
        d = json.loads(v)
        if (d.get("node_id") == node_id and d.get("nqn") == nqn
                and d.get("status") not in (STATUS_IN_DELETION, STATUS_DELETED)):
            live += 1
    if live >= max_ns:
        return False
    tr[lvol_key] = lvol_data
    return True

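FDB's optimistic concurrency control (OCC) makes this safe: the range read over object/LVol/ is recorded as a read conflict range, so if two transactions both read it and then try to commit, the second one to commit fails its conflict check, is retried against fresh data (assuming the function runs inside FDB's standard retry loop, for example via @fdb.transactional), and correctly sees that the slot is now taken. There is no explicit lock and no queueing of unrelated requests. Parallel creates on different subsystems are never rejected by this check, although because the read covers the whole object/LVol/ prefix they can occasionally incur an OCC retry of their own.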
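The commit-time behaviour can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: `ToyStore`, `ToyTx` and `Conflict` are not FDB APIs, the status strings are placeholders, and the conflict rule is a simplified model of read conflict ranges. Only the body of `ns_check_and_write` follows the function shown in this PR.

```python
import json

# Hypothetical status values; the real constants live in sbcli.
STATUS_ONLINE, STATUS_IN_DELETION, STATUS_DELETED = "online", "in_deletion", "deleted"
PREFIX = b"object/LVol/"


class Conflict(Exception):
    """A key this transaction read was changed by a later commit."""


class ToyStore:
    """In-memory stand-in for a versioned key-value store. This is not FDB."""

    def __init__(self):
        self.data = {}          # key -> value
        self.written_at = {}    # key -> commit version of the last write
        self.version = 0

    def begin(self):
        return ToyTx(self)


class ToyTx:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version
        self.snapshot = dict(store.data)   # reads see the state at begin()
        self.read_prefixes = []
        self.writes = {}

    def get_range_startswith(self, prefix):
        self.read_prefixes.append(prefix)  # remember which range we depended on
        return [(k, v) for k, v in sorted(self.snapshot.items()) if k.startswith(prefix)]

    def __setitem__(self, key, value):
        self.writes[key] = value           # buffered until commit()

    def commit(self):
        # Fail if any key under a prefix we read was written after our read version.
        for key, ver in self.store.written_at.items():
            if ver > self.read_version and any(key.startswith(p) for p in self.read_prefixes):
                raise Conflict(key)
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.written_at[key] = self.store.version


def ns_check_and_write(tr, node_id, nqn, max_ns, lvol_key, lvol_data):
    """Same logic as the PR's transactional function, minus `self`."""
    live = 0
    for _, v in tr.get_range_startswith(PREFIX):
        d = json.loads(v)
        if (d.get("node_id") == node_id and d.get("nqn") == nqn
                and d.get("status") not in (STATUS_IN_DELETION, STATUS_DELETED)):
            live += 1
    if live >= max_ns:
        return False
    tr[lvol_key] = lvol_data
    return True


def record(nqn):
    return json.dumps({"node_id": "node-1", "nqn": nqn, "status": STATUS_ONLINE}).encode()


store = ToyStore()
tx_a, tx_b = store.begin(), store.begin()          # both start from the empty state
ok_a = ns_check_and_write(tx_a, "node-1", "nqn-x", 1, PREFIX + b"a", record("nqn-x"))
ok_b = ns_check_and_write(tx_b, "node-1", "nqn-x", 1, PREFIX + b"b", record("nqn-x"))
tx_a.commit()                                      # first committer wins
try:
    tx_b.commit()
    conflicted = False
except Conflict:
    conflicted = True                              # second committer read stale data

retry = store.begin()                              # what a retry loop would do next
ok_retry = ns_check_and_write(retry, "node-1", "nqn-x", 1, PREFIX + b"b", record("nqn-x"))
```

In this model both transactions pass the count check, but only the first commit succeeds; the retried transaction sees the committed lvol and returns False, so the subsystem stays at its limit of one.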

Callers changed

| File | Location |
| --- | --- |
| `snapshot_controller.py` | `clone()` (namespaced clones) |
| `lvol_controller.py` | `add_lvol_ha()` (namespaced lvol creates) |

Both return a retryable error ("Subsystem namespace limit reached concurrently; retry") instead of silently over-allocating when the in-transaction check finds that the subsystem is already full.
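A caller-side view of that error might look like the sketch below. The `(result, error)` return shape, the function names, and the retry loop are assumptions for illustration and are not taken from the diff; only the error string comes from this PR.

```python
RETRYABLE_MSG = "Subsystem namespace limit reached concurrently; retry"


def add_namespaced_lvol(write_with_ns_check, lvol):
    """Hypothetical caller: map a failed slot check to the retryable error."""
    if not write_with_ns_check(lvol):
        return False, RETRYABLE_MSG
    return lvol["uuid"], None


def create_with_retry(pick_subsystem, write_with_ns_check, lvol, attempts=3):
    """Hypothetical client loop: pick a subsystem again after a retryable error."""
    err = "no attempts made"
    for _ in range(attempts):
        lvol["nqn"] = pick_subsystem()
        result, err = add_namespaced_lvol(write_with_ns_check, lvol)
        if err != RETRYABLE_MSG:
            return result, err      # success (err is None) or a non-retryable error
    return False, err


# Fakes for the demo: "nqn-1" is already full, "nqn-2" still has room.
used = {"nqn-1": 1, "nqn-2": 0}
candidates = iter(["nqn-1", "nqn-2"])


def fake_write(lvol):
    """Stand-in for write_lvol_with_ns_check with a limit of one namespace."""
    if used[lvol["nqn"]] >= 1:
        return False
    used[lvol["nqn"]] += 1
    return True


new_lvol = {"uuid": "lvol-42"}
result, err = create_with_retry(lambda: next(candidates), fake_write, new_lvol)
```

The first attempt targets the full subsystem and gets the retryable error; the second attempt picks a different subsystem and succeeds.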

Why not a mutex?

A per-node lock would serialise all clone requests on a node (even ones targeting different subsystems), adding ~15–25 ms of queue wait per request under parallel load. OCC only serialises the rare actual conflict.

Diff size

55 lines across 3 files.

🤖 Generated with Claude Code

Parallel clone/create requests sharing a subsystem (namespaced=True) all read the same namespace count from DB and can each decide the slot is free, resulting in more lvols than max_namespace_per_subsys being written to one NQN.

Fix: replace the bare lvol.write_to_db() call with a single FDB transactional function (write_lvol_with_ns_check) that re-counts active namespaces for the target NQN inside the transaction and writes the new lvol record only when the subsystem still has room.

Because the range-read (b'object/LVol/') and the write share one FDB transaction, concurrent writers that race on the same NQN trigger an OCC conflict on commit. FDB automatically retries the loser with fresh data, serialising the slot allocation without any explicit lock; parallel creates on *different* subsystems continue to run without waiting on any lock.

Affected callers:
- snapshot_controller.clone() (namespaced clones)
- lvol_controller.add_lvol_ha() (namespaced lvol creates)

Both paths now return a retryable error instead of silently over-allocating when the slot is taken after the OCC conflict is resolved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

boddumanohar wants to merge 1 commit into main from fix/parallel-clone-subsys-namespace-race
