test(jepsen): add ZSet safety workload with model-based checker#550
Open
bootjp wants to merge 32 commits into
Open
test(jepsen): add ZSet safety workload with model-based checker#550bootjp wants to merge 32 commits into
bootjp wants to merge 32 commits into
Conversation
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a Jepsen workload (
elastickv-zset-safety-test) that verifies ZSet-specific safety properties under faults (network partitions, node kills), using a custom model-based checker. Goes beyond the simple add->read visibility check.Properties verified:
Concurrent-ZADD handling uses an invoke/complete windowing approach. A mutation is "committed before" a read iff its
:completeindex is strictly less than the read's:invokeindex. Mutations whose intervals overlap are "concurrent" and contribute to a per-member allowed-score set. Indeterminate (:info) mutations are treated as possibly-concurrent. ZINCRBY whose response is unknown sets:unknown-score?so the checker skips the strict score check for concurrent reads. ZREM carries the actualremoved?boolean from the server reply so a no-op ZREM does not falsely mark the member as deleted.Files
jepsen/src/elastickv/redis_zset_safety_workload.cljjepsen/test/elastickv/redis_zset_safety_workload_test.clj(test-spec construction + checker edge cases: no-op ZREM, :info ZINCRBY, deterministic score mismatch).jepsen/src/elastickv/jepsen_test.clj(entry point added)..github/workflows/jepsen-test.yml(5s smoke run on every push)..github/workflows/jepsen-test-scheduled.yml(150s default run every 6h).Running locally
The workload is invoked directly via its own
-main, not throughjepsen-test/-main:cd jepsen lein run -m elastickv.redis-zset-safety-workload \ --time-limit 60 --rate 10 --concurrency 5 \ --ports 63791,63792,63793 --host 127.0.0.1(
elastickv.jepsen-testexposeselastickv-zset-safety-testonly as a Clojure function for REPL use; CI and ad-hoc runs use the namespace's own-main.Test plan
EOF
)
Summary by CodeRabbit