Fix ResizableFile.write() crash when a single resize is not enough#199
Open
Generalsimus wants to merge 1 commit into
Open
Fix ResizableFile.write() crash when a single resize is not enough#199Generalsimus wants to merge 1 commit into
Generalsimus wants to merge 1 commit into
Conversation
Two bugs in ResizableFile:
1. write() used `if` instead of `while` for the resize guard.
A single doubling of the mmap may still not be large enough if the
value being written is bigger than 2× the current mmap size. The
assignment then fails with:
IndexError: mmap slice assignment is wrong size
The fix loops until the mmap is actually large enough, and also uses
max(doubled_size, offset+size) so one iteration always suffices even
for arbitrarily large values.
2. Both resize call-sites caught only SystemError as the fallback trigger
for __extand(). On Linux with Python 3, mmap.resize() raises OSError
(not SystemError) when mremap() fails. The fallback was therefore
never reached on Linux, leaving the mmap at its original size and
causing the write to crash. Fix catches both exception types.
Observed in the wild with Patroni 4.1.3 / pysyncobj 0.3.15 on Ubuntu
24.04 (Python 3.12). When a late-joining raft node receives its first
catch-up batch from the leader the batch can be several kilobytes; with
a 1 KB initial mmap a single resize to 2 KB is still not enough, hitting
bug 1. Bug 2 masks the __extand() fallback on every Linux deployment.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What breaks and why
ResizableFile.write()contains two bugs that together cause anIndexError: mmap slice assignment is wrong sizecrash on Linux when alate-joining raft node receives its first large catch-up batch from the leader.
Bug 1 —
ifinstead ofwhile(wrong size after one resize)resizeFactoris2.0, so the mmap doubles once. If the value beingwritten is larger than
2 × current_mmap_size, one doubling is notenough and the assignment crashes.
Concrete example (Patroni / pysyncobj 0.3.15, Ubuntu 24.04, Python 3.12):
The journal starts at 1 KB (
initialSize=1024). When a new node joins anexisting 3-node cluster it receives the full committed raft log as a
catch-up batch. That batch includes the initial DCS bootstrap config
(PostgreSQL parameters, pg_hba, etc.) which serialises to several KB.
The first write is at offset 40 with
size ≈ 3–5 KB. One resize bringsthe mmap to 2 KB — still too small — and the write crashes.
Bug 2 —
except SystemErrormissesOSErroron Linux / Python 3On Linux with Python 3,
mmap.resize()callsftruncate+mremap.When
mremapfails it raisesOSError, notSystemError. The__extandfallback that re-opens the file is therefore silently skippedon every Linux deployment, leaving the mmap at its original size.
Fix
if→whileso we keep resizing until the mmap is actuallylarge enough.
max(doubled_size, offset + size)so a single iteration alwayscovers arbitrarily large values without looping.
(SystemError, OSError)in both resize call-sites so the__extandfallback works on Linux.How to reproduce
Start a 3-node pysyncobj/Patroni cluster simultaneously. The third node
to join (which receives the catch-up batch rather than bootstrapping from
scratch) will log:
and then loop indefinitely unable to reach the leader, while the other
two nodes operate normally.
Observed with: pysyncobj 0.3.15, Patroni 4.1.3, Python 3.12, Ubuntu 24.04 (Linux 6.x, KVM).