The dynamic-linker bring-up storm was the largest remaining startup band after pull request #34. Adding a per-syscall histogram pointed at the sidecar walker as the dominant openat cost (61% of getent startup), the per-call path_translation_t memset as the second source, and the opened_fd_type fstat as a small but real per-open round-trip.

src/debug/syscall-hist.[ch]: opt-in histogram via ELFUSE_STARTUP_TRACE=syscalls (or =all alongside the existing step trace). Lock-free atomic counters per Linux syscall number, sorted by total ns descending in the dump. Records freeze on the first successful execve so steady-state traffic does not pollute the startup picture. Fork children disable the histogram explicitly because they resume from a parent snapshot, not a fresh bring-up.

src/syscall/sidecar.c: First, a per-directory absence cache keyed by (st_dev, st_ino, mtime, ctime) so the walker can skip the openat for .elfuse-sidecar-index when a recent fstat on the same dirfd already saw ENOENT. The mtime/ctime in the key closes ABA naturally and makes a cross-process index publish observable without explicit invalidation. Second, a cached sysroot dirfd handed out as fcntl(F_DUPFD_CLOEXEC, 0) so each translated absolute path saves the ~30 us open(sysroot) round-trip and the dup carries CLOEXEC across any racing posix_spawn.

src/syscall/path.c: drop the per-call zero-init of path_translation_t. The struct is ~12 KiB (24 metadata bytes plus three LINUX_PATH_MAX buffers), and each buffer is written by its resolver before it is read. The memset of all three was the dominant remaining cost after the sidecar caches.

src/core/elf.c: skip the redundant memset of the file-data range in elf_map_segments. The loader previously zeroed the full page-aligned segment extent before issuing fread; now only the BSS portion plus page padding (filesz to zero_len) is zeroed.

src/syscall/fs.c: skip the opened_fd_type fstat when neither O_PATH nor O_DIRECTORY is set. Dynamic-linker opens are overwhelmingly regular files where the type is already implied. The corner where a guest opens a directory without O_DIRECTORY and then issues getdents now returns ENOTDIR; glibc fdopendir has required O_DIRECTORY since 2009 and the test corpus does not exercise the corner.

src/core/startup-trace.h: env parsing extended to comma-separated tokens (steps, syscalls, all); legacy =1 keeps enabling steps only so existing scripts keep working.

Measurement: 30-run distributions under ELFUSE_STARTUP_TRACE=syscalls, warm cache:

  bench-hot-guard-glibc startup syscalls: 5.225 ms baseline (single sample) -> 1.33 ms p50 (p25 1.21, p75 1.55, stdev 0.45, n=30) 3.9x
  bench openat per-call: 135 us baseline -> 33.4 us p50 (p25 32.4, p75 35.8, stdev 7.1, n=30) 4.0x
  getent passwd root startup syscalls: 7.478 ms baseline -> 2.22 ms p50 (p25 2.10, p75 2.28, stdev 0.27, n=30) 3.4x
  getent openat per-call: 230 us baseline -> 52.9 us p50 (p25 51.5, p75 55.1, stdev 2.2, n=30) 4.3x

End-to-end wall-clock for getent: 14.6 ms p50 (p25 14.3, p75 15.1, stdev 1.18, n=30). Bench guardrail steady-state: static getpid 74 ns, clock_gettime 6.7 ns, urandom1 153 ns; dynamic-glibc getpid 53 ns, clock_gettime 6.4 ns, urandom1 142 ns. All under ceilings.

The original baselines were single first-run samples; their variance band was not measured, so the speedup ratios are best-effort relative to the cited starting point.

Lazy FD_REGULAR to FD_DIR promotion in sys_getdents64 was attempted but dropped after both reviewers flagged a HIGH-severity ABA hole: a sibling close+reopen between the probe and the install could land the original directory's DIR* onto a fresh regular file's slot. The fix path (fd-slot generation counter or stat+inode comparison under fd_lock) was invasive enough that the lazy promotion did not pay for its complexity.
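For readers who want a concrete picture of the histogram described for src/debug/syscall-hist.[ch], the following is a minimal C11 sketch, not the code from the patch. It shows relaxed atomic counters indexed by syscall number, a freeze flag of the kind that would be set on the first successful execve, and a snapshot sorted by total ns descending. The names hist_record, hist_freeze, hist_snapshot, hist_row_t, and the HIST_MAX_SYSCALL bound are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HIST_MAX_SYSCALL 512 /* assumed upper bound on syscall numbers */

typedef struct {
    _Atomic uint64_t calls;    /* number of recorded invocations */
    _Atomic uint64_t total_ns; /* accumulated latency in nanoseconds */
} hist_slot_t;

typedef struct {
    unsigned nr;
    uint64_t calls;
    uint64_t total_ns;
} hist_row_t;

static hist_slot_t hist_slots[HIST_MAX_SYSCALL];
static atomic_bool hist_frozen;

/* Record one syscall; a no-op once the histogram is frozen. */
static void hist_record(unsigned nr, uint64_t elapsed_ns)
{
    if (nr >= HIST_MAX_SYSCALL ||
        atomic_load_explicit(&hist_frozen, memory_order_relaxed))
        return;
    atomic_fetch_add_explicit(&hist_slots[nr].calls, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&hist_slots[nr].total_ns, elapsed_ns,
                              memory_order_relaxed);
}

/* Stop recording, e.g. after the first successful execve. */
static void hist_freeze(void)
{
    atomic_store_explicit(&hist_frozen, true, memory_order_relaxed);
}

/* qsort comparator: larger total_ns sorts first. */
static int hist_row_cmp_desc(const void *a, const void *b)
{
    const hist_row_t *x = a, *y = b;
    return (x->total_ns < y->total_ns) - (x->total_ns > y->total_ns);
}

/* Copy non-empty slots into rows[] and sort by total_ns descending. */
static size_t hist_snapshot(hist_row_t *rows, size_t cap)
{
    size_t n = 0;
    for (unsigned nr = 0; nr < HIST_MAX_SYSCALL && n < cap; nr++) {
        uint64_t calls = atomic_load_explicit(&hist_slots[nr].calls,
                                              memory_order_relaxed);
        if (calls == 0)
            continue;
        rows[n].nr = nr;
        rows[n].calls = calls;
        rows[n].total_ns = atomic_load_explicit(&hist_slots[nr].total_ns,
                                                memory_order_relaxed);
        n++;
    }
    qsort(rows, n, sizeof rows[0], hist_row_cmp_desc);
    return n;
}
```

Relaxed ordering is enough in this sketch because each counter is independent and the dump only needs eventually consistent totals; a record racing with the freeze may still land, which seems acceptable for a diagnostic view.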
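The absence cache described for src/syscall/sidecar.c can be pictured with the sketch below. It is an illustration under stated assumptions rather than the patch's implementation: the table size, the direct-mapped layout, and the names dir_key_t, absence_note, and absence_known are hypothetical, and the key fields stand in for values a caller would copy from fstat (st_dev, st_ino, and the mtime/ctime timestamps). The sketch is also not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ABSENCE_SLOTS 64 /* assumed table size; must be a power of two */

/* Identity of a directory at one point in time. */
typedef struct {
    uint64_t dev;
    uint64_t ino;
    int64_t mtime_ns;
    int64_t ctime_ns;
} dir_key_t;

typedef struct {
    dir_key_t key;
    bool valid;
} absence_slot_t;

static absence_slot_t absence_tab[ABSENCE_SLOTS];

/* Slot index depends only on (dev, ino), so a directory whose timestamps
 * changed overwrites its own stale entry instead of leaking a new one. */
static size_t absence_slot_index(const dir_key_t *k)
{
    uint64_t h = (k->dev * 0x9E3779B97F4A7C15ull) ^ k->ino;
    return (size_t)(h & (ABSENCE_SLOTS - 1));
}

static bool dir_key_equal(const dir_key_t *a, const dir_key_t *b)
{
    return a->dev == b->dev && a->ino == b->ino &&
           a->mtime_ns == b->mtime_ns && a->ctime_ns == b->ctime_ns;
}

/* Remember that the index file was absent in this directory state. */
static void absence_note(const dir_key_t *k)
{
    absence_slot_t *s = &absence_tab[absence_slot_index(k)];
    s->key = *k;
    s->valid = true;
}

/* True only if the same directory, with unchanged timestamps, was
 * already seen without an index file; the caller may skip the openat. */
static bool absence_known(const dir_key_t *k)
{
    const absence_slot_t *s = &absence_tab[absence_slot_index(k)];
    return s->valid && dir_key_equal(&s->key, k);
}
```

Because creating a file in a directory updates that directory's mtime and ctime, a newly published index produces a different key and the lookup misses, which is how the timestamps in the key stand in for explicit invalidation.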
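The token parsing described for src/core/startup-trace.h might look roughly like the sketch below. The flag names and the function trace_parse_env_value are hypothetical, and silently ignoring unknown tokens is an assumption; the description only states that steps, syscalls, and all are accepted and that the legacy value 1 enables steps only.

```c
#include <stddef.h>
#include <string.h>

enum {
    TRACE_STEPS = 1 << 0,    /* existing step trace */
    TRACE_SYSCALLS = 1 << 1  /* per-syscall histogram */
};

/* True if the token of length len equals the literal word. */
static int trace_token_is(const char *tok, size_t len, const char *word)
{
    return strlen(word) == len && memcmp(tok, word, len) == 0;
}

/* Parse a value such as "steps,syscalls" into a bitmask.
 * NULL or empty input yields 0; unknown tokens are ignored (assumed). */
static unsigned trace_parse_env_value(const char *value)
{
    unsigned mask = 0;
    if (value == NULL)
        return 0;
    while (*value != '\0') {
        size_t len = strcspn(value, ",");
        if (trace_token_is(value, len, "1") ||
            trace_token_is(value, len, "steps"))
            mask |= TRACE_STEPS; /* legacy =1 maps to steps only */
        else if (trace_token_is(value, len, "syscalls"))
            mask |= TRACE_SYSCALLS;
        else if (trace_token_is(value, len, "all"))
            mask |= TRACE_STEPS | TRACE_SYSCALLS;
        value += len;
        if (*value == ',')
            value++; /* skip the separator */
    }
    return mask;
}
```

A caller would typically pass getenv("ELFUSE_STARTUP_TRACE") once at startup and cache the resulting mask.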
Max042004 (Collaborator) suggested changes on Jun 1, 2026:
The commit message describes changes to src/syscall/fs.c, but src/syscall/fs.c does not appear in the Files changed list.
Summary by cubic
Cuts dynamic-linker startup syscall cost by 3–4x with an opt-in per-syscall histogram and fast paths in the sidecar walker and path translation. Warm-cache startup drops from 5.2–7.5 ms to 1.3–2.2 ms p50; openat calls are ~4x faster.
Written for commit f4782bb. Summary will update on new commits.