AOT: split-compilation helper-runtime cache — experimental scaffold (#15889)#16022
PurHur wants to merge 3 commits into
…efault off

Infrastructure for compiling helper units into separate translation units merged at link time:
- script/emit-helper-runtime-object.php: discovers all 137 JitVmHelperLink units via reflection over *HELPER_PATH / *COMPILED_HELPERS constant pairs, lowers each in an isolated subprocess (96 unit objects emitted; 41 #15642-class crashers auto-skipped and left on the nested-lowering fallback), and writes bitcode + object + manifest per unit under build/helper-runtime-cache/<fingerprint>/.
- lib/AOT/HelperRuntimeCache.php: per-script builds bind cached helpers as extern declarations with exact types read from the unit bitcode (shared LLVMContext); tracks used units.
- Linker: appends used unit objects and injects -z muldefs (script object listed first so its ABI definitions win).
- JitVmHelperLink::ensureCompiled: cache short-circuit before nested lowering.

Status: opt-in via PHP_COMPILER_HELPER_RUNTIME_O=1 and NOT yet correct for helper-state initializers. Each unit's __init__ chain (interned-constant setup) collides under muldefs and is discarded, so bound helpers can read uninitialized state (silent exit). The blocker and the design for per-unit ctor renaming + llvm.global_ctors merging are documented in #15889.

Also sidesteps a latent php-llvm bug (createMemoryBufferWithFile references an unimported FFI class).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
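The link-order rule in the Linker bullet can be illustrated with a small sketch. It is written in Python purely for illustration (the PR's linker code is PHP), and the function name `build_link_argv`, the `cc` driver, and the `-Wl,` pass-through form are assumptions, not taken from the PR. The only facts it relies on are the ones stated above: the script object comes first, used unit objects follow, and `-z muldefs` is injected so that the first definition of a duplicated symbol is the one the linker keeps.

```python
from typing import List, Sequence


def build_link_argv(
    script_object: str,
    used_unit_objects: Sequence[str],
    output: str,
    driver: str = "cc",  # assumption: linking goes through a compiler driver
) -> List[str]:
    """Build a link command with the script object ahead of all cached units.

    With multiple definitions allowed, GNU ld keeps the first definition it
    sees, so listing the script object first lets its ABI definitions win.
    """
    argv = [driver, "-o", output, script_object]
    seen = {script_object}
    for obj in used_unit_objects:
        # Append each used unit object once, preserving first-use order.
        if obj not in seen:
            seen.add(obj)
            argv.append(obj)
    if len(argv) > 4:
        # Only needed when at least one cached unit object is linked in.
        argv.append("-Wl,-z,muldefs")
    return argv
```

A build that used no cached units produces an ordinary link line without `muldefs`, which keeps duplicate-symbol errors visible in the default configuration.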
…and crash markers (#15889)

Restructures the split-compilation cache from one global-fingerprint directory to independent per-unit entries (build/helper-runtime-cache/units/<slug>/{unit.bc,unit.o,manifest.json}):
- Freshness is per unit: sha256(core fingerprint + unit source content). The core fingerprint covers only the lowering machinery, so editing one helper re-emits exactly one unit.
- Crashing units get a persistent failed.json marker keyed by the same fingerprint: known-broken units are skipped instantly on every rerun and re-attempted automatically when their source or the compiler core changes. --force re-attempts everything.
- Each unit commits its artifacts atomically before the next starts, so an interrupted emit resumes at the unit that was interrupted.
- Consumer side reads per-unit manifests lazily (no global manifest to rebuild); binding and link-object selection are unchanged.

Measured (php-compiler:22.04-dev, 137 units):
  cold               5 m 53 s  (96 emitted, 41 crash-marked)
  warm rerun         0.15 s    (all skipped)
  one helper edited  3.0 s     (exactly 1 re-emitted)
  5 units deleted    13.8 s    (only those re-processed)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
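The per-unit freshness rule and the crash-marker behavior can be sketched as follows. This is an illustrative Python model, not the PR's PHP code: the function names and the assumption that both manifest.json and failed.json store the fingerprint under a "fingerprint" key are hypothetical. Only the sha256(core fingerprint + unit source content) formula, the two file names, and the --force behavior come from the commit message.

```python
import hashlib
import json
from pathlib import Path


def unit_fingerprint(core_fingerprint: str, unit_source: bytes) -> str:
    """sha256 over the core fingerprint followed by the unit's source bytes."""
    h = hashlib.sha256()
    h.update(core_fingerprint.encode("utf-8"))
    h.update(unit_source)
    return h.hexdigest()


def _recorded_fingerprint(path: Path):
    """Return the fingerprint stored in a JSON file, or None if unusable."""
    try:
        # Assumption: the file is a JSON object with a "fingerprint" key.
        return json.loads(path.read_text()).get("fingerprint")
    except (OSError, ValueError, AttributeError):
        return None


def decide(unit_dir: Path, fingerprint: str, force: bool = False) -> str:
    """Return 'emit', 'skip-fresh', or 'skip-crashed' for one unit."""
    if force:
        return "emit"  # --force re-attempts everything
    if _recorded_fingerprint(unit_dir / "manifest.json") == fingerprint:
        return "skip-fresh"  # artifacts match the current source and core
    if _recorded_fingerprint(unit_dir / "failed.json") == fingerprint:
        return "skip-crashed"  # known crasher, nothing changed since
    return "emit"
```

Because the crash marker is keyed by the same fingerprint as the manifest, a change to either the unit source or the compiler core invalidates both at once, which is what makes the automatic re-attempt fall out of the freshness check rather than needing separate logic.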
Third commit: the cache is now incremental and resumable — per-unit fingerprints (core machinery + unit source content) and persistent per-unit crash markers.
Crashed units are remembered by fingerprint and re-attempted only when their source or the compiler core changes: 'resume only the breaking parts', both within a run and across compiler evolution. The remaining blocker for enabling per-script consumption is unchanged (unit init chains, #15889 item 4).
Correctness matrix (fresh per-unit cache, reverted base): cache-on output == baseline == VM on every construct plain AOT supports. The earlier silent exit came from stale pre-revert cache artifacts, not from the init-chain theory. Two gaps remain before real speedups: (1) the hot helpers of hello-class scripts (WeakRefRegistry/ErrorSilence/DirHandle/Progress) lower via the ensureStandaloneBodies corpus, not JitVmHelperLink units, so coverage needs extending; (2) the 41 crash-marked units are dominated by Array*JitHelpers, which fail only in isolated emission. Both fall under #16075 step 4 (helper ABI registry).
First working scaffold for #15889 (multiple .o merged at link time), draft / default-off — see honest status below. Includes the #16010 revert as its base (branched before the revert lands on master; rebase on merge).
What works
- script/emit-helper-runtime-object.php: reflection discovery of all 137 helper units (*HELPER_PATH / *COMPILED_HELPERS pairs incl. variants), isolated per-unit subprocess lowering: 96 unit objects, 160 helpers cached, 41 #15642-class crashers auto-skipped, 22.8 MB, one-time ~6 min
- Linker: -z muldefs (script object first → ABI state single-copy)
Known blocker (why draft/off)
Each unit object carries its own __init__ chain (interned-constant setup for its helper bodies). muldefs keeps only the script's init, so cached helpers can read uninitialized state → silent exit for helper-using scripts. Fix design (in #15889): rename each unit's init to a unique symbol at emit time and register it in llvm.global_ctors, so ld keeps and libc runs all of them; helper-state init must be idempotent vs script init.
Also documented in-code: latent php-llvm bug (createMemoryBufferWithFile references unimported FFI), worked around via createMemoryBufferWithString.
Measured potential
The nested helper lowering this replaces is 47% of every build (#15889 profile); a mid-session broken-but-linking variant showed hello 3.0 s → 1.9 s and 006-project 5.3 s → 2.2 s — the win is real once init merging lands.
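The init merging referred to above (the fix design under "Known blocker") could look roughly like the sketch below. It is an illustrative Python model that generates LLVM IR text, not code from the PR: the `__init__<slug>` naming scheme and both function names are hypothetical, and the IR uses opaque-pointer syntax (older LLVM releases spell the element type with typed pointers instead of `ptr`). What it shows is the idea from the design: each unit gets a uniquely named init function and registers it in `llvm.global_ctors`, whose appending linkage lets entries from several modules coexist instead of colliding under muldefs.

```python
import re
from typing import List


def unique_init_symbol(slug: str) -> str:
    """Derive a per-unit init symbol from a unit slug (hypothetical scheme).

    Note: distinct slugs that differ only in non-identifier characters
    would collide; a real scheme would need to guard against that.
    """
    safe = re.sub(r"[^A-Za-z0-9_]", "_", slug)
    return f"__init__{safe}"


def global_ctors_ir(init_symbols: List[str], priority: int = 65535) -> str:
    """Render an llvm.global_ctors definition registering the given inits."""
    entry_type = "{ i32, ptr, ptr }"
    entries = ", ".join(
        f"{entry_type} {{ i32 {priority}, ptr @{sym}, ptr null }}"
        for sym in init_symbols
    )
    return (
        f"@llvm.global_ctors = appending global "
        f"[{len(init_symbols)} x {entry_type}] [{entries}]"
    )
```

Each renamed init would still need a run-once guard, since the design requires helper-state init to be idempotent with respect to the script's own init.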
🤖 Generated with Claude Code