
Update bug metadata crash lines #2119

Closed
linkeLi0421 wants to merge 59 commits into
google:master from
linkeLi0421:update-bug-metadata-crash-lines

Conversation

@linkeLi0421

No description provided.

linke and others added 30 commits April 8, 2026 13:37
- Add opensc_transplant_fuzz_pkcs15_reader benchmark
- Add c-blosc2_transplant_decompress_frame_fuzzer benchmark
- Fix local experiment dispatcher and measurer for local runs
- Update requirements and Makefile for generated benchmarks
Dispatch-gated transplant benchmark for nDPI with 38 bugs from OSV.
Includes build.sh, combined/harness patches, bug_metadata, dispatch-prefixed
seeds, and per-bug crash references. Updates the example command in
benchmarks/experiment_config.yaml to reference this benchmark.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Some historical benchmark images use an older glibc, so overwriting the
image's Python 3 with base-image's newer binaries made python3 unusable
during builds. Keep the parent image's Python and copy only the PyYAML
package (needed by fuzzers/utils.py) from base-image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- aflplusplus: fall back to GCC-plugin mode when AFL++ is built without
  LLVM instrumentation (afl-llvm-pass.so missing).
- coverage: install wget/unzip explicitly; prebuilt image no longer
  ships them.
- honggfuzz: define _HF_LINUX_NO_BFD to build without libbfd/libunwind.
- libafl: remove stale Rust wrappers, install current nightly, and
  symlink cargo/rustc/rustup into /usr/local/bin so PATH overrides
  aren't needed.
- symcc_aflplusplus: libstdc++-5-dev is gone from apt; use 10-dev.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop the xenial-based ndpi-merge image and build directly on
gcr.io/oss-fuzz-base/base-builder@sha256:87ca1e9e... (the canonical focal
digest used by 29 of 32 FuzzBench benchmarks). Clone nDPI in the
Dockerfile and vendor libpcap-1.9.1.tar.gz (sha256
635237637c5b619bcceba91900666b64d56ecb7be63f298f601ec786ce087094) so the
build is reproducible without the merge container.

All 35 OSV crash files were regenerated against the focal build; 2 bugs
picked up shifted crash locations, 33 reproduce identically. Unlocks
modern fuzzers that require LLVM 17 / C++17 (libafl, hastefuzz) and
fixes the 15-minute cycle-1 exit we saw for aflplusplus/honggfuzz on
the xenial base.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same rebase as the ndpi change: drop the xenial opensc-merge image and
build directly on gcr.io/oss-fuzz-base/base-builder@sha256:87ca1e9e...
(focal). Clone OpenSC in the Dockerfile — no vendored tarballs needed
since OpenSC builds entirely from its own source tree via bootstrap +
autoconf.

All 23 OSV crash files were regenerated against the focal build; every
crash reproduced at the exact same crash_file:crash_line as on xenial,
so no bug_metadata entries shifted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New transplant benchmark with 18 OSV bugs, dispatch-gated harness.
- Dockerfile sets HOME=/root, removes stale OSS-Fuzz /rust cargo shim,
  and disables UBSan halt so shallow UB does not drown the crash dir.
- build.sh: borrow avc_dec_fuzzer_seed_corpus.zip (500-capped) so svc
  decoder gets real H.264 structural seeds with 9 dispatch variants,
  strips -fsanitize=undefined / -fno-sanitize-recover from CFLAGS so
  the two shallow UB sites in ih264d_parse_cavlc stop drowning out
  transplanted bugs.
- Bump libafl pinned nightly to 2025-08-15 (select_unpredictable and
  stdarch_x86_avx512 now stabilized; fixes LibAFL cargo build).
The parent oss-fuzz base-builder ships a rustup install at /rust. Rustup's
update then tried to rename files between /rust/rustup/toolchains and
/rust/rustup/tmp, which sit on different overlay layers, and failed with
"Invalid cross-device link (os error 18)". Fix by wiping /rust completely
and pinning RUSTUP_HOME/CARGO_HOME to /root/.* so the new install has a
single writable root, and symlink the binaries into /usr/local/bin so
subsequent RUN steps see them on PATH without relying on $HOME quirks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…$CXX

Two fixes to the libpcap + nDPI build step:

1. Disable optional libpcap backends (dbus/bluetooth/rdma/libnl/dpdk/
   dag/septel/snf) so libpcap.a doesn't pick up transitive symbols from
   whatever -dev packages the current fuzzer builder image happens to
   install (e.g. libglib2.0-dev -> libdbus-1-dev in aflplusplus's
   builder). Without this, libpcap.a has undefined refs to dbus_* and
   bluetooth_*, and nDPI's configure link test for pcap_open_live
   silently fails, leaving no fuzz/Makefile and no binary produced.

2. Propagate $CC and $CXX into libpcap's configure and nDPI's make.
   The fuzzbench fuzzer wrappers (libafl_cc, afl-clang-fast, etc.) set
   CC/CXX to their instrumenting compilers, and without explicit
   forwarding libpcap's configure probes gcc directly, producing an
   uninstrumented libpcap.a. Most fuzz time runs inside libpcap, so
   that's a large uncovered region in the edge-count map. LibAFL's
   calibration was rejecting every seed for this reason.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	fuzzers/libafl/builder.Dockerfile
43 transplanted ndpi bugs (29 local, 14 code-diff, 2-byte dispatch) built
against the focal base-builder digest. build.sh enables non-recoverable
UBSan for shift/divide-by-zero/null/bounds/vla-bound/return/unreachable
so shallow UB bugs register as crashes; overflow classes excluded to
avoid nDPI's sort.c init-time sub-overflow.
The runner image (base-runner) doesn't ship libjson-c.so.4, so the
dynamically-linked fuzz_process_packet binary exited 127 on every
invocation inside the trial containers, producing zero fuzzing work.
nDPI's fuzz targets don't need json-c, so --disable-json-c makes the
binary self-contained.
Persistent-mode fuzzers (AFL++ with cmplog) previously crashed every
trial at dry-run because the harness kept a global ndpi_flow that was
reused across calls. Certain transplanted bugs caused nDPI to free
ndpi_flow internally, so the next iteration's memset(ndpi_flow, …)
tripped a heap-use-after-free and subsequent seeds crashed.

Allocate ndpi_flow via ndpi_flow_malloc at the start of each
LLVMFuzzerTestOneInput call and release it with ndpi_free_flow before
returning, matching the ndpi_transplant_fuzz_ndpi_reader pattern. The
dispatch-gated ndpi_free_flow that combined.diff previously added for
OSV-2020-342 is now redundant and was removed.

OSV-2023-102's crash signature was literally that cross-call UAF, so it
no longer triggers — the bug was a harness artifact rather than an
nDPI defect, and regen_crashes confirms 42/43 transplanted bugs still
trigger cleanly.
Heavy transplant targets (ghostscript) race honggfuzz's fork-follow
ptrace-attach when a persistent-mode child is SIGKILLed on the default
1s deadline, fatally aborting the fuzzer. Raise --timeout to 25s and
wrap the invocation in a retry loop so coverage carries forward across
the remaining races (corpus/crashdir on disk survive intact).
honggfuzz: raise per-input timeout and wrap in restart loop
Self-contained FuzzBench benchmark generated from the ghostscript
bug-transplant merge. Bundles Dockerfile, build.sh, combined + harness
patches, harness sources, dispatch-prefixed seeds, and bug_metadata.json
for coverage-based triage.
Add ghostscript_transplant_gstoraster_fuzzer benchmark
Replace the fuzzer.py retry wrapper with a source patch in
builder.Dockerfile that downgrades the LOG_F("Couldn't attach to pid")
abort in linux/arch.c to a warning with early return. The persistent-
mode ptrace race no longer kills honggfuzz, so the wrapper loop isn't
needed.
# Conflicts:
#	fuzzers/honggfuzz/fuzzer.py
Adds benchmarks/ntopng_transplant_fuzz_dissect_packet/ (fuzz_dissect_packet
at ntopng b7b2810e, 18 OSV bugs transplanted, dispatch-gated). Dockerfile
pins the focal gcr.io/oss-fuzz-base/base-builder digest per the section-10
rebase recipe; nDPI pinned to 86b56646 (2023-05-20, two days before the
ntopng target commit). libpcap is vendored; zeromq / json-c / libmaxminddb
come from GitHub release tarballs via ADD.

Also turns off LSAN experiment-wide in common/sanitizer.py (detect_leaks:
1 -> 0). ntopng leaks ~33KB in HostPools::HostPools() at every process
start, so the measurer's forced ASAN_OPTIONS=detect_leaks=1 was overriding
the binary's __lsan_default_options() and failing every coverage snapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ntopng_transplant_fuzz_dissect_packet: new transplant benchmark
- Regenerated Dockerfile/build.sh/benchmark.yaml and combined/harness
  patches from the latest transplant merge output.
- Replaced the original OSV testcases in seeds/ with dispatch-prefixed
  `-patched` variants so the harness accepts them.
- Added crashes/ with per-bug crash logs captured from the official
  merged build (used for downstream triage + side-effect analysis).
- Dropped the stale nested c-blosc2_transplant_decompress_frame_fuzzer/
  duplicate tree, the monitor/ subdir, and patches/bug_canary.{c,h};
  they are superseded by the current generator layout.
Transplanted 31 libredwg bugs (ASAN) onto target commit a67ea97d with
dispatch-gated patches, plus 34 local bugs at target. Built atop the
bug-merge container so the build environment matches the merge-time
verification.

Notes:
- Dockerfile sets WORKDIR=/src (not /src/libredwg) so AFL's
  `ar r /libAFL.a *.o` builder step does not bundle stale project
  object files into libAFL.a.
- Stale *.o files under /src are deleted in the Dockerfile for the
  same reason.
- build.sh drops the redundant `cd libredwg` from the original OSS-Fuzz
  build because the script already cd's to /src/libredwg at the top to
  apply transplant patches.
…hmark

Add libredwg_transplant_llvmfuzz benchmark
Generated from the transplant merge output for htslib, fuzz target
hts_open_fuzzer. Includes Dockerfile/build.sh/benchmark.yaml, combined
and harness patches, dispatch-prefixed seeds, and per-bug crash logs.
linkeLi0421 and others added 27 commits April 22, 2026 09:32
ntopng's fuzz_dissect_packet links against libexpat.so.1 (pulled in
transitively). With clang's default --as-needed the reference gets
elided and the other fuzzers' binaries don't carry it as DT_NEEDED, so
libexpat1 in the runner image was never required. afl-clang-fast drops
--as-needed, so the aflplusplus binary keeps libexpat.so.1 in DT_NEEDED
and the runner dies with "Fork server handshake failed" because the
dynamic loader can't resolve it.

Adding libexpat1 to the baseline benchmark-runner image fixes aflplusplus
without affecting other fuzzers and is cheap (~100KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
benchmark-runner: install libexpat1 for ntopng (and any AFL++ target)
The bug transplant for ntopng b7b2810e was re-run, producing a much
simpler merged patch. Only OSV-2023-688 (HostPools redis-failure
alloc/dealloc mismatch) retains a code edit; the remaining 17 OSV bugs
now trigger testcase-only at the target commit with no dispatch gating.

- patches/{combined,harness}.diff replaced from new merge output.
- seeds/testcase-*-patched refreshed (OSV-2023-688 gets dispatch 0x01;
  the other 17 use 0x00).
- bug_metadata.json regenerated — OSV-2023-688 dispatch_value=1, the
  other 17 are 0. Crash lines for the 5 re-transplanted bugs come from
  fresh transplant_crash logs; the 13 testcase-only bugs keep their
  previously derived metadata.
- crashes/OSV-2023-{688,1352,1360,1375,1381}.txt refreshed.
- build.sh gains a post-patch Python snippet that injects
  __lsan_default_options into fuzz_dissect_packet.cpp. The new merge's
  harness.diff dropped the ntopng-specific LSAN override, but FuzzBench's
  coverage measurer still needs it or every snapshot exits non-zero.
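The seed refresh described in the list above (dispatch 0x01 for OSV-2023-688, 0x00 for the other 17) could be sketched roughly as follows. This is an illustration, not the generator's actual code: the helper names `patch_testcase` and `write_patched_seeds` are hypothetical, and the prefix layout (fixed width, big-endian) is an assumption that the commit messages do not spell out.

```python
from __future__ import annotations

from pathlib import Path


def patch_testcase(data: bytes, dispatch_value: int, width: int = 1) -> bytes:
    """Prepend a fixed-width dispatch prefix (big-endian order is an assumption)."""
    return dispatch_value.to_bytes(width, "big") + data


def write_patched_seeds(
    testcases: dict[str, bytes],
    dispatch_values: dict[str, int],
    seeds_dir: Path,
    width: int = 1,
) -> list[Path]:
    """Write testcase-<bug>-patched files; bugs without an entry get dispatch 0."""
    seeds_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for bug_id in sorted(testcases):
        value = dispatch_values.get(bug_id, 0)
        path = seeds_dir / f"testcase-{bug_id}-patched"
        path.write_bytes(patch_testcase(testcases[bug_id], value, width))
        written.append(path)
    return written
```

For benchmarks that use a wider prefix (2 or 3 dispatch bytes elsewhere in this PR), `width` would be raised accordingly, again assuming the same byte order.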

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replay every PoC inside the compiled benchmark image and save full ASAN
output. Three bugs (OSV-2022-654, OSV-2023-1099, OSV-2023-316) that
previously appeared non-reproducing were masked by ASAN
detect_stack_use_after_return=1 and trigger cleanly under UAR-off, so
the replay now tries both variants with the 10x10 retry cadence used at
merge verification. Refreshes crash_file/crash_line in bug_metadata.json
for 63 bugs (up from 31).
Regenerate ghostscript_transplant_gstoraster_fuzzer from the latest
offline-merge output (merge_offline_ghostscript_2be8b436, 89 bugs,
3 dispatch bytes). bug_metadata.json now carries crash_file/crash_line
for 87/89 bugs (OSV-2022-339 PoC times out, OSV-2022-232 no longer
crashes on the merged binary). build.sh dispatch-prefix list extended
from 2 to 3 bytes. crashes/ snapshots retained for triage.
Port the same slim-runner adaptations the pdfwrite transplant benchmark
uses. Without these the FuzzBench runner/coverage images (which do not
ship libfontconfig.so.1) abort on binary load, so no cov_summary.json
is ever produced and the dispatcher reschedules each trial at cycle 0
forever. libafl additionally fails to build because autogen's
-fno-sanitize-recover=shift,signed-integer-overflow turns ghostpdl's
pre-existing shift UB in base/std.h into a hard compile-time abort.

- Dockerfile: drop libfontconfig1-dev from apt install.
- build.sh: --enable-fontconfig -> --disable-fontconfig, drop the
  pkg-config --libs fontconfig link step, and strip
  -fno-sanitize-recover=... from every generated Makefile before make.
libafl's builder pulls in LLVM-17, whose clang elevates
implicit-function-declaration to an error (C99 tightening shipped in
clang-16+). That killed ghostpdl's own code at e.g.
base/gxstroke.c:341 -> 'gx_dc_is_pattern2_color' on every libafl
rebuild. The other fuzzers build on OSS-Fuzz base-builder's clang-15
which still treats it as a warning.

Setting CFLAGS inline on the CUPS ./configure line didn't help, since
it doesn't persist past that command. Export the flag (and the CXX
equivalent) at the top of the CUPS section so it stays live through
pushd/popd into the ghostpdl compile, matching what
gs_device_pdfwrite_fuzzer/build.sh already does.
ghostpdl's startup path fires recoverable UBSAN "left shift of negative
value" in base/gsiorom.c, base/gxpath.c, base/gspath.c, psi/imainarg.c,
and base/gsalloc.c member-access-through-null on every input. With the
sanitizer default of exitcode=1 on any UBSAN report, afl++'s calibration
sees exit=1 and marks every seed as crashing, triggering
"We need at least one valid input seed that does not crash!" and
aborting all 10 afl++ trials.

The binary itself works — 984 of 996 corpus seeds exit 0 standalone with
ASAN_OPTIONS=detect_leaks=0. The fix is just runtime env: tell UBSAN to
keep going and return 0, and tell afl++ to skip any residual crashing
seed during calibration instead of aborting on all of them.
afl++ pins ASAN_OPTIONS=abort_on_error=1:handle_abort=2 at runtime when
spawning the target (overriding the Dockerfile ENV), so every UBSAN
error becomes SIGABRT and every seed gets marked "crash, skipping" —
all 1147 imported seeds in trial-11..20, yielding the same PROGRAM ABORT
("We need at least one valid input seed that does not crash!") we saw
in ghostscript-24h-3 and ghostscript-24h-6.

The triggering UB is pre-existing and benign:
  base/std.h:66        #define min_int (-1 << (ARCH_SIZEOF_INT * 8 - 1))
  base/gxpath.c:114-117, base/gspath.c:604-607, base/gxcpath.c:530-533
  psi/imainarg.c:692, base/gsiorom.c:94   (same left-shift-of-negative)
  base/gsalloc.c:1627,1630                (null-pointer-member-access)

These fire on ghostpdl startup for any PDF input. Adding
-fno-sanitize=shift,null to CFLAGS/CXXFLAGS suppresses only those two
UBSAN classes; ASAN heap/stack/UAF instrumentation for the transplanted
bugs is unaffected, so bug detection still works.

Verified standalone: all dispatch-prefixed seeds exit 0 against the
non-afl++ binaries. The afl++ binary was the unique victim because
afl++ forces abort_on_error at runtime.
ntopng_transplant_fuzz_dissect_packet: regenerate from new merge output
The generator captured an empty/stub crash log for OSV-2022-339 because
its `-runs=10` replay hit the 30s timeout before ghostpdl's garbage
collector got enough allocator churn to trip the bug. The testcase is
bytewise identical to the original OSS-Fuzz PoC (dispatch_value=0, this
is a local bug — no transplant patch needed).

Rerun with -runs=100 reproduces the crash in ~30s:
  AddressSanitizer: heap-use-after-free
  READ of size 2 at 0x62a00031bcb0
    #0 0xdef27b in gc_trace   /src/ghostpdl/./psi/igc.c:915:17
    #1 0xde9b3c in gs_gc_reclaim /src/ghostpdl/./psi/igc.c:338
    #2 0xd10d24 in ireclaim      /src/ghostpdl/./psi/ireclaim.c:80
    ...

Replace the truncated crashes/OSV-2022-339.txt with the full ASan
report and update bug_metadata.json to add crash_file / crash_line /
crash_function so triage can map coverage back to this bug.

Now 88/89 bugs have crash lines. OSV-2022-232 still doesn't reproduce
at -runs=100 — tracked separately.
Same pattern as OSV-2022-339: generator ran `-runs=100` with a 180s
timeout, which wasn't enough allocator churn for this testcase-only
bug to manifest. Its offending memcpy only overruns once ghostpdl's
freelist hits a specific layout, which in the merged binary takes
roughly 1000 replay iterations to reach. The merged benchmark binary
carries one of the transplant patches in base/fapi_ft.c
(OSV-2022-456's dispatch-gated branch), which shifts codegen just
enough to push the vulnerable allocator path further into the run.

Reran at -runs=1000 and got the expected heap-buffer-overflow:
  AddressSanitizer: heap-buffer-overflow
  READ of size 19589104 at 0x62a00036b050
    #0 __interceptor_memcpy
    #1 pdfi_fapi_get_glyph   /src/ghostpdl/./pdf/pdf_fapi.c:1187:21
    #2 get_fapi_glyph_data   /src/ghostpdl/./base/fapi_ft.c:427
    ...

Replace the empty crashes/OSV-2022-232.txt with the full ASan report
and update bug_metadata.json to add the crash file/line/function.

Now 89/89 bugs have crash lines.
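The escalation used for OSV-2022-339 and OSV-2022-232 (retry the same PoC with a larger `-runs` until ASan reports) can be sketched as below. This is a minimal illustration under assumptions: the function names are hypothetical, the schedule of 10/100/1000 mirrors the values mentioned in the two commits, and the per-step timeout is left to the caller because only the 30s and 180s values are stated.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Optional, Sequence


def replay_cmd(binary: str, testcase: str, runs: int) -> list[str]:
    """libFuzzer replay of one testcase, repeated `runs` times."""
    return [binary, f"-runs={runs}", testcase]


def run_replay(cmd: Sequence[str], timeout_s: float) -> str:
    """Return combined stdout/stderr; on timeout return what was captured."""
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, timeout=timeout_s
        )
        return proc.stdout.decode(errors="replace")
    except subprocess.TimeoutExpired as exc:
        return (exc.stdout or b"").decode(errors="replace")


def replay_until_crash(
    replay: Callable[[int], str],
    schedule: Sequence[int] = (10, 100, 1000),
) -> Optional[tuple[int, str]]:
    """Try each -runs value in order; stop at the first ASan report."""
    for runs in schedule:
        log = replay(runs)
        if "ERROR: AddressSanitizer" in log:
            return runs, log
    return None
```

A caller might wire the pieces together as `replay_until_crash(lambda n: run_replay(replay_cmd(binary, testcase, n), timeout_s=180))`, where `binary` and `testcase` are placeholders for the benchmark's fuzz target and the PoC path.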
…ogen

The earlier CFLAGS="-fno-sanitize=shift,null" export was ineffective:
ghostpdl's autogen.sh re-assembles the Makefile CFLAGS and re-injects
-fsanitize=undefined *after* our negations, so UBSAN still instruments
null-pointer-member-access in base/gsalloc.c:1627/1630/1648/1783,
psi/interp.c:1123/1401, psi/igcstr.c:294/325, psi/zbfont.c:55 etc.
Every PDF input then emits dozens of UBSAN reports; afl++ pins
ASAN_OPTIONS=abort_on_error=1 at runtime, which turns each report
into SIGABRT, so calibration labels all 1137 imported seeds as
"results in a crash, skipping" and aborts with
"We need at least one valid input seed that does not crash!".
Reproduced across ghostscript-24h-{3,6,7}.

Fix: after autogen, strip -fsanitize=undefined (any variant) from
every generated Makefile so UBSAN instrumentation is dropped entirely.
ASAN is unaffected, so transplanted heap/stack/UAF bug detection stays
intact — which is all we actually need for triage.
The previous attempt matched the umbrella `-fsanitize=undefined` form
but ghostpdl's autogen actually emits the explicit per-check list:

  -fsanitize=array-bounds,bool,builtin,enum,float-divide-by-zero,
             function,integer-divide-by-zero,null,object-size,return,
             returns-nonnull-attribute,shift,signed-integer-overflow,
             unreachable,vla-bound,vptr

so the earlier sed matched nothing and afl++ kept aborting on every
seed with "member access within null pointer" in base/gsalloc.c etc.
(Verified by peeking into the freshly-built aflplusplus builder image:
each Makefile's CUPSCFLAGS / GCFLAGS / DBUS_CFLAGS line starts with the
expanded list, plus a trailing `-fno-sanitize=shift,null`. In this form
the later -fno-sanitize=null was observed not to suppress the null
check enabled by the earlier -fsanitize=...null..., so null-deref UBSAN
still fires.)

Strip the whole list by anchoring on `-fsanitize=array-bounds` and
taking any non-space run. `-fsanitize=address` is emitted separately
so ASAN instrumentation is unaffected.
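The stripping rule described above (anchor on `-fsanitize=array-bounds`, consume the rest of the non-space run) can be sketched in Python. The actual build.sh presumably uses sed; the function name `strip_ubsan_list` is hypothetical and this is only an equivalent illustration of the pattern.

```python
import re

# Matches the expanded per-check list, which starts with array-bounds and
# runs until the next whitespace. A separate -fsanitize=address token does
# not start with this anchor, so it is left alone.
UBSAN_LIST = re.compile(r"-fsanitize=array-bounds\S*")


def strip_ubsan_list(makefile_text: str) -> str:
    """Drop every expanded UBSan check list from generated Makefile text."""
    return UBSAN_LIST.sub("", makefile_text)
```

Because the match stops at whitespace, a trailing `-fno-sanitize=shift,null` token on the same line survives, which is harmless once the list it negated is gone.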
Adds a pristine-source `original-crashes/` directory to each of the 10
transplant benchmarks, with per-bug crash logs captured against the
un-transplanted source tree (migration bugs at their buggy source
commit, local/already-triggering bugs at the benchmark target commit),
each using the era-matched OSS-Fuzz base-runner selected via
get_base_runner_for_date().

Per benchmark under benchmarks/<proj>_transplant_*/original-crashes/:
  - OSV-*.txt     : one real ASAN/UBSAN/libFuzzer crash log per bug
  - collect_crash_builds.csv : target/source-commit -> oss_fuzz_commit map
  - COMMANDS.md   : per-bug classification (local vs migration), source
                    commits, re-run loop

Parity achieved across all 10 benchmarks (crashes/ == original-crashes/):
  c-blosc2_transplant_decompress_frame_fuzzer       28
  ghostscript_transplant_gs_device_pdfwrite_fuzzer  26
  ghostscript_transplant_gstoraster_fuzzer          89
  htslib_transplant_hts_open_fuzzer                 11
  libavc_transplant_svc_dec_fuzzer                  18
  libredwg_transplant_llvmfuzz                      63
  ndpi_transplant_fuzz_ndpi_reader                  35
  ndpi_transplant_fuzz_process_packet               43
  ntopng_transplant_fuzz_dissect_packet             18
  opensc_transplant_fuzz_pkcs15_reader              23
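The parity claim above (crashes/ == original-crashes/ per benchmark) is the kind of thing a small check can confirm. The sketch below compares the sets of `OSV-*.txt` names in the two directories; the function names are hypothetical and it only checks that the same bugs are present, not that the logs agree.

```python
from __future__ import annotations

from pathlib import Path
from typing import Iterable


def parity(crash_ids: Iterable[str], original_ids: Iterable[str]) -> dict:
    """Compare two sets of bug IDs and report the overlap and both gaps."""
    crashes, originals = set(crash_ids), set(original_ids)
    return {
        "matched": len(crashes & originals),
        "missing_original": sorted(crashes - originals),
        "missing_crash": sorted(originals - crashes),
    }


def bug_ids(directory: Path) -> set[str]:
    """Bug IDs are taken from OSV-*.txt file names (without the extension)."""
    return {p.stem for p in directory.glob("OSV-*.txt")}


def benchmark_parity(benchmark_dir: Path) -> dict:
    return parity(
        bug_ids(benchmark_dir / "crashes"),
        bug_ids(benchmark_dir / "original-crashes"),
    )
```

A benchmark is at parity when both `missing_*` lists are empty, and `matched` then corresponds to the per-benchmark count in the table above.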

Also fixes:

 * 9 benchmark Dockerfiles: add `ENV ADDITIONAL_ARGS="-rss_limit_mb=8192"`
   so libFuzzer's default 2048MB rss cap does not silently mask bugs
   whose crash path needs >2GB (e.g. c-blosc2 OSV-2021-464).

 * ndpi_transplant_fuzz_process_packet/Dockerfile: change
   UBSAN_OPTIONS from `halt_on_error=1` to `halt_on_error=0` so the
   ambient `ndpi_utils.c:218` shift UB (hit by every testcase via
   starcraft protocol detection) prints but does not abort before the
   intended bug fires.

 * ndpi_transplant_fuzz_process_packet/build.sh: split UB_CLASSES into
   `UB_CLASSES_INSTRUMENT` (full list) and `UB_CLASSES_NORECOVER` (same
   list minus `shift`) — keeps UBSAN-as-crash behavior for real UB
   classes while letting the starcraft shadow UB pass through.

 * ndpi_transplant_fuzz_process_packet/crashes/*.txt: regenerated against
   the rebuilt merged binary. 42/43 now terminate on the intended per-bug
   ASAN crash; formerly 16 ended on the starcraft shadow UB.

 * opensc_transplant_fuzz_pkcs15_reader/crashes/*.txt: regenerated against
   a freshly built merged binary; 4 files that used to end on UBSAN
   shadow (`pkcs15-tcos.c:142/150/235`, `card-openpgp.c:618`) now report
   the real ASAN stack-buffer-overflow / stack-use-after-return at the
   same source lines.

 * Removed redundant nested directory
   `benchmarks/opensc_transplant_fuzz_pkcs15_reader/opensc_transplant_fuzz_pkcs15_reader/`
   (older-generation copy of the same benchmark with no crashes/ dir).
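The `UB_CLASSES_INSTRUMENT` / `UB_CLASSES_NORECOVER` split described in the list above can be illustrated as follows. This is a sketch of the idea only: the real build.sh is shell, the function name is hypothetical, and the class list is an assumption reconstructed from the earlier commit message (with "divide-by-zero" taken to mean clang's `integer-divide-by-zero`).

```python
from __future__ import annotations

from typing import Sequence

# Assumed class list; the authoritative one lives in the benchmark's build.sh.
UB_CLASSES_INSTRUMENT = [
    "shift", "integer-divide-by-zero", "null", "bounds",
    "vla-bound", "return", "unreachable",
]


def ubsan_flags(instrument: Sequence[str], ambient: Sequence[str] = ("shift",)) -> list[str]:
    """Instrument every class, but make only the non-ambient ones fatal."""
    norecover = [c for c in instrument if c not in ambient]
    flags = ["-fsanitize=" + ",".join(instrument)]
    if norecover:
        flags.append("-fno-sanitize-recover=" + ",".join(norecover))
    return flags
```

With `shift` kept recoverable, the ambient `ndpi_utils.c:218` report is printed and execution continues, while the remaining classes still terminate the run.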

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add original-crashes for 10 transplant benchmarks + fixes
Updates `bug_metadata.json` crash_file / crash_line / crash_function for
17 bugs whose recorded crash location no longer matched the actual first
non-harness frame in the benchmark's `crashes/<bug>.txt`.

ndpi_transplant_fuzz_process_packet (16 bugs): the previous metadata
pointed at the shadow UB at `ndpi_utils.c:218` (`ndpi_net_match`) that
was firing before the real target bug when `UBSAN_OPTIONS=halt_on_error=1`
and `-fno-sanitize-recover=shift` were in effect. With those fixed
(earlier commit in transplant-original-crashes PR), the real target bug
fires and this commit aligns the metadata to match the new crashes/:

  OSV-2020-342  -> kerberos.c:467      ndpi_search_kerberos_original
  OSV-2020-774  -> kerberos.c:423      ndpi_search_kerberos_original
  OSV-2022-445  -> tls.c:361           getTLScertificate
  OSV-2020-59   -> ndpi_main.c:4061    check_ndpi_tcp_flow_func
  OSV-2020-194  -> yahoo.c:78          check_ymsg
  OSV-2020-972  -> irc.c:679           ndpi_search_irc_tcp
  (and 10 more, see diff)

opensc_transplant_fuzz_pkcs15_reader (1 bug): OSV-2020-1981 was pointing
at a harness function (`fuzz_reader_transmit` in
`tests/fuzzing/fuzz_pkcs15_reader.c`); after re-capturing `crashes/`
against the merged binary, the first non-harness frame is the real
`sc_single_transmit` in `libopensc/apdu.c:379`.
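The rule applied here (take the first non-harness frame of the crash log as the crash location) can be sketched as a small parser. This is an illustration under assumptions: the function name is hypothetical, the frame format is the usual symbolized ASan layout, the set of interceptor prefixes to skip is a guess, and storing only the file's base name in `crash_file` is a simplification.

```python
from __future__ import annotations

import posixpath
import re
from typing import Optional, Sequence

# "#N 0xADDR in FUNCTION PATH:LINE[:COL]"; frames without a source
# location (for example module+offset only) do not match and are skipped.
FRAME = re.compile(
    r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(.+?)\s+(\S+?):(\d+)(?::\d+)?\s*$"
)


def first_project_frame(
    log: str,
    harness_files: Sequence[str] = (),
    skip_function_prefixes: Sequence[str] = ("__interceptor_", "__asan_", "__sanitizer_"),
) -> Optional[dict]:
    """Return crash_file/crash_line/crash_function for the first real frame."""
    prefixes = tuple(skip_function_prefixes)
    for line in log.splitlines():
        match = FRAME.match(line)
        if not match:
            continue
        function, path, lineno = match.group(1), match.group(2), int(match.group(3))
        if function.startswith(prefixes):
            continue  # sanitizer runtime frame
        if any(path.endswith(h) for h in harness_files):
            continue  # harness frame, not the library under test
        return {
            "crash_file": posixpath.basename(path),
            "crash_line": lineno,
            "crash_function": function,
        }
    return None
```

For the OpenSC case, passing `harness_files=("tests/fuzzing/fuzz_pkcs15_reader.c",)` would skip `fuzz_reader_transmit` and land on the library frame beneath it.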

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-rss_limit_mb=8192

Both bugs allocate ~2GB en route to the real memory-safety error. libFuzzer's
default 2048MB limit (-rss_limit_mb, which by default also caps a single
allocation) aborted the process with `out-of-memory` before
the bug could fire, so the prior crashes/<bug>.txt logs only recorded the OOM
stack and bug_metadata.json pointed at the OOM frame (my_malloc:175 for
c-blosc2, ks_resize:142 for htslib) rather than the actual crash site.

Re-collected via `regen_crashes.py --bug ... -rss_limit_mb=8192` against the
freshly-compiled crash-line-collector:<project>-compiled image (built from
the benchmark Dockerfile + sudo compile). The new logs / metadata point at:

* c-blosc2 OSV-2021-464: heap-buffer-overflow blosc_read_header @ blosc2.c:680
* htslib   OSV-2020-999: SEGV vcf_parse_format @ vcf.c:2396

These match the upstream OSV bug reports.
…22-511 + ndpi OSV-2020-242

Both bugs had original-crashes/<bug>.txt files that did not actually capture
the underlying memory-safety bug, so the rq3 validity classifier could not
build a usable signature for comparison.

c-blosc2 OSV-2022-511:
* Old original was at commit 95e0fd42 and surfaced only as
  `SUMMARY: AddressSanitizer: 72 byte(s) leaked in 1 allocation(s)`
  (LeakSanitizer report — no exploit-relevant stack).
* Re-collected at commit abb0faba (the actual buggy commit per the OSS-Fuzz
  upstream record) and got the real bug:
  `heap-buffer-overflow ZSTD_initDDict_internal @ zstd_ddict.c:134:9`.
* Updated COMMANDS.md (mapping + heredoc) and collect_crash_builds.csv to
  reflect the corrected commit.

ndpi OSV-2020-242:
* Old original captured the libFuzzer "deadly signal" path through the
  harness's `assert(ndpi_load_domain_suffixes(...) >= 0)` — no ASan signature.
* Re-collected at commit 98d9f524 with `--runner-image auto` and got the
  intended bug: `heap-buffer-overflow ndpi_workflow_process_packet @
  reader_util.c:1457:13`. Same function as the post-transplant
  crashes/<bug>.txt log fires, only with natural line drift across commits.

Both regens were done with `fuzz_helper.py collect_crash` against an
era-matched base-runner. Audit confirms no other `original-crashes/<bug>.txt`
file in the project has a `libFuzzer: deadly signal`-only SUMMARY.
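The audit mentioned above can be approximated by classifying each log by its `SUMMARY:` lines. The sketch below is not the project's audit script; the function name and category labels are hypothetical, and treating "leaked in" as the marker of a leak-only report is an assumption based on the LeakSanitizer summary quoted for OSV-2022-511.

```python
from __future__ import annotations


def classify_log(log: str) -> str:
    """Classify a crash log by the kind of SUMMARY lines it contains."""
    summaries = [
        line.strip() for line in log.splitlines()
        if line.lstrip().startswith("SUMMARY: ")
    ]
    if not summaries:
        return "no-summary"
    sanitizer = [s for s in summaries if not s.startswith("SUMMARY: libFuzzer:")]
    if not sanitizer:
        return "libfuzzer-only"  # e.g. "SUMMARY: libFuzzer: deadly signal"
    if all("leaked in" in s for s in sanitizer):
        return "leak-only"  # LeakSanitizer report, no memory-error stack
    return "sanitizer"
```

Logs classified as `libfuzzer-only` or `leak-only` are the ones that would need re-collection, as was done for the two bugs in this commit.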
Regenerated via fuzzbench_generate.py against the merge output at
/mnt/nas/linke/new_migrate/c-blosc2/decompress_frame_fuzzer/bug_transplant/
merge_offline_c-blosc2_79e921d9 after fixing three latent bugs:

* fuzzbench_generate.py build.sh template: harness_sources snapshot
  restoration + harness.diff were double-applying the harness file
  ("does not match index"). Now passes --exclude=<path> to git apply
  for any path the snapshot covers.

* bug_transplant_merge_offline.py _DIFF_INCLUDES: only matched
  top-level CMakeLists.txt, so blosc/CMakeLists.txt (where
  __bug_dispatch.c is wired into the library SOURCES) and
  tests/fuzz/CMakeLists.txt (target_sources + target_include_directories
  for the fuzzer target) were silently dropped from combined.diff /
  harness.diff. Without those, the fresh checkout+rebuild failed with
  undefined reference to __bug_dispatch. Added */CMakeLists.txt and
  **/CMakeLists.txt to the pathspec.

* bug_verify.py: now passes -rss_limit_mb=8192 so OSV-2021-464
  (heap-buffer-overflow READ 16 only reachable after a ~2.3 GB malloc)
  doesn't false-negative as libFuzzer OOM during the final-verify.
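The first fix in the list above (skip paths already restored from the harness_sources snapshot when applying harness.diff) amounts to building a `git apply` invocation with one `--exclude` per covered path. A minimal sketch, with a hypothetical function name and no claim about how the build.sh template actually assembles the command:

```python
from __future__ import annotations

from typing import Sequence


def git_apply_cmd(patch: str, snapshot_paths: Sequence[str]) -> list[str]:
    """Apply `patch`, excluding every path the snapshot already restored."""
    cmd = ["git", "apply"]
    cmd += [f"--exclude={path}" for path in snapshot_paths]
    cmd.append(patch)
    return cmd
```

Excluding the snapshot-covered paths avoids applying the same harness change twice, which is what produced the "does not match index" failure.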

bug_metadata.json now records 29 / 29 triggered with crash_file + line
for every bug. original-crashes/ preserved (restored from HEAD; new
generation didn't reproduce them).
Four updates that bring the audit from 708 / 709 sanitizer-SUMMARY logs
to 712 / 712:

* c-blosc2 OSV-2021-464 crashes/: prior file was libFuzzer's OOM
  exit, not a sanitizer report, because the in-image PoC replay used
  libFuzzer's 2 GB default RSS cap (the bug only fires after a
  ~2.3 GB malloc). Recaptured the real heap-buffer-overflow READ 16
  in __asan_memcpy with -rss_limit_mb=8192 against the rebuilt
  benchmark image.

* c-blosc2 OSV-2021-897 original-crashes/: testcase-only transplant
  added in the recent merge regen; the previous batch of original-
  crashes didn't include it. Captured against the NAS prebuilt
  c-blosc2 binary at the introduced commit d1ea514286 — heap-buffer-
  overflow READ in ZSTD_copyRawBlock at zstd_decompress.c:772.

* libredwg OSV-2021-495 / OSV-2023-416 original-crashes/: previously
  absent because these two bugs are misclassified by the migration
  pipeline as already-triggering at target but do not reproduce on
  the merged binary. Captured at each bug's introduced commit via
  fuzz_helper.py collect_crash --runner-image auto with two new
  entries in collect_crash_builds.csv (oss_fuzz_commit=auto). Yields
  the canonical double-free in bit_chain_free (bits.c:3015) and the
  expected SEGV inside libc for OSV-2023-416.

* libredwg COMMANDS.md rewords the "missing originals" note now that
  the originals are recorded; the underlying "merged binary doesn't
  trigger these two" caveat remains.
The migration pipeline flagged both bugs as "already-triggering at target"
(CSV cell 0.5|0) but neither PoC actually crashes on the merged benchmark
binary, nor on the unpatched target a67ea97d (both exit cleanly in <50 ms).
The crashes captured in original-crashes/ are at their introduced commits
(749923e4 for the double-free in bit_chain_free, a954ad92 for the SEGV
inside libc); the merged binary's intervening changes closed both paths.

A 2026-05-26 transplant retry timed out for OSV-2021-495 and hit the
ChatGPT account's usage limit for OSV-2023-416. Rather than burn more
budget on a second retry with uncertain odds, dropping the two entries:

- bug_metadata.json: remove the 2 entries from bugs{}, set total_bugs
  65 -> 63, and record the previous entries + the drop reason under a
  new dropped_bugs field.
- original-crashes/COMMANDS.md: reword the note to say the originals
  are retained as historical references rather than backing live
  benchmark bugs.
- original-crashes/OSV-2021-495.txt and OSV-2023-416.txt: kept on
  disk as informational artifacts.
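The bug_metadata.json edit described in the list above can be sketched as a pure function over the parsed JSON. The field names `bugs`, `total_bugs`, and `dropped_bugs` come from the commit message; the assumption that `bugs` is an object keyed by bug ID, the layout of each `dropped_bugs` entry, and the function name are all hypothetical.

```python
from __future__ import annotations

import copy
from typing import Sequence


def drop_bugs(metadata: dict, bug_ids: Sequence[str], reason: str) -> dict:
    """Move bugs out of `bugs` into `dropped_bugs` and refresh `total_bugs`."""
    updated = copy.deepcopy(metadata)  # leave the caller's dict untouched
    dropped = updated.setdefault("dropped_bugs", {})
    for bug_id in bug_ids:
        entry = updated["bugs"].pop(bug_id)  # KeyError if the ID is unknown
        dropped[bug_id] = {"previous_entry": entry, "reason": reason}
    updated["total_bugs"] = len(updated["bugs"])
    return updated
```

Recomputing `total_bugs` from the remaining entries, rather than subtracting a constant, keeps the count consistent even if the function is applied more than once.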
Follow-up to the bug_metadata.json drop in e61b98a. The two bugs no
longer trigger anywhere relevant to this benchmark (verified clean
exit at unpatched target a67ea97d and on the merged binary), and two
transplant attempts produced no usable diff. Removing the orphan
artifacts:

- original-crashes/OSV-2021-495.txt and OSV-2023-416.txt
- seeds/testcase-OSV-2021-495-patched and testcase-OSV-2023-416-patched
- 2 rows in collect_crash_builds.csv (introduced commits 749923e4 and
  a954ad92)
- "historical references" paragraph in COMMANDS.md replaced with a
  one-paragraph pointer to the diagnostic notes on NAS at
  /mnt/nas/linke/new_migrate/libredwg/llvmfuzz/dropped_bugs.md

The NAS notes preserve: each bug's OSV record, the misclassification
analysis (CSV cell 0.5|0 mistakenly treated as "already triggering"),
the diagnostic test results from 2026-05-26, the captured original-crashes
signatures from the introduced commits, and three angles future work
could take to bring these back.
This .bak file was created locally as a safety backup before the
65→63 bug_metadata edit, then accidentally swept into 20f4b5d via
'git add' of the whole benchmark dir. It's local-only scratch and
shouldn't be in the repo.
OSV-2023-68 was previously a wrong-bug transplant: the Apr-8 agent's
minimizer over-reduced the 2-file fix (isvcd_api.c + isvcd_parse_epslice.c)
down to a single hunk that created an infinite recursion, so the
benchmark recorded a `stack-overflow` in isvcd_populate_res_prms instead
of the real `heap-buffer-overflow READ` in isvcd_residual_samp_mb_dyadic
(isvcd_residual_resamp.c:2031). It slipped through because the Apr-8
verification predated the strict same-bug oracle.

Re-transplanted with gpt-5.4 (the first retry was sabotaged by a 95%-full
disk that truncated the agent's files mid-run; a second retry on a clean
disk succeeded in ~16 min). Post-agent verification confirmed the
heap-buffer-overflow, and the re-merge wrapped/verified all 18 bugs
(18/18 triggering). Regenerated the FuzzBench benchmark from the updated
merge output.

Strict-oracle validation of the regenerated benchmark: 16 exact + 2
partial (sanitizer-class drift), 0 rejected. OSV-2023-68 is now EXACT
(16 shared funcs / 8 files vs the original reference).
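
The commit message does not spell out the strict oracle's rules, so the following is only a guess at their general shape: a candidate crash is compared with the reference crash by sanitizer class and by shared stack functions. The function name `classify_crash`, the dictionary keys, and the exact thresholds are assumptions for illustration; the real oracle also compares files, which this sketch omits.

```python
def classify_crash(reference, candidate):
    """Classify a candidate crash against a reference crash.

    Assumed rules (not taken from the real oracle):
      - no shared stack functions          -> "rejected"
      - shared functions, class differs    -> "partial" (sanitizer-class drift)
      - shared functions, same class       -> "exact"
    """
    shared = set(reference["functions"]) & set(candidate["functions"])
    if not shared:
        return "rejected"
    if reference["sanitizer_class"] != candidate["sanitizer_class"]:
        return "partial"
    return "exact"


# Toy inputs built from the crash descriptions for OSV-2023-68; the stacks
# are reduced to the single function named in the commit message.
reference = {
    "sanitizer_class": "heap-buffer-overflow",
    "functions": ["isvcd_residual_samp_mb_dyadic"],
}
wrong_bug = {
    "sanitizer_class": "stack-overflow",
    "functions": ["isvcd_populate_res_prms"],
}
print(classify_crash(reference, wrong_bug))
print(classify_crash(reference, reference))
```

Under these assumed rules the earlier stack-overflow transplant shares no function with the reference and is rejected, which matches the message's point that it only passed because the older verification was less strict.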

- bug_metadata.json: OSV-2023-68 crash_file/line now
  isvcd_residual_resamp.c:2031 in isvcd_residual_samp_mb_dyadic
- crashes/, patches/, build.sh, Dockerfile, seeds/, benchmark.yaml
  regenerated from merge_offline_libavc_c38af025
- patches/harness_sources/: svc_dec_fuzzer.cpp snapshot (out-of-repo harness)
- original-crashes/ unchanged (reference crashes are commit-invariant)
OSV-2023-102 is a genuine merge-stage loss. The agent's transplant was
correct in isolation (SEGV in ndpi_free_flow_data, freeing the
union-poisoned 0x2020000000000000 negotiated_alpn pointer), confirmed by
its NAS transplant_crash.txt. But the merge consolidated four
softether-family bugs' struct fields (OSV-2022-661/670/709 + this one)
into one union member, shifting softether.hostname from offset 22-23 to
29-30 and destroying the byte-alias with tls_quic.negotiated_alpn
(offset 16-23). With the alias gone the bug cannot fire on the merged
binary under any dispatch configuration — empirically confirmed
(clean exit on its own bit; all-bits crashes a different softether bug).

This is the documented dispatch limitation: __bug_dispatch can gate code
but not type layout, so bugs that share a union/struct with others lose
byte-alias preconditions after merge consolidation.
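
The lost byte-alias can be reproduced with a toy union in ctypes. This is a sketch under stated assumptions, not the nDPI layout: only the offsets (negotiated_alpn at 16-23, softether.hostname at 22-23 before the merge and 29-30 after) and the poisoned value come from the message. The padding fields, the little-endian pointer encoding, and the idea that the two 0x20 bytes are written through hostname are assumptions for illustration.

```python
import ctypes


class TlsQuic(ctypes.Structure):
    # negotiated_alpn occupies bytes 16-23, modelled as 8 raw bytes.
    _fields_ = [("pad", ctypes.c_ubyte * 16),
                ("negotiated_alpn", ctypes.c_ubyte * 8)]


def make_flow_union(hostname_offset):
    """Build a toy union whose softether.hostname starts at hostname_offset."""
    class Softether(ctypes.Structure):
        _fields_ = [("pad", ctypes.c_ubyte * hostname_offset),
                    ("hostname", ctypes.c_ubyte * 2)]

    class Flow(ctypes.Union):
        _fields_ = [("tls_quic", TlsQuic), ("softether", Softether)]

    return Flow


def alpn_after_hostname_write(hostname_offset):
    """Write two 0x20 bytes via hostname, then read negotiated_alpn back."""
    flow = make_flow_union(hostname_offset)()  # zero-initialised
    flow.softether.hostname[0] = 0x20
    flow.softether.hostname[1] = 0x20
    # Interpret the 8 pointer bytes as a little-endian integer (assumption).
    return int.from_bytes(bytes(flow.tls_quic.negotiated_alpn), "little")


print(hex(alpn_after_hostname_write(22)))  # 0x2020000000000000: alias intact
print(hex(alpn_after_hostname_write(29)))  # 0x0: alias gone after the merge
```

With hostname at offset 22 the write lands in the top two bytes of the pointer; at offset 29 it falls outside bytes 16-23, so no dispatch setting can restore the precondition.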

The captured crashes/OSV-2023-102.txt was a secondary harness UAF
(ndpi_flow_malloc reuse across -runs=10), not the intended bug, so the
rq3 oracle correctly rejected it.

Removed bug_metadata entry (recorded under dropped_bugs), crashes/,
original-crashes/, and seed artifacts. Full diagnostic report:
/mnt/nas/linke/new_migrate/ndpi/fuzz_process_packet/dropped_bugs.md
Adds ndpi_transplant_fuzz_ndpi_reader_dispatchonly — a control variant
of the ndpi_reader benchmark at the same target commit (5cad39f0) with
the __bug_dispatch harness mechanism applied (harness.diff) but NO
transplanted bug patches (combined.diff omitted).

Purpose (RQ5 baseline): isolate the dispatch-harness cost and establish
the native-crash reference set. Fuzzing this variant and diffing its
crash signatures against the merged benchmark separates native/local
target bugs (appear here) from any merge-introduced crashes (appear only
in the merged binary). In the merged ndpi_reader campaign, ~12.5% of
crashes (16 dissector functions: ndpi_search_dns/quic/h323/...) do
not map to the 35 planted bugs; this control determines whether those
are native or composition artifacts. It also yields exec/sec and
coverage overhead vs the merged binary.
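
The comparison described above amounts to set arithmetic over crash signatures. The sketch below assumes each signature is reduced to the crashing function name; the function `split_unmapped_crashes` and all signature names are hypothetical placeholders, not campaign results.

```python
def split_unmapped_crashes(merged, control, planted):
    """Split merged-binary crashes that map to no planted bug.

    merged  - crash signatures seen when fuzzing the merged benchmark
    control - crash signatures seen when fuzzing the dispatch-only control
    planted - signatures of the planted bugs
    Returns (native, merge_only).
    """
    unmapped = merged - planted
    native = unmapped & control      # also crash without any transplanted patch
    merge_only = unmapped - control  # appear only in the merged binary
    return native, merge_only


# Placeholder signatures for illustration only.
merged = {"planted_bug_fn", "dissector_a", "dissector_b"}
control = {"dissector_a"}
planted = {"planted_bug_fn"}

native, merge_only = split_unmapped_crashes(merged, control, planted)
print(sorted(native), sorted(merge_only))
```

A signature absent from the control run is only a candidate composition artifact: the control campaign may simply not have reached it yet, so matching campaign length and seeds matters for the comparison.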

build.sh: identical to the merged benchmark minus the combined.diff
apply block (and the c-blosc2-specific CMakeLists fixup). Verified it
builds under libfuzzer and links __bug_dispatch; runs clean on input
(no planted crashes). No bug_metadata.json / crashes/ — there are no
planted bugs in a control variant.
@google-cla

google-cla Bot commented Jun 3, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

linkeLi0421 closed this Jun 3, 2026
linkeLi0421 deleted the update-bug-metadata-crash-lines branch June 3, 2026 02:30