[IP] Fixed heap buffer overflow and stack overflows #8

Open
matgla wants to merge 22 commits into mob from heapOverflowBug
matgla commented Jun 26, 2026

Owner

The main problem was bad codegen that ignored `if (a > b)` conditions.

matgla and others added 22 commits June 26, 2026 15:30
…with bad codegen ignoring if (a > b) conditions
The sparse downloader fell back to fetching the remote's default branch
(origin HEAD = current gcc master) whenever the partial+sparse fetch of the
pinned SHA didn't take. On CI that silently tested an ever-advancing gcc and
failed on brand-new upstream tests that didn't exist when the submodule was
pinned. Always end at the pinned commit instead:

- drop the default-branch fallback; error out if the pin can't be resolved
- add full_fetch(): a non-sparse but still *pinned* fetch that talks to the
  remote directly, so it works even though the submodule has `update = none`
  (which makes `git submodule update` skip it)

Also fetch the handful of out-of-tree files the torture tests #include
(gcc.dg/, gcc.target/) which the gcc.c-torture sparse path omits, so the
on-device/QEMU compile no longer fails with "include file not found".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
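
The "always end at the pinned commit" rule from the commit above can be sketched roughly as follows. This is not the project's downloader: `fetch_pinned` and its `run` parameter are hypothetical names, the sparse-checkout setup is left out, and only the control flow (partial fetch, then a full pinned fetch, then a hard error) follows the commit message.

```python
import subprocess


def fetch_pinned(repo_dir, pinned_sha, remote="origin", run=subprocess.run):
    """Check out exactly `pinned_sha`, or raise; never fall back to a branch."""
    attempts = [
        # Partial fetch of just the pinned commit (blobs are fetched lazily).
        ["git", "fetch", "--depth", "1", "--filter=blob:none", remote, pinned_sha],
        # Full, non-sparse fetch of the same pinned commit, straight from the remote.
        ["git", "fetch", remote, pinned_sha],
    ]
    for cmd in attempts:
        if run(cmd, cwd=repo_dir).returncode != 0:
            continue
        # Confirm the pin resolves to a commit locally before checking it out.
        verify = ["git", "rev-parse", "--verify", "--quiet", pinned_sha + "^{commit}"]
        if run(verify, cwd=repo_dir).returncode == 0:
            run(["git", "checkout", "--detach", pinned_sha], cwd=repo_dir, check=True)
            return pinned_sha
    # No default-branch fallback: an unresolved pin is an error.
    raise RuntimeError(f"cannot resolve pinned commit {pinned_sha}; refusing to fall back")
```

Because both attempts name the SHA explicitly and talk to the remote directly, neither depends on `git submodule update`, which `update = none` would turn into a no-op.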
Pin gcc-testsuite at a fixed recent commit so CI is reproducible instead of
drifting. The bump adds 18 torture tests; handle the ones this tcc can't pass:

- skip the *-builtin-issignaling-1 family via should_skip_gcc_test: they need
  the unimplemented __builtin_issignaling (and __bf16/_Float16/_Float128 types)
- xfail pr125291: a confirmed codegen miscompile (wrong result at every -O
  level), not an unsupported feature, so a future fix surfaces as an XPASS

The remaining new tests (pr122000, pr124358, six compile PRs) and all 78
modified existing tests pass under QEMU.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
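
The difference between skipping and xfailing in the commit above can be illustrated with a small classifier. `should_skip_gcc_test` is named in the commit message, but its signature and body here are assumptions, as are the `SKIP_PATTERNS` / `XFAIL_TESTS` tables and the `classify` helper.

```python
import fnmatch

# Hypothetical tables; the entries mirror the commit message above.
SKIP_PATTERNS = ["*-builtin-issignaling-1*"]  # needs unimplemented builtins/types
XFAIL_TESTS = {"pr125291"}                    # confirmed miscompile, expected to fail


def should_skip_gcc_test(name):
    """Assumed shape: True when the test needs a feature this tcc lacks."""
    return any(fnmatch.fnmatch(name, pat) for pat in SKIP_PATTERNS)


def classify(name, passed):
    """Return the outcome reported for a test, given whether it passed when run."""
    if should_skip_gcc_test(name):
        return "SKIPPED"          # never run, so a fix would go unnoticed
    if name in XFAIL_TESTS:
        return "XPASS" if passed else "XFAIL"
    return "PASSED" if passed else "FAILED"
```

An xfailed test still runs, so `classify("pr125291", True)` yields `"XPASS"` once the miscompile is fixed, whereas a skipped test stays silent either way.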
On failure, tee the full make-test log and emit a JUnit report, then bundle
for download: the log, the JUnit report, the built cross compiler + runtime
(armv8m-tcc / armv8m-libtcc1.a / config.mak), and only the work dirs of the
tests that actually failed (mapped from JUnit to pytest's tmp-dir names, so
the ~13k-case suite isn't uploaded wholesale). Lets CI-only miscompiles be
reproduced locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
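
Selecting only the failed tests from a JUnit report, as the commit above describes, might look like the sketch below. The function names are hypothetical, and the name-to-directory mapping in `tmp_dir_prefix` is an assumption about how pytest derives tmp-dir names (non-word characters replaced by `_`, then truncated); the real CI script may map names differently.

```python
import re
import xml.etree.ElementTree as ET


def failed_test_names(junit_xml):
    """Return names of <testcase> elements that contain <failure> or <error>."""
    root = ET.fromstring(junit_xml)
    names = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            names.append(case.get("name"))
    return names


def tmp_dir_prefix(test_name, max_len=30):
    """Assumed mapping to a tmp-dir prefix: non-word chars -> '_', then truncate."""
    return re.sub(r"\W", "_", test_name)[:max_len]
```

Only directories whose names start with one of these prefixes would then be added to the artifact bundle, which keeps the upload small for a suite of roughly 13k cases.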
Snapshot the working tree as a baseline before implementing -mfloat-abi=hard
VFP codegen.  Mixed contents; the substantive correctness changes this session:

Correctness fixes:
- dead_loop: clamp the body bound to the forward exit target so it stops
  NOP'ing a post-loop compare and leaving a flag-less branch (980602-1)
- SCCP: widen a CONST loop-header phi to BOTTOM when an operand on an
  executable edge is still TOP at fixpoint (990527-1, was folding the
  loop-carried value to its latch constant)
- regalloc: loop_phi_locked guard so a body temp can't evict the interval
  that carries the induction variable's register (mibench_bitcount -O2)
- opt_constprop: symref_const_prop ASSIGN-redef now falls through to
  invalidation (PASS_COVERAGE #17) + unit test
- tccgen: reclaim leaked inner-VLA token streams (abstract/function-pointer
  declarators) and leftover global-scope label-difference fixups
  (pr77754-*, 20001116/920928/930326/pr50565) -> gcc-torture O1/O2 ASAN-clean
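
The SCCP item in the list above can be pictured with a toy three-level lattice. This is a sketch of the idea only, not the pass in this tree: the helper names are hypothetical, and constants are modelled as plain Python integers.

```python
TOP, BOTTOM = "TOP", "BOTTOM"  # TOP = not yet known, BOTTOM = not a constant


def meet(a, b):
    """Standard SCCP meet: TOP is the identity, differing constants give BOTTOM."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOTTOM or b == BOTTOM or a != b:
        return BOTTOM
    return a


def eval_phi(operands):
    """operands: list of (lattice_value, edge_is_executable) pairs."""
    value = TOP
    for v, executable in operands:
        if executable:
            value = meet(value, v)
    return value


def finalize_loop_header_phi(operands):
    """At fixpoint, refuse to fold if an executable operand is still TOP."""
    value = eval_phi(operands)
    unresolved = any(v == TOP for v, executable in operands if executable)
    if value not in (TOP, BOTTOM) and unresolved:
        return BOTTOM  # widen instead of folding to the one known constant
    return value
```

With a plain meet, a phi whose operands are `TOP` and `7` on two executable edges evaluates to `7`; the extra check widens that case to `BOTTOM`, so the loop-carried value is not folded to a single constant.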

Loop optimizer:
- loop unroll/rotation remain disabled, now with accurate root-cause comments:
  the Finding #15 O2 miscompiles (seeds 18/23/37) are a BACKEND
  scratch-register-clobbers-loop-carried-value bug the passes merely expose
  (rotation's output IR verified correct via -dump-ir), not a transform bug.
  Re-enable once the regalloc scratch / per-instruction-liveness fix lands.

Tests / infra (WIP checkpoint):
- xfail test_fp_hard_float_uses_vfp (Phase 4 gap -- the next task)
- new unit tests (ra_*, ir_*, thop_*), asm fixtures, source-coverage tooling
- docs cleanup + docs/plan_vfp_hard_float.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Most of them are fixed, and the new harness is almost ready.
A packed-bitfield byte store whose field straddles a byte boundary was
miscompiled at -O1/-O2 into a WORD store that clobbered the adjacent
array element (bitfield fuzz seed 30: arr12[0] cfd68b64 -> cf000000).

Root cause is a store-width-source asymmetry: a plain STORE takes its
width from the dest (lvalue) btype, but STORE_INDEXED / STORE_POSTINC
take it from the value (src1) btype. const_prop value forwarding (needs
both const_prop + const_prop_tmp) collapses the field read-modify-write
and replaces the store's value operand with a wider INT32 temp; harmless
while it is a plain STORE, but when a later pass converts it to
STORE_INDEXED the byte store becomes a 4-byte store.
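
The effect of the widened store can be reproduced on a simulated little-endian memory. The layout below (a bitfield byte directly before a 32-bit array element) and the stored value `0x5A` are assumptions chosen for illustration; they happen to reproduce the reported `cfd68b64 -> cf000000` corruption, but the actual struct in the fuzz case may be laid out differently.

```python
import struct


def store(mem, addr, value, width):
    """Little-endian store of the low `width` bytes of `value` at `addr`."""
    mask = (1 << (8 * width)) - 1
    mem[addr:addr + width] = (value & mask).to_bytes(width, "little")


def demo(width):
    # Assumed layout: bitfield byte at offset 3, 32-bit array element at offset 4.
    mem = bytearray(8)
    struct.pack_into("<I", mem, 4, 0xCFD68B64)
    # The value operand is a 32-bit temp; `width` is what the store actually emits.
    store(mem, 3, 0x0000005A, width)
    return struct.unpack_from("<I", mem, 4)[0]
```

Here `demo(1)` leaves the neighbouring element at `0xCFD68B64`, while `demo(4)` zeroes its three low bytes and returns `0xCF000000`.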

Fix:
- tccir.h: tcc_ir_set_src1 never lets a value rewrite widen a
  STORE_INDEXED / STORE_POSTINC value operand from INT8/INT16 to INT32.
- ir/opt.c: new pass tcc_ir_opt_narrow_store_value_btype clamps a narrow
  plain STORE's value btype to its access (dest) width so any later
  plain->STORE_INDEXED conversion carries the correct width; run early
  (before any conversion) and again before regalloc.
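
Both halves of the fix can be sketched on a toy IR. The `Store` record and the function names below are simplified stand-ins, not the definitions in `tccir.h` or `ir/opt.c`, and the guard is written generically (any widening) rather than for the specific INT8/INT16 to INT32 case.

```python
from dataclasses import dataclass

WIDTH = {"INT8": 1, "INT16": 2, "INT32": 4}


@dataclass
class Store:
    op: str           # "STORE", "STORE_INDEXED" or "STORE_POSTINC"
    dest_btype: str   # access width of the lvalue
    value_btype: str  # type of the value operand (src1)


def narrow_store_value_btype(instrs):
    """Clamp a plain STORE's value btype to its (narrower) dest width."""
    for ins in instrs:
        if ins.op == "STORE" and WIDTH[ins.dest_btype] < WIDTH[ins.value_btype]:
            ins.value_btype = ins.dest_btype


def set_src1(ins, new_btype):
    """Value rewrite that refuses to widen an indexed/post-increment store."""
    indexed = ins.op in ("STORE_INDEXED", "STORE_POSTINC")
    if indexed and WIDTH[new_btype] > WIDTH[ins.value_btype]:
        return  # keep the narrow btype; the emitted store width depends on it
    ins.value_btype = new_btype
```

Running the clamp before any plain-to-indexed conversion means the converted instruction inherits a value btype that already matches the access width.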

Resolves bitfield fuzz seeds 5/30/34/48/68/84/176 (same root cause).
Regression test tests/ir_tests/232_fuzz_bitfield_store_indexed_width.c.
Verified: ir_tests 1808 pass; diff_olevels 0-300 zero divergences.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pre-existing completed work (fuzz seed 148, bitfield profile): a plain STORE
did not invalidate known-bits slots that a narrower store overlaps, dropping a
byte. Adds regression test 233 and triage_olevels.sh sweep tweaks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
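
The invalidation rule behind the commit above amounts to a byte-range overlap test. In this sketch the known-bits table is a plain dictionary keyed by `(offset, width)`; that representation and both function names are assumptions made for illustration.

```python
def overlaps(a_off, a_width, b_off, b_width):
    """True if [a_off, a_off+a_width) and [b_off, b_off+b_width) intersect."""
    return a_off < b_off + b_width and b_off < a_off + a_width


def invalidate_on_store(known, store_off, store_width):
    """Drop every tracked slot the store touches, even partially.

    known: dict mapping (offset, width) -> known-bits info (opaque here).
    """
    return {slot: bits for slot, bits in known.items()
            if not overlaps(slot[0], slot[1], store_off, store_width)}
```

A one-byte store at offset 1 therefore invalidates a tracked four-byte slot at offset 0, even though it changes only part of it.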
Fuzz seed 102 (switch profile), -O2-only miscompile. The jump-table dispatch
(tcc_gen_machine_switch_table_mop) uses R_IP (R12) as a fixed scratch for the
LSL/ADD/LDR/ADD/BX table-base preamble and clobbers it. At -O2 csmix is inlined,
raising register pressure enough that the linear-scan allocator placed the
loop-carried checksum cs in R12 with a live range spanning the dispatch into the
case-body targets — so every case read the clobbered cs and the checksum
diverged. -O0/-O1 keep cs in a callee-saved register (csmix not inlined), so the
bug was -O2-only.

Fix: ra_build_intervals marks any interval live across a SWITCH_TABLE/SWITCH_LOAD
as crosses_call. R12 is caller-saved, so this forces the value into a
callee-saved register (or spill) — exactly what the -O1 allocator already does.
Resolves 11 of the 13 known switch-profile divergent seeds (589 and 822 are
separate root causes). Regression test 234.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
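
The interval marking described in the commit above can be modelled as follows. The `Interval` record, the register lists and both helpers are simplified assumptions rather than the allocator's real data structures; only the rule (live across a switch dispatch implies `crosses_call`) follows the commit message.

```python
from dataclasses import dataclass

# Simplified AAPCS split; r12 (IP) is caller-saved and used as dispatch scratch.
CALLEE_SAVED = ["r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11"]
CALLER_SAVED = ["r0", "r1", "r2", "r3", "r12"]


@dataclass
class Interval:
    name: str
    start: int  # index of the defining instruction
    end: int    # index of the last use
    crosses_call: bool = False


def mark_switch_crossings(intervals, instrs):
    """Treat a jump-table dispatch like a call for any value live across it."""
    for idx, op in enumerate(instrs):
        if op in ("SWITCH_TABLE", "SWITCH_LOAD"):
            for iv in intervals:
                if iv.start < idx < iv.end:
                    iv.crosses_call = True


def allowed_registers(iv):
    """Registers the allocator may pick; spilling is not modelled here."""
    return CALLEE_SAVED if iv.crosses_call else CALLER_SAVED + CALLEE_SAVED
```

A value defined before the dispatch and read in a case body loses `r12` as a candidate, while a temporary that lives entirely inside one case body keeps the full register set.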