I've been analysing recurring GPU prove errors in EZKL (#882, #837) and believe a significant class of them originates from a layer below the proof system itself: IEEE-754 float non-determinism between CPU and GPU hardware.
The problem is structural. When EZKL generates a witness on an x86 CPU and attempts to verify a proof generated on an NVIDIA GPU, the float32 representations of intermediate values can differ at the bit level. The arithmetic is not wrong; FMA sub-LSB rounding differences, NaN payload variants, and signed-zero handling differ between hardware vendors and produce divergent bit patterns for values that should be arithmetically equivalent. The ZK circuit then sees different witnesses and the proof fails.
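To make the three divergence sources concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the IEEE-754 behaviour, not code from EZKL or from the issues cited above. The helper names (`f32`, `f32_bits`, `is_nan_bits`) are hypothetical, and the fused path is emulated by rounding once from binary64, which is exact for the inputs chosen here but is not a general FMA implementation.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def is_nan_bits(b: int) -> bool:
    """A binary32 pattern is NaN when the exponent is all ones and the mantissa is non-zero."""
    return (b & 0x7F800000) == 0x7F800000 and (b & 0x007FFFFF) != 0

# 1. Signed zero: +0.0 and -0.0 compare equal but have different bit patterns.
zero_bits = (f32_bits(0.0), f32_bits(-0.0))

# 2. NaN payloads: many distinct bit patterns all decode to "NaN".
#    These three are example patterns chosen for illustration.
nan_patterns = [0x7FC00000, 0x7FC00001, 0xFFC00000]

# 3. Fused vs. unfused multiply-add: a*a + c rounded twice vs. rounded once.
a = f32(1.0 + 2.0**-12)
c = f32(-(1.0 + 2.0**-11))
unfused = f32(f32(a * a) + c)  # product rounded to binary32, then the sum rounded
fused = f32(a * a + c)         # single rounding at the end (emulated via binary64)

print(f"zero bits : {zero_bits[0]:#010x} vs {zero_bits[1]:#010x}")
print(f"all NaN   : {all(is_nan_bits(b) for b in nan_patterns)}")
print(f"unfused   : {unfused!r}, fused: {fused!r}")
```

In the third case the exact product is 1 + 2^-11 + 2^-24, which lies halfway between two binary32 values. The unfused path rounds it to 1 + 2^-11 and then cancels to zero, while the single-rounding path keeps the 2^-24 residue.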
This is not a bug in EZKL. It is a float representation problem that sits underneath any proof system operating on IEEE-754 values.
I built a protocol that solves this at the representation layer. Samayuktam/SPCMP maps every float32 through an order-preserving transformation into a canonical uint32 space. Before any proof generation occurs, it collapses all NaN payload variants to a single canonical quiet NaN and eliminates signed-zero contamination; on the resulting canonical values the mapping is bijective. The result is a bit-identical canonical representation regardless of which hardware produced the float.
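For readers who want a feel for what canonicalization at this layer can look like, the following is a minimal sketch, not the Samayuktam/SPCMP implementation, whose specification is not reproduced in this post. It swaps in the well-known sign-flip trick for ordering float bit patterns as unsigned integers. The constant `CANONICAL_QNAN` and the function names are assumptions chosen for illustration.

```python
import struct

# Assumption: 0x7FC00000 (positive quiet NaN, zero payload) as the canonical NaN.
CANONICAL_QNAN = 0x7FC00000

def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def canonicalize_bits(b: int) -> int:
    """Collapse every NaN pattern to one quiet NaN and map -0.0 to +0.0."""
    if (b & 0x7F800000) == 0x7F800000 and (b & 0x007FFFFF) != 0:
        return CANONICAL_QNAN
    if b == 0x80000000:
        return 0x00000000
    return b

def to_ordered_u32(b: int) -> int:
    """Map a binary32 bit pattern to a uint32 key whose unsigned order
    matches the numeric order of the floats (canonical NaN sorts last)."""
    b = canonicalize_bits(b)
    if b & 0x80000000:
        return ~b & 0xFFFFFFFF   # negative values: flip all bits
    return b | 0x80000000        # non-negative values: set the top bit

def from_ordered_u32(k: int) -> int:
    """Inverse of to_ordered_u32 on canonical bit patterns."""
    if k & 0x80000000:
        return k & 0x7FFFFFFF
    return ~k & 0xFFFFFFFF

values = [float("-inf"), -2.5, -1.0, 0.0, 1.0, 2.5, float("inf")]
keys = [to_ordered_u32(f32_bits(v)) for v in values]
print([f"{k:#010x}" for k in keys])
```

Under these assumptions, `+0.0` and `-0.0` share one key, every NaN payload shares one key, and any two machines that agree on the numeric value of a float32 agree on its key. The sketch does not address the FMA case: results that differ numerically because of a different rounding sequence remain different after canonicalization.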
Audit results (May 2026, TPU v5e hardware): 81 assertions, 0 failures. Cross-architecture: Intel x86, NVIDIA GPU, Google TPU v5e. Models tested: GPT-2 Small, BERT-base, LLaMA-style 32K vocab.
The protocol is under provisional patent. Full audit report and specifications are at: https://swapnopammitra.github.io/Pr1malFrameWork/
If this intersects with what EZKL is working on at the witness generation layer, I am reachable directly.