Self-audit

Self-audit: I wrote it, I reviewed it · Zero blockers in main · Failures published

Let’s be real: this is a self-audit. I wrote the code and now I’m grading my own homework. In the security world, that’s a “weak claim,” and I’m not here to gaslight you. Think of this as an honest baseline: here’s exactly what I looked at, the receipts to prove it, and a list of things I haven’t touched yet. If your threat model relies on the stuff I skipped, don’t ship this to production.

The TL;DR

Main is clean: Found zero blockers or “should-fix” issues in the latest main across all three verifiers, the UMAAL asm, the Winterfell fork, the BabyBear field, and the on-device prover firmwares.
Caught 7 bugs early: I didn’t just find them; I nuked them. All 7 were fixed within 24 hours and documented in research/postmortems/. I publish my failures so you can learn from them.
ASM verified on silicon: The UMAAL Montgomery multiply is now differentially tested on the actual Cortex-M33, not just on x86. See below.
One loose end: Semaphore verifying-key freshness in CI is still an open question. See “Known limitations.”

What I actually checked

Parser safety (The “No-Panic” Rule). I hate .unwrap(). Library code has zero .unwrap(), .expect(), or panic! calls. If a byte slice looks at me funny, it goes through a bounds-checked helper. I enforce this with a “clippy wall” that turns these lints into hard errors (-D warnings). If it passes the lint gate, it didn’t use a lazy panic.
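Here is a minimal sketch of what a bounds-checked helper behind a lint wall can look like. It is an illustration, not the actual zkmcu code: read_u32_be and ParseError are hypothetical names, and the lint list is my assumption about which Clippy lints such a wall would deny (the four named here are real Clippy restriction lints).

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    UnexpectedEof,
}

/// Hypothetical helper: read a big-endian u32 at `offset`.
/// Every failure path returns an error; nothing indexes, unwraps, or panics.
#[deny(
    clippy::unwrap_used,
    clippy::expect_used,
    clippy::panic,
    clippy::indexing_slicing
)]
pub fn read_u32_be(buf: &[u8], offset: usize) -> Result<u32, ParseError> {
    // checked_add guards against `offset + 4` wrapping around.
    let end = offset.checked_add(4).ok_or(ParseError::UnexpectedEof)?;
    // `get` returns None instead of panicking when the range is out of bounds.
    let bytes = buf.get(offset..end).ok_or(ParseError::UnexpectedEof)?;
    // Fold the four bytes into a u32, most significant byte first.
    Ok(bytes.iter().fold(0u32, |acc, &b| (acc << 8) | u32::from(b)))
}
```

Calling read_u32_be(&[1, 2, 3], 0) returns Err(ParseError::UnexpectedEof) instead of aborting, which is the whole point: a short or hostile buffer becomes an error value the caller has to handle.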

Canonical encoding. No malleability.

Field-element decoding for BN254 and BLS12 is strict: an encoding at or above the modulus is rejected, never silently reduced. This matters because if you allow non-canonical encodings, you’re basically inviting malleability attacks into your nullifiers.
BLS12 padding is strictly validated. Any “dirty” padding bytes? Reject. Error::InvalidFp. Simple as that.
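The two rules above can be sketched in a few lines. This is an illustration under assumptions, not the zkmcu parser: the 16-byte-padding plus 48-byte-value layout is assumed, parse_padded_fp is a hypothetical name, and the modulus is passed in as a parameter so the sketch does not hardcode any curve constant. Only the Error::InvalidFp name comes from the text above.

```rust
/// Local stand-in for the error type; only the `InvalidFp` name is from the text.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    InvalidFp,
}

/// Assumed layout: 16 zero padding bytes, then a 48-byte big-endian value.
pub const PADDING_LEN: usize = 16;
pub const FP_LEN: usize = 48;

/// Hypothetical strict decoder: rejects dirty padding and non-canonical values.
pub fn parse_padded_fp(input: &[u8], modulus: &[u8; FP_LEN]) -> Result<[u8; FP_LEN], Error> {
    if input.len() != PADDING_LEN + FP_LEN {
        return Err(Error::InvalidFp);
    }
    // Length was checked above, so this split cannot go out of bounds.
    let (padding, value) = input.split_at(PADDING_LEN);
    // Any non-zero padding byte is a second encoding of the same value: reject.
    if padding.iter().any(|&b| b != 0) {
        return Err(Error::InvalidFp);
    }
    let mut out = [0u8; FP_LEN];
    out.copy_from_slice(value);
    // Equal-length big-endian arrays compare lexicographically, which matches
    // numeric order. Values >= modulus are rejected, never reduced.
    if out >= *modulus {
        return Err(Error::InvalidFp);
    }
    Ok(out)
}
```

The design choice is that there is exactly one accepted byte string per field element. A decoder that reduced modulo p or ignored the padding would accept several byte strings for the same value, and that is where malleability comes from.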

Curve & Subgroup checks (No footguns).

Enforced on-curve checks for BN254 G1/G2.
For BLS12, is_on_curve and is_torsion_free are separate. Historically, BN254 precompile bugs happened because people ignored cofactors. I didn’t.

DoS protection (No infinite memory go brrr).

Hard caps on MAX_NUM_IC and MAX_PROOF_SIZE.
Found an unbounded Vec::with_capacity deep inside Winterfell’s deserializer. That’s a classic DoS vector. I patched it in our fork with a remaining_bytes cap. Read the postmortem: 2026-04-24-stark-unbounded-vec-alloc.
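To show the shape of that fix, here is a minimal sketch of capping an allocation by the bytes actually remaining in the input. It is not the Winterfell patch itself: read_u32_vec, DecodeError, and the 4-byte length prefix are all assumptions made for illustration.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Truncated,
    LengthExceedsInput,
}

/// Fold up to four bytes into a big-endian u32.
fn be_u32(bytes: &[u8]) -> u32 {
    bytes.iter().fold(0u32, |acc, &b| (acc << 8) | u32::from(b))
}

/// Hypothetical decoder: a 4-byte big-endian element count, then that many u32s.
pub fn read_u32_vec(input: &[u8]) -> Result<Vec<u32>, DecodeError> {
    let (prefix, rest) = match (input.get(..4), input.get(4..)) {
        (Some(prefix), Some(rest)) => (prefix, rest),
        _ => return Err(DecodeError::Truncated),
    };
    let claimed = be_u32(prefix) as usize;
    // The cap: each element needs 4 bytes, so the input cannot hold more than
    // rest.len() / 4 elements. Check this BEFORE allocating anything.
    if claimed > rest.len() / 4 {
        return Err(DecodeError::LengthExceedsInput);
    }
    let mut out = Vec::with_capacity(claimed);
    for chunk in rest.chunks_exact(4).take(claimed) {
        out.push(be_u32(chunk));
    }
    Ok(out)
}
```

Without the check, an 8-byte input whose prefix claims 0xFFFFFFFF elements would ask the allocator for gigabytes. With it, the worst-case allocation is bounded by the size of the message the attacker actually sent.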

Hand-rolled ASM differential testing — on actual hardware. Writing ARMv8-M assembly for Montgomery multiplication is “cowboy territory” unless you can prove it’s right. The host-side test cross-checks asm against the pure-Rust reference on x86. But x86 is not the RP2350. There’s now a dedicated bench-rp2350-m33-bn-asm-test firmware that runs selftest_fq and selftest_fr on the Cortex-M33 itself: random inputs via an on-device PRNG, asm result vs reference result compared byte-for-byte, PASS/FAIL over USB serial. It matches.
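The pattern is easy to show at small scale. The sketch below is a stand-in, not the firmware: instead of the UMAAL assembly and the 256-bit BN254 fields, it checks a 32-bit Montgomery multiply over the BabyBear prime (2^31 - 2^27 + 1) against a plain u64 modular multiply, with a xorshift PRNG feeding random inputs. All function names here are hypothetical.

```rust
/// BabyBear prime: 2^31 - 2^27 + 1.
pub const P: u32 = 0x7800_0001;

/// Compute -P^{-1} mod 2^32 by Newton iteration (each round doubles the
/// number of correct low bits: 1, 2, 4, 8, 16, 32).
const fn neg_inv_mod_2_32(p: u32) -> u32 {
    let mut inv: u32 = 1;
    let mut i = 0;
    while i < 5 {
        inv = inv.wrapping_mul(2u32.wrapping_sub(p.wrapping_mul(inv)));
        i += 1;
    }
    inv.wrapping_neg()
}

pub const NEG_P_INV: u32 = neg_inv_mod_2_32(P);

/// Montgomery reduction: returns t * 2^{-32} mod P, for t < P * 2^32.
fn mont_reduce(t: u64) -> u32 {
    let m = (t as u32).wrapping_mul(NEG_P_INV);
    // The low 32 bits of t + m*P are zero by construction, so the shift is exact.
    let u = ((t + u64::from(m) * u64::from(P)) >> 32) as u32;
    if u >= P { u - P } else { u }
}

/// Convert into Montgomery form: a * 2^32 mod P.
fn to_mont(a: u32) -> u32 {
    ((u64::from(a) << 32) % u64::from(P)) as u32
}

/// The "fast" path under test: multiply in Montgomery form, convert back.
pub fn fast_mul(a: u32, b: u32) -> u32 {
    let product = mont_reduce(u64::from(to_mont(a)) * u64::from(to_mont(b)));
    mont_reduce(u64::from(product))
}

/// The reference path: the obvious widening multiply and remainder.
pub fn reference_mul(a: u32, b: u32) -> u32 {
    ((u64::from(a) * u64::from(b)) % u64::from(P)) as u32
}

/// xorshift64 PRNG; the state must be non-zero.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Differential self-test: returns the first mismatching input pair, if any.
pub fn selftest(seed: u64, iterations: u32) -> Result<(), (u32, u32)> {
    let mut state = seed | 1; // force a non-zero PRNG state
    for _ in 0..iterations {
        let a = (xorshift64(&mut state) % u64::from(P)) as u32;
        let b = (xorshift64(&mut state) % u64::from(P)) as u32;
        if fast_mul(a, b) != reference_mul(a, b) {
            return Err((a, b));
        }
    }
    Ok(())
}
```

Returning the failing pair instead of a bare bool matters on a device: a FAIL line over serial is much more useful when it carries the inputs needed to reproduce the mismatch on the host.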

Sound Allocator. The zkmcu-bump-alloc isn’t just a simple pointer. It’s been stress-tested for concurrency and alignment (1 to 64 bytes). Every unsafe block has a // SAFETY: comment because I’m not a savage.
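For readers who have not written one, this is roughly what the alignment and concurrency logic in a bump allocator has to get right. It is a sketch under assumptions, not zkmcu-bump-alloc: BumpOffsets is a hypothetical type that hands out offsets rather than pointers (so the sketch needs no unsafe), and the compare-and-swap loop is one possible way to make it safe under concurrent callers, not necessarily the one the crate uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical bump allocator over a region of `capacity` bytes.
/// It returns offsets into the region instead of raw pointers.
pub struct BumpOffsets {
    next: AtomicUsize,
    capacity: usize,
}

impl BumpOffsets {
    pub const fn new(capacity: usize) -> Self {
        BumpOffsets { next: AtomicUsize::new(0), capacity }
    }

    /// Reserve `size` bytes aligned to `align` (must be a power of two).
    /// Returns the start offset, or None if the request cannot be satisfied.
    pub fn alloc(&self, size: usize, align: usize) -> Option<usize> {
        if !align.is_power_of_two() {
            return None;
        }
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round `current` up to the next multiple of `align`, with overflow checks.
            let start = current.checked_add(align - 1)? & !(align - 1);
            let end = start.checked_add(size)?;
            if end > self.capacity {
                return None;
            }
            // Publish the new watermark only if nobody else moved it meanwhile.
            match self.next.compare_exchange_weak(
                current,
                end,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(start),
                Err(observed) => current = observed,
            }
        }
    }
}
```

The two classic bugs this avoids are rounding the address after the capacity check instead of before it, and doing the add without overflow checks. Both turn an oversized request into an out-of-bounds region.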

Hardcoded Security. You can’t “downgrade” security via a proof. The MinConjecturedSecurity(95) is hardcoded in the verifier. If an attacker submits a weak proof, it fails before the engine even looks at it.

How I checked it (The Receipts)

100+ Adversarial Tests: 84 tests on Groth16, 19 on the STARK wrapper. This includes flipping every single bit in a valid proof to make sure nothing produces a false positive Ok(true). On-device prover firmwares immediately self-verify every proof they generate — “PROVE ok” always comes with a “VERIFY ok” on the same silicon.
Fuzzing to death: Ran libFuzzer on the STARK parser. 91 million executions. Zero crashes. The fuzzer tried everything; it found nothing.
Differential testing: I compared zkmcu against the “big boys”: arkworks, substrate-bn, and bls12_381. If they agree on the math, I’m happy.
CI Discipline: just check-full runs everything—lints, tests, and fuzzing—in 80 seconds. If a commit breaks a rule, it doesn’t land.
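The bit-flip sweep mentioned in the first item is simple enough to sketch. This is a generic harness under assumptions, not the zkmcu test suite: find_accepting_bitflip is a hypothetical helper, and toy_verify is a deliberately trivial XOR-checksum "verifier" standing in for a real proof verifier.

```rust
/// Flip every bit of `valid` in turn and run `verify` on each mutant.
/// Returns the index of the first flipped bit that still yields Ok(true),
/// or None if every single-bit mutant is rejected.
pub fn find_accepting_bitflip<E, F>(valid: &[u8], mut verify: F) -> Option<usize>
where
    F: FnMut(&[u8]) -> Result<bool, E>,
{
    let mut mutated = valid.to_vec();
    for bit in 0..valid.len() * 8 {
        let byte = bit / 8;
        let mask = 1u8 << (bit % 8);
        mutated[byte] ^= mask; // flip one bit
        let accepted = matches!(verify(&mutated), Ok(true));
        mutated[byte] ^= mask; // restore it before the next round
        if accepted {
            return Some(bit);
        }
    }
    None
}

/// Toy stand-in verifier: the last byte must equal the XOR of all earlier bytes.
pub fn toy_verify(proof: &[u8]) -> Result<bool, ()> {
    match proof.split_last() {
        Some((tag, body)) => Ok(body.iter().fold(0u8, |acc, &b| acc ^ b) == *tag),
        None => Err(()),
    }
}
```

With the valid input [0x12, 0x34, 0x56, 0x70], the sweep over toy_verify returns None, because any single flipped bit breaks the checksum. Both Err(_) and Ok(false) count as rejections; the only outcome the sweep treats as a failure is a mutant that comes back Ok(true).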

What I did NOT check (The “Don’t Sue Me” List)

This is the honesty part. These are deliberately out of scope for v0.1.0.

Constant-time execution: I haven’t done a formal CT audit. In my tests, the timing variation sits at the noise floor (0.05%), but if you’re worried about high-end side-channel attacks on your USB/BLE transport, don’t use this for secret data yet.
Power/EM Analysis: I don’t have a ChipWhisperer lab. Yet.
Upstream Subgroup Math: I trust substrate-bn and bls12_381 for their internal subgroup checks. I haven’t audited their math, only how I call it.
VK Trust: I assume your firmware loads the VK/AIR from a trusted source (like signed flash). If an attacker can swap your VK, you have bigger problems than this verifier.

Known limitations

RP2350 only: This is the only chip I’ve measured. nRF52 or STM32 ports are “todo” items.
Fibonacci-only STARKs: Real-world STARKs (Miden, Risc0) are way heavier. This verifier is currently optimized for light workloads.
No C-ABI: It’s Rust-only for now. If you need a C-header, wait for v0.2.0.

The Learning Trail (Postmortems)

If you’re a grant reviewer asking “does this author learn from mistakes,” read any two of the postmortems. If you’re a wallet vendor asking “do they fix parser bugs before shipping,” read the two STARK postmortems. Both were caught and killed within 24 hours.

I don’t hide my bugs. I document them so they never happen again.

Reporting a vulnerability

Found a way to crash the verifier? Open a GitHub security advisory. I have a 90-day disclosure window, but I’ll probably ship a patch and credit you way sooner.