The Witness Problem: When BigInt Precision Breaks Your Proof

The witness and the circuit computed different numbers. Both were "correct." Here's why that's terrifying — and the three bugs we found that prove it.

What Is the Witness Problem?

In a ZK proof system, the witness is the prover's secret knowledge. The circuit is the set of constraints that verify the computation. They must agree on every intermediate value.

But here's the thing: the witness is computed by regular Rust code (BigInt arithmetic, floating-point conversions, integer division). The circuit operates in a 254-bit prime field (modular arithmetic, field inversion, no truncation). These are fundamentally different computational models.

When they disagree — even by a single bit in a single intermediate value — the proof is invalid. The prover can't generate a valid proof because the witness values don't satisfy the circuit's constraints.

We found three bugs in our witness generator that illustrate exactly how this happens.

Bug 1: The Precision Time Bomb

Our original to_scaled function was simple:

fn to_scaled(v: f64) -> BigInt {
    BigInt::from((v * 1e30) as i128)
}

This looks fine. Multiply by 10^30, cast to integer, wrap in BigInt. And it works perfectly for small values.

But f64 has a 53-bit mantissa — roughly 16 decimal digits of precision. When you multiply a typical physical value by 10^30, the result needs 30+ decimal digits. The cast to i128 captures whatever f64 can represent, and silently drops the rest.

For the surface tension parameter (σ = 0.0728 N/m):

f64:    0.0728 * 1e30 = 72800000000000004194304  (only 16 significant digits)
BigInt: 0.0728 * 1e30 = 72800000000000000000000000000  (exact)

That's a relative error of ~10^-16 — negligible for most applications, catastrophic for a ZK proof. The circuit computes with the exact field element. The witness provides a different value. The constraint a * b - c * S = 0 fails by a tiny amount. The proof is invalid.
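The loss is easy to reproduce without any ZK machinery. A minimal sketch, using i128 in place of BigInt (it's wide enough for this one value) and the σ = 0.0728 example from above:

```rust
fn main() {
    let sigma: f64 = 0.0728;
    // Naive single-step conversion: multiply in f64, then cast.
    let naive = (sigma * 1e30) as i128;
    // The exact scaled value, 728 * 10^26, computed entirely in integers.
    let exact: i128 = 728 * 10i128.pow(26);
    // f64's 53-bit mantissa cannot hold 30 decimal digits, so these differ.
    assert_ne!(naive, exact);
    println!("naive - exact = {}", naive - exact);
}
```

The difference looks negligible in relative terms, but the circuit's constraint system checks equality over the field, where "off by a little" and "off by a lot" fail identically.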

The fix splits the conversion into two stages, each within f64's precision:

fn float_to_scaled_big(v: f64) -> BigInt {
    let int_part = v.trunc() as u64;               // exact for values below 2^53
    let frac_part = v - int_part as f64;
    // Integer digits: scaled by 10^30 entirely in BigInt — no f64 rounding.
    let int_scaled = BigInt::from(int_part) * BigInt::from(10u64).pow(30);
    // Fractional digits: 15 of them, safely within f64's ~16-digit precision.
    let frac_hi = (frac_part * 1e15).round() as i64;
    let frac_scaled = BigInt::from(frac_hi) * BigInt::from(10u64).pow(15);
    int_scaled + frac_scaled
}

The integer part is scaled exactly (the multiplication by 10^30 happens in BigInt — no precision loss). The fractional part is captured to 15 digits, comfortably within f64's capacity, then scaled by 10^15 in BigInt. Combined, the value carries the full 10^30 fixed-point scale without the silent truncation of the single-step cast, so it matches the circuit's representation exactly.
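The same two-stage split can be sketched with i128 standing in for BigInt — a simplification, since i128 tops out near 1.7×10^38 and the real code needs arbitrary precision, but enough to show the idea:

```rust
// Hypothetical i128 sketch of the two-stage conversion; the real code uses BigInt.
fn float_to_scaled_sketch(v: f64) -> i128 {
    let int_part = v.trunc() as i128;                  // exact for |v| < 2^53
    let frac_part = v - v.trunc();
    let int_scaled = int_part * 10i128.pow(30);        // exact integer scaling
    let frac_hi = (frac_part * 1e15).round() as i128;  // 15 digits, within f64 precision
    int_scaled + frac_hi * 10i128.pow(15)
}

fn main() {
    // sigma = 0.0728 now converts to exactly 728 * 10^26.
    assert_eq!(float_to_scaled_sketch(0.0728), 728 * 10i128.pow(26));
}
```

Note that the single-step cast of the same value fails this assertion — the split is what recovers the exact fixed-point representation.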

Bug 2: The Silent Clamp

The witness generator had a safety clamp to prevent the bubble radius from going negative:

// BEFORE (BUGGY)
let r_min = BigInt::from(1u64) * scale() / BigInt::from(1_000_000u64);
if r_curr < r_min {
    r_curr = r_min.clone();
}

The intent was reasonable: if numerical instability drives the radius below a minimum, clamp it instead of producing nonsensical physics.

The problem: the circuit has no such clamp. The circuit computes R_next = 2R - R_prev + R''×dt² faithfully, even if the result is smaller than r_min. When the witness clamps but the circuit doesn't, they diverge. The proof fails.

Worse: the clamp was silent. No error, no warning. The witness just quietly produced different numbers than the circuit expected.
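A toy sketch of the divergence, with made-up i128 values standing in for field elements: the circuit runs the recurrence unclamped, while the witness silently substitutes r_min.

```rust
// Toy numbers, i128 instead of field elements — purely illustrative.
fn circuit_step(r: i128, r_prev: i128, accel_dt2: i128) -> i128 {
    // R_next = 2R - R_prev + R'' * dt^2, computed faithfully with no clamp
    2 * r - r_prev + accel_dt2
}

fn main() {
    let (r_prev, r, accel_dt2) = (100i128, 60, -30);
    let r_min = 25i128;
    let circuit_next = circuit_step(r, r_prev, accel_dt2); // unstable, but faithful
    let witness_next = circuit_next.max(r_min);            // silently clamped
    // The witness and circuit now disagree — and every later step inherits it.
    assert_ne!(witness_next, circuit_next);
}
```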

The fix replaces the clamp with an assertion:

// AFTER (CORRECT)
assert!(r_curr > BigInt::zero(),
    "Radius went non-positive at step {} — reduce timestep or check parameters",
    step);

If the simulation is physically unstable, we want to know immediately — not discover it as a mysterious proof failure.

Bug 3: The Informational Trace

The witness struct returned a radius trace:

pub struct SimulationWitness {
    pub r_trace: Vec<u128>,
    // ...
}

This looked like the circuit would use r_trace[i] to set the radius at each step. But in reality, the circuit only uses r_trace[0] (the initial radius). Every subsequent radius is computed by the circuit itself through constrained field arithmetic.

The values r_trace[1..N] were only used for display purposes — showing peak temperature and minimum radius in the CLI output. But a developer reading the code would naturally assume the trace feeds into the circuit. If they modified the trace (say, smoothing it for numerical stability), the witness would be wrong and the proof would fail — for no obvious reason.

The fix was documentation:

/// `r_trace[0]` is the initial radius loaded into the circuit. All subsequent
/// values `r_trace[1..N]` are informational only (used for peak temperature
/// and min radius display) — the circuit computes its own trace via
/// constrained field arithmetic starting from `r_trace[0]`.
pub struct SimulationWitness {
    pub r_trace: Vec<u128>,
    // ...
}

This isn't a code change. It's a correctness boundary change. Making explicit which values the circuit trusts and which are advisory prevents an entire class of future bugs.

Field vs. Integer: The Deepest Subtlety

Even after fixing precision and clamping, there's a fundamental gap between the witness and the circuit.

The witness computes division as integer truncation:

c = floor(a * b / S)

The circuit computes division as field inversion:

c = a * b * S⁻¹ mod p

These give different results whenever a * b is not exactly divisible by S. The integer version truncates the quotient. The field version multiplies by S's modular inverse, producing a field element that — unless the division is exact — bears no resemblance to the truncated integer.
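The gap is visible even over a toy prime. A sketch with p = 101 standing in for the 254-bit field, computing the inverse via Fermat's little theorem (S⁻¹ = S^(p−2) mod p):

```rust
// Modular inverse over a prime p via Fermat's little theorem: a^(p-2) mod p.
fn inv_mod(a: u128, p: u128) -> u128 {
    let (mut result, mut base, mut exp) = (1u128, a % p, p - 2);
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % p;
        }
        base = base * base % p;
        exp >>= 1;
    }
    result
}

fn main() {
    let (p, s) = (101u128, 7u128);
    let ab = 10u128; // a * b, not divisible by S
    let integer_div = ab / s;               // floor(10 / 7) = 1
    let field_div = ab * inv_mod(s, p) % p; // 10 * 29 mod 101 = 88
    assert_ne!(integer_div, field_div);
}
```

Same inputs, same nominal "division" — and the results aren't even close. Only when S divides a * b exactly do the two models agree.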

For public inputs, this matters: the values the verifier checks must be computed the same way the circuit computes them. Our compute_public_inputs<Fr>() function replays the entire simulation using field arithmetic:

pub fn compute_public_inputs<F: PrimeField>(witness: &SimulationWitness) -> Vec<F> {
    let s = F::from_u128(SCALE);
    let s_inv = s.invert().unwrap();
    // Replay the entire circuit's computation using field ops
    // ...
}

This guarantees the public inputs match what the circuit will produce, regardless of how the BigInt witness was computed.

The Meta-Lesson

Every ZK project has a witness problem. The specific bugs differ, but the pattern is always the same:

  1. Two computational models (witness generator vs. circuit) must agree exactly
  2. Silent divergence is the default — nothing tells you they disagree until proof generation fails
  3. Precision mismatches compound across thousands of steps
  4. Defensive programming (clamps, defaults, fallbacks) actively fights against correctness

The fix is always the same: make the witness generator compute the exact same operations as the circuit, using the same arithmetic. Where that's impossible (because the circuit uses field arithmetic and the witness uses integers), bridge the gap explicitly with field-replay functions.

And document the boundary. Make it impossible to confuse which values feed into the circuit and which are informational. The bugs that kill ZK projects aren't in the math — they're in the assumptions about which code produces which values.


This is Part 5 of our 8-part series. Part 4: 1,088 Bytes to Prove a Star covers the proof pipeline. Next: Part 6: Testing ZK Privacy — how to verify that proofs actually hide what they should.
