Formal Verification at the SoC Level: Why Most Teams Are Doing It Wrong

A few years back, a team I consulted for was six weeks from tape-out on a large networking SoC. Their formal verification dashboard showed 98% proof closures. Impressive. Except they had set their proof depth at 5 cycles for most properties because “the tool was timing out.” The chip came back from silicon with a coherency bug that took 11 cycles to manifest. The formal run had technically completed. The property had technically passed. And nobody had caught that the check was meaningless.

That story isn’t unusual. Across the industry, formal verification has gone from niche technique to checkbox requirement, and somewhere in that transition, many teams lost the thread of what it’s actually supposed to do. This post is about getting that thread back — specifically at the SoC level, where the complexity is highest and the cost of getting it wrong is measured in millions of dollars and six-month respins.

The Problem Nobody Talks About

Formal methods work. That’s not the debate. JasperGold, Synopsys VC Formal, Mentor Questa Formal — these are mature tools with solid mathematical foundations. The problem is adoption: teams treat formal verification as a standalone task to be “completed” rather than as an integral part of a living verification methodology. At the block level, this is manageable. You fire up the FPV app, prove your FIFO never overflows, close the properties, move on. At the SoC level, this approach collapses spectacularly.

SoC-level formal is fundamentally different in three ways. First, the state space is enormous — even with aggressive abstraction, you’re dealing with dozens of interconnected IPs, multiple clock domains, complex power states, and bus fabrics that carry implicit protocol assumptions. Second, the properties you care about most are emergent — they live at the interface between blocks, not within them. Third, the people writing the properties (verification engineers) are often working from specs written by architects who have moved on to the next project, and integration assumptions that were never written down at all.

According to a 2023 Wilson Research Group functional verification study, nearly 60% of design re-spins are caused by integration-level bugs — exactly the class of bug that block-level formal verification won’t catch and that simulation struggles to reliably exercise. SoC-level formal is positioned to close this gap. Most teams aren’t letting it.

How It Usually Goes Wrong

1. Proof Depth Set to “Good Enough”

This is the most common failure mode and the hardest to argue against in a schedule-driven environment. Bounded model checking (BMC) needs a proof depth sufficient to cover the longest observable fault propagation path in your design. For a simple register file, 10 cycles may be fine. For a PCIe controller interacting with a DMA engine through an interconnect with a 32-deep reorder buffer, 10 cycles covers approximately nothing.

When tools time out or run out of memory, the natural engineering response is to reduce the proof bound. But a bounded proof that doesn’t reach the bug distance is not a verification result — it’s a green light on a broken test. If you’re closing proofs at depth 8 and your protocol’s worst-case completion latency is 40 cycles, you’ve proven the design is correct for the first 8 cycles of operation. That’s it.

The right response to timeout is not to reduce depth — it’s to abstract, cut, or decompose. Which brings us to the next failure mode.

2. No Abstraction Strategy

SoC-level formal verification requires aggressive, deliberate abstraction. You cannot run the full netlist or even the full RTL through a monolithic formal run and expect results in a reasonable wall-clock time. You need to decide what to model, what to cut, and what to replace with a formal model (a “blackbox” or an abstract model).

Most teams either do too little abstraction (tool times out, engineers blame formal) or too much (critical behavior is abstracted away, proofs close vacuously). The discipline of abstraction — knowing which signals drive the properties you care about, and modeling only those — is a skill that takes years to develop and is rarely documented in vendor training.

A common mistake: blackboxing a memory controller to speed up proofs, then writing properties that depend on the controller’s ordering guarantees — which are now invisible to the tool. Your property proves under an unconstrained model of memory behavior that has nothing to do with reality.

3. Properties Written in Isolation from Architecture

Here’s a scenario that plays out at almost every large SoC shop: verification engineers write SVA properties for individual IPs based on the IP-level spec. Those properties are reviewed within the verification team. The architects who designed the system-level behavior are busy on the next chip. The protocol spec has seventeen versions in Confluence, and nobody is sure which one is canonical.

The result is a property suite that has excellent coverage of IP-internal behavior and essentially zero coverage of cross-IP, system-level invariants. Nobody wrote a property that says “if a TLB miss is signaled by the SMMU, the downstream DMA engine must not initiate a new transaction until the miss is resolved.” Because nobody sat down with the architect, the memory subsystem lead, and the DMA verification owner in the same room to figure out what the system-level invariants actually are.

This is an organizational problem disguised as a technical one. And it’s fixable.

4. Vacuous Passes from Missing Assumptions

Consider this property:

// Checks that grant is issued within 16 cycles of request
property arb_latency;
  @(posedge clk)
  req |-> ##[1:16] gnt;
endproperty
assert property (arb_latency);

Looks reasonable. But if req is never driven high in the formal context — because there’s no assumption constraining it — the antecedent is never true, and the property passes vacuously on every trace. JasperGold and VC Formal both have built-in vacuity detection, but you have to enable it, and you have to act on the results. A significant fraction of “passing” property suites in production environments contain vacuous proofs. Running jg -vacuity_check (or equivalent) and triaging the results is not optional — it’s hygiene.

What a Good Flow Actually Looks Like

The teams that make SoC-level formal work share a few structural practices that distinguish them from the checkbox-completers.

Start with a connectivity and control-plane audit. Before writing a single SVA property, map out your SoC’s control plane: clock enables, reset sequences, power-state transitions, interrupt routing, and arbitration boundaries. This audit — best done as a structured architecture review with both the design and verification leads — surfaces the system-level invariants that need to be verified. These become your formal property backlog.

Build a layered formal closure plan. Separate your properties into tiers: IP-internal (handled in block-level FPV), IP-boundary (interconnect protocols, handshake correctness), and system-level (multi-agent ordering, power state coherency, reset safety). Each tier has different abstraction requirements and different owners. Don’t try to prove all three in one massive formal run.

Use assume-guarantee decomposition rigorously. When splitting the design for SoC-level formal, specify explicit interface contracts between subsystems. The contract that subsystem A makes at its output becomes the assumption that subsystem B’s formal run consumes. This decomposition is only valid if the contracts are consistent — which means you need a separate (smaller) formal check that A’s assumptions are satisfiable given B’s guarantees. Skip that check, and you have proven the design correct under impossible environmental conditions.

Set proof depth based on observable latencies, not tool defaults. Document the maximum latency for each key protocol interaction (AXI outstanding transaction window, PCIe completion timeout, cache fill latency). Use these numbers to bound your proof depth requirements before you start formal runs. If the tool can’t reach that depth, you either need better hardware, better abstraction, or you scope down the property to what you can actually prove.

Track coverage, not just closure. VC Formal and JasperGold both support cover properties alongside assert properties. Use cover property statements to confirm that interesting scenarios (e.g., back-to-back requests, simultaneous resets, power state transitions under load) are actually reachable in your formal model. If your model can’t reach a scenario that definitely occurs in silicon, your environment is overconstrained and your proofs are meaningless.

Here is a simple but illustrative pattern for tracking reachability alongside your assertions:

// Reachability check — ensures the formal env can exercise this state
cover property (@(posedge clk) disable iff (!rst_n)
  (pwr_state == PWR_ACTIVE) ##1 (pwr_state == PWR_RETENTION));

// Safety property — verified only if above cover is reachable
property no_txn_in_retention;
  @(posedge clk) disable iff (!rst_n)
  (pwr_state == PWR_RETENTION) |-> !dma_req_valid;
endproperty
assert property (no_txn_in_retention);

If the cover fails (unreachable), fix your assumptions before trusting the assert.

The Veteran’s Checklist

Audit your proof depths before sign-off. Every property should have a documented justification for its bound, tied to a real protocol latency, not the tool default.
Run vacuity detection as a required step. Any property that passes vacuously must be flagged, root-caused, and either corrected or waived with justification. Zero tolerance for unreviewed vacuous passes.
Review your abstraction boundaries with the architect. Every blackboxed module needs a sign-off from someone who understands its actual behavior. “We blackboxed the PLL because it was slowing down the run” is not a valid waiver without an explicit documented assumption about what the PLL does.
Track interface contracts explicitly. For every assume-guarantee decomposition, the interface contract should be a versioned document reviewed by both the providing and consuming block owners.
Assign system-level property ownership. Each system-level invariant should have a named owner — not “the verification team.” The person who owns that property is responsible for its correctness, depth, and coverage completeness.
Run cover properties before assert properties. Confirm your formal environment is exercising the right scenarios before you trust the safety results. Cover-first is faster and catches bad assumptions early.
Include reset and power-state sequences in your formal environment. These are the scenarios most likely to be abstracted away and most likely to harbor real bugs. If your formal runs always start from a known-good steady state, you’re leaving a major class of integration bugs on the table.

The Bottom Line

Formal verification at the SoC level is not hard because the math is hard. It’s hard because it requires coordination across architecture, design, and verification teams over a sustained period — and most schedules don’t budget for that coordination. The teams that do it well have made a deliberate organizational choice to treat system-level formal as first-class verification work, not a bolt-on activity that runs in parallel with simulation and produces a dashboard nobody reads.

The teams that are doing it wrong will find out at first silicon, usually at 3 AM, staring at a logic analyzer trace that shows a sequence the simulation regress ran a million times and never once hit. The formal tool would have found it in forty minutes — if the property had been written, the depth set correctly, and the environment constrained to reflect reality. That’s the gap. Closing it is less about tools and more about process, ownership, and the willingness to do the unglamorous work of writing good assumptions before you trust your proofs.