Blocking vs. Non-Blocking: The Assignment That Can Destroy Your Design

There is a single character separating a working design from a broken one. Not a missing semicolon, not a wrong bit-width — just the difference between = and <=. And the cruelest part? Your simulation will often pass either way.

Blocking and non-blocking assignments are the most misunderstood feature in Verilog and SystemVerilog. Senior engineers have taped out silicon with this bug. Understanding why they behave differently is not about memorizing a rule — it is about understanding how hardware actually exists in time.

1. Blocking (=): Sequential, Like Software

A blocking assignment executes immediately and in order. When the simulator hits =, it updates the variable on the spot before moving to the next line. This is exactly how a C variable assignment works, which is why it feels natural to software engineers.
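A minimal simulation-only sketch of that in-order behavior follows. The module name blocking_demo and its variables are made up for illustration; each statement simply sees the result of the one before it.

```systemverilog
// Hypothetical demo module, not part of any design in this article.
module blocking_demo;
    int a, b, c;

    initial begin
        a = 1;        // a is 1 as soon as this statement finishes
        b = a + 1;    // reads the already-updated a, so b is 2
        c = b + 1;    // reads the already-updated b, so c is 3
        $display("a=%0d b=%0d c=%0d", a, b, c);   // prints: a=1 b=2 c=3
    end
endmodule
```

Swap the order of the three statements and the printed values change, which is exactly the property that makes = order-dependent.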

This behavior makes = perfectly correct for combinational logic. In an always_comb block, you want a default value set first, then overridden by conditions — sequentially, top to bottom.

priority_encoder.sv
    // CORRECT: blocking assignment in combinational logic
    module priority_encoder (
        input  logic [3:0] in,
        output logic [1:0] out
    );
        always_comb begin
            out = 2'b00;                   // step 1: set default
            if      (in[3]) out = 2'b11;   // step 2: override if needed
            else if (in[2]) out = 2'b10;
            else if (in[1]) out = 2'b01;
        end
    endmodule

2. Non-Blocking (<=): Parallel, Like Hardware

A non-blocking assignment does something fundamentally different. When the simulator encounters <=, it does two separate things. First, it evaluates the right-hand side (RHS) expression immediately, using the current values of the signals. Second, it defers the update of the left-hand side until every active process in the current time step has run, so all pending non-blocking updates take effect together.

This two-phase behavior is not a quirk — it is a deliberate model of how flip-flops work in real silicon. Every register in your FPGA samples its input on the clock edge and presents the new output after propagation delay. Non-blocking assignments simulate this parallel, simultaneous update.

d_ff.sv
    // CORRECT: non-blocking assignment in sequential logic
    module d_ff (
        input  logic clk,
        input  logic rst,
        input  logic d,
        output logic q
    );
        always_ff @(posedge clk or posedge rst) begin
            if (rst) q <= 1'b0;
            else     q <= d;   // scheduled for end-of-timestep, not immediate
        end
    endmodule
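The two-phase behavior is easiest to see with a register swap. The sketch below is a hypothetical testbench-style module (nba_swap_demo, load, x, and y are illustrative names, not from any real design): both right-hand sides are read before either register changes, so the values trade places.

```systemverilog
// Hypothetical demo: two registers swap contents on one clock edge.
module nba_swap_demo;
    logic       clk = 0;
    logic       load;
    logic [7:0] x, y;

    always_ff @(posedge clk) begin
        if (load) begin
            x <= 8'd1;     // known starting values
            y <= 8'd2;
        end else begin
            x <= y;        // RHS read now: old y (2)
            y <= x;        // RHS read now: old x (1)
        end
    end

    initial begin
        load = 1;
        #5 clk = 1;        // edge 1: x becomes 1, y becomes 2
        #5 clk = 0;
        load = 0;
        #5 clk = 1;        // edge 2: swap
        #1 $display("x=%0d y=%0d", x, y);   // prints: x=2 y=1
        $finish;
    end
endmodule
```

With blocking assignments in the else branch, x = y would overwrite x first and y = x would then copy that new value back, leaving both registers at 2.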

3. The Shift Register Trap: Where Designs Die

Here is the canonical example that has silently corrupted more student and professional designs than any other HDL mistake. A 3-stage shift register. It looks trivial. It is a trap.

When you use blocking assignments inside a clocked block, each statement updates its variable immediately. By the time the simulator reaches c = b, b already holds the new value of a, so c gets a as well, and d follows. On a single clock edge, all three registers collapse to the same value.

shift_reg_WRONG.sv
    // WRONG: blocking in a clocked block; this is NOT a shift register
    always_ff @(posedge clk) begin
        b = a;   // b is updated immediately to a
        c = b;   // c gets the NEW b, which is already a
        d = c;   // d gets the NEW c, which is already a
    end
    // Result after one clock edge: b == a, c == a, d == a
    // This is NOT a shift register. It is an instant copy.

Why simulation hides this: the outcome depends on ordering. Reverse the three statements and the same blocking code appears to shift correctly. Split the stages across separate always blocks and the result depends on the order in which the simulator happens to schedule them, so one simulator can produce a "correct looking" waveform by accident while another simulator, or the synthesized netlist, behaves differently. The bug often becomes obvious only when someone reorders the code or moves to a different tool.
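The scheduling dependence shows up most clearly when the stages live in separate clocked blocks. The fragment below is a sketch of that race, with a made-up module name (race_demo); it is intentionally wrong and is here only to show the pattern to avoid.

```systemverilog
// Hypothetical race: DO NOT copy this pattern into a design.
module race_demo (
    input  logic clk,
    input  logic a,
    output logic b,
    output logic c
);
    // Both blocks wake up on the same clock edge. The language does not
    // define which one runs first.
    always @(posedge clk) b = a;   // updates b immediately

    always @(posedge clk) c = b;   // old b or new b? Depends on which
                                   // block the simulator ran first.
endmodule
```

If the second block runs first, c gets the old b and the waveform looks like a proper two-stage shift. If the first block runs first, c gets a. Replacing both = with <= removes the dependence, because each right-hand side is read before any register is updated.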

The non-blocking version fixes this because all RHS values are frozen at the start of the clock edge. When the simulator evaluates c <= b, it uses the old value of b — not the one being scheduled. This is exactly how real flip-flops behave in silicon.

shift_reg_CORRECT.sv
    // CORRECT: non-blocking; a true 3-stage shift register
    always_ff @(posedge clk) begin
        b <= a;   // RHS captured: a_old
        c <= b;   // RHS captured: b_old (not the new b!)
        d <= c;   // RHS captured: c_old
    end
    // All three updates apply simultaneously at end-of-timestep.
    // Result: data shifts exactly one stage per clock cycle.
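To watch the shift happen, the non-blocking version can be wrapped in a module and driven by a small testbench. This is a sketch under assumptions: the names shift3 and shift3_tb, the single-bit data width, and the 10-time-unit clock period are all chosen for illustration.

```systemverilog
// Hypothetical wrapper around the non-blocking shift register.
module shift3 (
    input  logic clk,
    input  logic a,
    output logic b,
    output logic c,
    output logic d
);
    always_ff @(posedge clk) begin
        b <= a;
        c <= b;
        d <= c;
    end
endmodule

// Hypothetical testbench: sends a single 1 through the register.
module shift3_tb;
    logic clk = 0;
    logic a   = 0;
    logic b, c, d;

    shift3 dut (.*);            // ports connect by matching name

    always #5 clk = ~clk;       // rising edges at 5, 15, 25, ...

    initial begin
        repeat (3) @(negedge clk);   // three edges with a=0 flush the unknowns
        a = 1;                       // change inputs away from the rising edge
        @(negedge clk); $display("b=%b c=%b d=%b", b, c, d);   // b=1 c=0 d=0
        a = 0;
        @(negedge clk); $display("b=%b c=%b d=%b", b, c, d);   // b=0 c=1 d=0
        @(negedge clk); $display("b=%b c=%b d=%b", b, c, d);   // b=0 c=0 d=1
        $finish;
    end
endmodule
```

The single 1 advances exactly one stage per rising edge. Driving a on the falling edge keeps the testbench itself free of the kind of race this article is about.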

4. The Golden Rules (No Exceptions)

After decades of production silicon and millions of lines of RTL, the industry has converged on three absolute rules. Treat them as laws of physics, not guidelines:

  • Use non-blocking (<=) for all assignments inside always_ff (clocked, sequential) blocks.
  • Use blocking (=) for all assignments inside always_comb (combinational) blocks.
  • Never mix both assignment types inside the same always block — ever.

golden_rules.sv
    // Each rule below is a standalone fragment, not one compilable module.

    // Rule 1: always_ff → non-blocking only
    always_ff @(posedge clk) begin
        reg_a <= data_in;   // correct
        reg_b <= reg_a;     // correct
    end

    // Rule 2: always_comb → blocking only
    always_comb begin
        result = reg_b + reg_a;   // correct
    end

    // Rule 3: NEVER do this
    always_ff @(posedge clk) begin
        reg_a <= data_in;   // non-blocking
        temp   = reg_a;     // blocking: reads the OLD reg_a, and any other
                            // clocked block that reads temp races with this one
    end
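One way to untangle the Rule 3 fragment is to give each signal to exactly one kind of block. The sketch below assumes temp was meant to be a plain copy of reg_a; the module name rule3_fixed and the 8-bit widths are assumptions made for illustration.

```systemverilog
// Hypothetical rewrite of the Rule 3 fragment with no mixed assignments.
module rule3_fixed (
    input  logic       clk,
    input  logic [7:0] data_in,
    output logic [7:0] reg_a,
    output logic [7:0] temp
);
    // Sequential: non-blocking only
    always_ff @(posedge clk) begin
        reg_a <= data_in;
    end

    // Combinational: blocking only; temp follows reg_a with no clock delay
    always_comb begin
        temp = reg_a;
    end
endmodule
```

If temp was instead meant to lag reg_a by one clock cycle, it belongs inside the always_ff block as temp <= reg_a, which makes it a second register.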

Final Thoughts: The Syntax Lies, the Hardware Does Not

The reason this bug is so dangerous is that HDL looks like software. Your brain pattern-matches = as “assignment” and moves on. But in hardware, there is no “sequence of operations” — there are only signals that change in time. The <= operator is not just syntax; it is the boundary between the present clock cycle and the next one.

Whenever you open an always_ff block, ask yourself: am I describing what happens this clock cycle, or what will be ready for the next one? If the answer is "next," you need <=. That single mental shift will save you from a class of bugs that a passing simulation can hide until it is too late. Many lint tools can flag a blocking assignment in a clocked block, but only if you run them and read the warnings.


Happy coding.
fpgawizard.com
