There is a single character separating a working design from a broken one. Not a missing semicolon, not a wrong bit-width — just the difference between = and <=. And the cruelest part? Your simulation will often pass either way.
Blocking and non-blocking assignments are among the most misunderstood features of Verilog and SystemVerilog. Senior engineers have taped out silicon with this bug. Understanding why the two behave differently is not about memorizing a rule; it is about understanding how hardware actually exists in time.
1. Blocking (=): Sequential, Like Software
A blocking assignment executes immediately and in order. When the simulator hits =, it updates the variable on the spot before moving to the next line. This is exactly how a C variable assignment works, which is why it feels natural to software engineers.
This behavior makes = perfectly correct for combinational logic. In an always_comb block, you want a default value set first, then overridden by conditions — sequentially, top to bottom.
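As a rough illustration of the default-then-override pattern, here is a minimal sketch of a small priority encoder. The module name prio_enc and its ports are hypothetical, chosen only for this example. Because blocking assignments take effect in order, the last matching condition wins, so the lowest request index has the highest priority.

```systemverilog
// Hypothetical example: prio_enc and its port names are assumptions.
module prio_enc (
  input  logic [2:0] req,
  output logic [1:0] grant_idx,
  output logic       valid
);
  always_comb begin
    // Defaults first, so every output is assigned on every path
    // and no latch is inferred.
    grant_idx = 2'd0;
    valid     = 1'b0;

    // Each later blocking assignment overrides the earlier ones,
    // top to bottom. req[0] is checked last, so it wins.
    if (req[2]) begin grant_idx = 2'd2; valid = 1'b1; end
    if (req[1]) begin grant_idx = 2'd1; valid = 1'b1; end
    if (req[0]) begin grant_idx = 2'd0; valid = 1'b1; end
  end
endmodule
```

With req = 3'b110, the req[2] branch sets grant_idx to 2 and the req[1] branch then overrides it to 1, which is the sequential behavior described above.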
2. Non-Blocking (<=): Parallel, Like Hardware
A non-blocking assignment does something fundamentally different. When the simulator encounters <=, it splits the work into two phases. First, it evaluates the right-hand side (RHS) expression immediately, using the current values of the signals, and schedules the result without writing it to the target. Then, once every active statement in the current time step has run, it applies all the scheduled updates together.
This two-phase behavior is not a quirk — it is a deliberate model of how flip-flops work in real silicon. Every register in your FPGA samples its input on the clock edge and presents the new output after propagation delay. Non-blocking assignments simulate this parallel, simultaneous update.
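One way to see the two phases is a register swap, sketched below under assumed names (swap_regs, load, x_in, y_in are all hypothetical). Both right-hand sides are read before either register changes, so the values exchange cleanly without a temporary variable.

```systemverilog
// Hypothetical example: module and port names are assumptions.
module swap_regs (
  input  logic       clk,
  input  logic       load,
  input  logic [7:0] x_in,
  input  logic [7:0] y_in,
  output logic [7:0] x,
  output logic [7:0] y
);
  always_ff @(posedge clk) begin
    if (load) begin
      x <= x_in;
      y <= y_in;
    end else begin
      // Phase 1: both RHS values are sampled using the pre-edge x and y.
      // Phase 2: both targets are updated together afterwards.
      x <= y;
      y <= x;
    end
  end
endmodule
```

Written with blocking assignments, the second statement would read the already overwritten x, and both registers would end up holding the old y.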
3. The Shift Register Trap: Where Designs Die
Here is the canonical example that has silently corrupted more student and professional designs than any other HDL mistake. A 3-stage shift register. It looks trivial. It is a trap.
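When you use blocking assignments inside a clocked block, each statement updates its variable immediately. Suppose the three statements are a = d_in, then b = a, then c = b (d_in is a name assumed here for the serial input). By the time the simulator reaches the third statement, b already holds the new value of a, so c receives that same value. On a single clock edge, all three registers collapse to the input value, and the three stages behave like one.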
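The broken version might look like the sketch below. The module name shift3_bad and the input d_in are assumptions made for illustration, and the code is wrong on purpose.

```systemverilog
// Deliberately broken, for illustration only. Names are hypothetical.
module shift3_bad (
  input  logic clk,
  input  logic d_in,
  output logic a,
  output logic b,
  output logic c
);
  always_ff @(posedge clk) begin
    a = d_in; // a changes right away
    b = a;    // reads the NEW a, so b becomes d_in
    c = b;    // reads the NEW b, so c becomes d_in too
  end
endmodule
```

After one rising edge, a, b, and c all hold d_in, so the data no longer takes three cycles to reach c. Many lint tools will warn about blocking assignments in a clocked block like this one.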
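The non-blocking version fixes this because every RHS is evaluated with the value its signal held just before the clock edge, and no target changes until all three statements have run. When the simulator evaluates c <= b, it reads the old value of b, not the one being scheduled. This mirrors real flip-flops in silicon: each one captures whatever its neighbor was driving before the edge.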
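A corrected 3-stage shift register could be sketched as follows, again with hypothetical names (shift3_good, d_in). Only the assignment operator changes.

```systemverilog
// Hypothetical example: module and port names are assumptions.
module shift3_good (
  input  logic clk,
  input  logic d_in,
  output logic a,
  output logic b,
  output logic c
);
  always_ff @(posedge clk) begin
    a <= d_in; // scheduled: a takes d_in
    b <= a;    // scheduled: b takes the OLD a
    c <= b;    // scheduled: c takes the OLD b
  end
endmodule
```

Because all three right-hand sides are sampled before any update lands, the statement order inside the block no longer matters, and a value on d_in needs three clock edges to appear on c.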
4. The Golden Rules (No Exceptions)
After decades of production silicon and millions of lines of RTL, the industry has converged on three absolute rules. Treat them as laws of physics, not guidelines:
- Use non-blocking (<=) for all assignments inside always_ff (clocked, sequential) blocks.
- Use blocking (=) for all assignments inside always_comb (combinational) blocks.
- Never mix both assignment types inside the same always block, ever.
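To show the rules working together, here is a minimal sketch of a modulo-10 counter split into a combinational next-state block and a clocked register block. The module name, the ports, and the synchronous active-high reset are all assumptions made for this example.

```systemverilog
// Hypothetical example: names and reset style are assumptions.
module mod10_counter (
  input  logic       clk,
  input  logic       rst,   // synchronous, active-high (assumed)
  input  logic       en,
  output logic [3:0] count
);
  logic [3:0] count_next;

  // Combinational next-state logic: blocking assignments only.
  always_comb begin
    count_next = count;                      // default: hold the value
    if (en) begin
      count_next = (count == 4'd9) ? 4'd0    // wrap after 9
                                   : count + 4'd1;
    end
  end

  // Clocked state register: non-blocking assignments only.
  always_ff @(posedge clk) begin
    if (rst) count <= 4'd0;
    else     count <= count_next;
  end
endmodule
```

Keeping the two kinds of logic in separate blocks means each block uses exactly one assignment type, so the third rule is satisfied by construction.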
Final Thoughts: The Syntax Lies, the Hardware Does Not
The reason this bug is so dangerous is that HDL looks like software. Your brain pattern-matches = as “assignment” and moves on. But in hardware, there is no “sequence of operations” — there are only signals that change in time. The <= operator is not just syntax; it is the boundary between the present clock cycle and the next one.
Whenever you open an always_ff block, ask yourself: am I describing what happens this clock cycle, or what will be ready for the next one? If the answer is “next,” you need <=. That single mental shift will save you from a class of bugs that a passing simulation can hide, and that a waveform may not show you until it is too late. Many lint tools can flag a blocking assignment in a clocked block, but only if you run them and read the warnings.
Happy coding.
fpgawizard.com

