journal 2026-04-28

P05 — the ALU/datapath, hardened the third time at 40 MHz

The first project where slack didn’t just appear. P05 is the substrate for a CPU: an 8 × 8-bit register file (R0 hardwired to zero, RISC convention), a 12-op ALU (ADD/SUB/AND/OR/XOR/SHL/SHR/SAR/MOV/NOT/ADC/SBC), and a 4-bit Z/N/C/V flag register, all stitched together as a 2R1W datapath. ADC and SBC read the carry left by the previous instruction, so the flags have to persist across instructions, as in every classical 8-bit CPU.
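A minimal behavioural sketch of that datapath in Python, not the RTL itself. Assumptions that the notes above don't pin down: SUB/SBC are computed as a + ~b + carry (so C=1 means no borrow), shifts move by one bit, MOV passes operand B through, and every op rewrites all four flags. The names `alu`, `Datapath` and `step` are made up for illustration.

```python
MASK = 0xFF  # 8-bit datapath


def alu(op, a, b, carry_in=0):
    """Model one ALU op; returns (result, flags) with flags as a Z/N/C/V dict."""
    a &= MASK
    b &= MASK
    c = v = 0
    if op in ("ADD", "ADC", "SUB", "SBC"):
        # Assumption: SUB/SBC are a + ~b + cin, so C=1 means "no borrow".
        operand = b if op in ("ADD", "ADC") else (~b & MASK)
        if op in ("ADC", "SBC"):
            cin = carry_in & 1          # carry left by the previous instruction
        else:
            cin = 0 if op == "ADD" else 1
        full = a + operand + cin
        r = full & MASK
        c = full >> 8
        # Signed overflow: both adder inputs differ in sign from the result.
        v = ((a ^ r) & (operand ^ r)) >> 7
    elif op == "AND":
        r = a & b
    elif op == "OR":
        r = a | b
    elif op == "XOR":
        r = a ^ b
    elif op == "SHL":                   # assumption: shifts move by one bit
        r, c = (a << 1) & MASK, a >> 7
    elif op == "SHR":
        r, c = a >> 1, a & 1
    elif op == "SAR":
        r, c = (a >> 1) | (a & 0x80), a & 1
    elif op == "MOV":                   # assumption: MOV passes operand B through
        r = b
    elif op == "NOT":
        r = ~a & MASK
    else:
        raise ValueError(f"unknown op {op!r}")
    return r, {"Z": int(r == 0), "N": r >> 7, "C": c, "V": v}


class Datapath:
    """2R1W datapath: two register reads, one write, flags kept between steps."""

    def __init__(self):
        self.regs = [0] * 8
        self.flags = {"Z": 0, "N": 0, "C": 0, "V": 0}

    def step(self, op, rd, ra, rb):
        result, self.flags = alu(op, self.regs[ra], self.regs[rb], self.flags["C"])
        if rd != 0:                     # R0 is hardwired to zero; writes are dropped
            self.regs[rd] = result


# 16-bit add built from two 8-bit ops: 0x01FF + 0x0001
dp = Datapath()
dp.regs[1], dp.regs[2] = 0xFF, 0x01     # low bytes
dp.regs[3], dp.regs[4] = 0x01, 0x00     # high bytes
dp.step("ADD", 5, 1, 2)                 # low byte wraps to 0x00 and sets C
dp.step("ADC", 6, 3, 4)                 # high byte picks up the stored carry
total = (dp.regs[6] << 8) | dp.regs[5]
print(hex(total))                       # 0x200
```

The two-step add only works because C survives from the ADD into the ADC, which is the reason the flag register exists as state rather than as combinational output.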

What happened

RTL passed first compile. Three harden runs followed.

Run one at 100 MHz: clean DRC, clean LVS, zero antenna violations, hold timing met across all corners, typical-corner setup slack at +1.48 ns. But the slow corner (ss, 100°C, 1.60 V) missed setup with -2.41 ns of slack. The critical path ran through the 12-way ALU op mux, the flag-bit logic, and writeback, all in one cycle. Shipped to the site as a partial result for honesty’s sake.

Run two: dropped the clock to 50 MHz. Still missed the slow corner, with -0.37 ns of setup slack. Annoying: the path is genuinely long.

Run three: 40 MHz lands with +2.49 ns slow-corner setup slack, +1.00 ns hold slack, 1607 cells on the 110 × 110 µm die, DRC=0, LVS=0, antenna=0.
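A back-of-envelope check on the three slow-corner numbers, as a sketch. It assumes setup slack is roughly the clock period minus everything the path needs (data delay plus setup time and any clock uncertainty), so period minus slack gives the time the path was asking for in each run. The function names are made up for illustration; the clocks and slacks are the ones from the three runs.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def implied_requirement_ns(freq_mhz, setup_slack_ns):
    """Time the worst path needed, assuming slack = period - requirement."""
    return period_ns(freq_mhz) - setup_slack_ns


# (clock in MHz, slow-corner setup slack in ns) for runs one, two, three
runs = [(100, -2.41), (50, -0.37), (40, 2.49)]

for freq_mhz, slack_ns in runs:
    need = implied_requirement_ns(freq_mhz, slack_ns)
    print(f"{freq_mhz:>3} MHz: period {period_ns(freq_mhz):5.2f} ns, "
          f"slack {slack_ns:+.2f} ns, path needs about {need:.2f} ns")
```

The implied requirement is not the same in each run (about 12.4, 20.4 and 22.5 ns), so this is not one fixed path measured three times. A plausible reading, not confirmed by anything in the reports quoted here, is that the flow optimizes less aggressively when the constraint is looser.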

The lesson is clean: a single-cycle datapath with no pipelining and a deep ALU mux costs you clock speed. P06’s CPU will split the work across an FSM and crank the clock back up.

After signoff, pulled coordinates from the placed netlist and added five annotations.

The flag-register cluster is the most satisfying — four flops a hundred cells away from the regfile, doing a different job, visibly separate.

Receipts

Project page: /projects/05_alu_datapath/.