journal 2026-04-28

P06 — the first CPU, plus a real UART hanging off it

Six projects on the ladder, half the roadmap. P06 is the first thing that deserves to be called a CPU: P05’s datapath gets a control unit, an instruction ROM, and a 4-state FSM — FETCH → DECODE → EXECUTE → WB.

What happened

The smallest defensible CPU. 5-bit PC walking a 32-entry × 16-bit ROM. 16-op encoding (4-bit opcode, 3 register fields, 8-bit imm for LDI, 5-bit absolute branch target). Branches read the flags left by the most recent flag-writing instruction. HLT parks the FSM in S_HALT.
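A rough Python model of that sequencing, for intuition only. It is not the RTL. The opcode sitting in the top 4 bits, the value 0xF for HLT, and HLT being caught in DECODE are all assumptions made for this sketch. Every other opcode is treated as a no-op, so it only shows how the FSM and the 5-bit PC move.

```python
from enum import Enum


class State(Enum):
    FETCH = 0
    DECODE = 1
    EXECUTE = 2
    WB = 3
    HALT = 4  # stands in for S_HALT


OP_HLT = 0xF  # hypothetical opcode value; the real encoding is not listed here


def run(rom, max_cycles=1000):
    """Step the FSM over a ROM of 16-bit words; return (pc, cycles, retired)."""
    assert len(rom) <= 32, "a 5-bit PC can only address 32 entries"
    state, pc, ir = State.FETCH, 0, 0
    cycles = retired = 0
    while state is not State.HALT and cycles < max_cycles:
        cycles += 1
        if state is State.FETCH:
            ir = rom[pc] & 0xFFFF            # latch the instruction word
            state = State.DECODE
        elif state is State.DECODE:
            opcode = ir >> 12                # assumed: opcode in bits [15:12]
            state = State.HALT if opcode == OP_HLT else State.EXECUTE
        elif state is State.EXECUTE:
            state = State.WB                 # ALU work would happen here
        elif state is State.WB:
            pc = (pc + 1) & 0x1F             # 5-bit PC wraps at 32
            retired += 1
            state = State.FETCH
    return pc, cycles, retired


if __name__ == "__main__":
    # Two placeholder instructions, then HLT.
    print(run([0x0000, 0x0000, 0xF000]))
```

Each non-HLT instruction costs four cycles in this model, which is the 4× cycles-per-instruction figure that comes up in the timing numbers.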

Bring-up caught two RTL bugs:

Three Yosys gotchas during synthesis:

Hardened at 100 MHz: 2659 cells, 130 × 130 µm die, +2.65 ns slow-corner setup slack, +0.94 ns hold slack, DRC=0, LVS=0, antenna=0. The 4-state FSM split delivered as predicted: P05 needed a 25 ns period at the slow corner; P06 ships at 10 ns with 2.65 ns of headroom. That is 2.5× the clock rate in exchange for 4× the cycles per instruction.
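The arithmetic behind those numbers, as a small sketch. The helper names are made up for illustration. "Critical path" here lumps setup time and clock uncertainty into one figure, and the P05 side assumes one instruction per 25 ns cycle, which is what the 4× comparison implies.

```python
def critical_path_ns(period_ns: float, setup_slack_ns: float) -> float:
    """Longest path delay implied by a clock period and its setup slack."""
    return period_ns - setup_slack_ns


def fmax_mhz(period_ns: float, setup_slack_ns: float) -> float:
    """Clock rate at which the setup slack would reach zero."""
    return 1000.0 / critical_path_ns(period_ns, setup_slack_ns)


P05_PERIOD_NS = 25.0
P06_PERIOD_NS = 10.0
P06_SETUP_SLACK_NS = 2.65
P06_CYCLES_PER_INSTR = 4
P05_CYCLES_PER_INSTR = 1  # assumption: P05 does its work in a single cycle

clock_ratio = P05_PERIOD_NS / P06_PERIOD_NS              # 2.5x faster clock
p06_instr_ns = P06_PERIOD_NS * P06_CYCLES_PER_INSTR      # time per instruction
p05_instr_ns = P05_PERIOD_NS * P05_CYCLES_PER_INSTR

if __name__ == "__main__":
    print(f"P06 slow-corner path: {critical_path_ns(P06_PERIOD_NS, P06_SETUP_SLACK_NS):.2f} ns")
    print(f"P06 implied fmax:     {fmax_mhz(P06_PERIOD_NS, P06_SETUP_SLACK_NS):.1f} MHz")
    print(f"clock ratio:          {clock_ratio:.1f}x")
    print(f"ns per instruction:   P05 {p05_instr_ns:.0f}, P06 {p06_instr_ns:.0f}")
```

Under those assumptions the slow-corner path is about 7.35 ns (roughly 136 MHz), and one instruction takes 40 ns on P06 against 25 ns on P05. The faster clock does not by itself make instructions finish sooner.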

Then the punchline: replaced the SAR opcode, which is rarely needed on an 8-bit machine, with OUT, a real CPU instruction that pushes regs[ra] out a hardware UART tx pin. P03’s UART module gets inlined as a sub-module inside top.sv, with new top-level ports baud_div and uart_tx. The CPU’s FSM stalls in WB until the byte finishes transmitting. Re-hardened at 100 MHz with the UART on board: 2333 cells (slightly fewer than the first build’s 2659, because SAR’s barrel shifter went away), +0.32 ns slow-corner setup slack. Tighter than the no-UART build, but it still passes.
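A back-of-envelope for how long that WB stall lasts. It assumes baud_div means "clock cycles per UART bit" and that a frame is 8N1 (10 bits on the wire). The exact meaning of baud_div in the P03 module is not spelled out here, and 115200 baud is just an example rate.

```python
FRAME_BITS_8N1 = 10  # 1 start bit + 8 data bits + 1 stop bit


def baud_div_for(clock_hz: int, baud: int) -> int:
    """Nearest integer number of clock cycles per bit (assumed baud_div meaning)."""
    return round(clock_hz / baud)


def out_stall_cycles(baud_div: int, frame_bits: int = FRAME_BITS_8N1) -> int:
    """Cycles the FSM would sit in WB while one byte is shifted out."""
    if baud_div < 1:
        raise ValueError("baud_div must be at least 1")
    return baud_div * frame_bits


if __name__ == "__main__":
    div = baud_div_for(100_000_000, 115_200)   # example: 100 MHz clock, 115200 baud
    cycles = out_stall_cycles(div)
    print(div, cycles, f"{cycles * 10 / 1000:.1f} us at 10 ns per cycle")
```

On those assumptions a single OUT costs thousands of cycles, so a UART-heavy program spends nearly all of its time parked in WB.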

Demo TB runs Fibonacci(7), pushes each value out via OUT, and a behavioral 8N1 receiver in the testbench samples uart_tx and prints the decoded bytes. Reads like a serial console. New annotation U calls out the UART’s 28-flop cluster along the top edge.
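The receiver idea, sketched in Python rather than in the testbench's own language. This is not the TB code, only the same technique: wait for the falling edge of the start bit, then sample each bit at its midpoint, LSB first. The function names, the cycles_per_bit parameter, and the sample bytes are all illustrative.

```python
def encode_8n1(data, cycles_per_bit):
    """Per-clock tx line levels: idle high, start low, 8 data bits LSB first, stop high."""
    line = [1] * cycles_per_bit                      # idle before the first frame
    for byte in data:
        bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
        for bit in bits:
            line.extend([bit] * cycles_per_bit)
    return line


def decode_8n1(line, cycles_per_bit):
    """Behavioral receiver: find each start edge, sample bits at mid-bit."""
    out = []
    i = 1
    while i < len(line):
        if line[i - 1] == 1 and line[i] == 0:        # falling edge: start bit
            mid = i + cycles_per_bit // 2            # middle of the start bit
            stop = mid + 9 * cycles_per_bit          # middle of the stop bit
            if stop >= len(line):
                break                                # truncated frame
            if line[mid] != 0 or line[stop] != 1:
                i += 1                               # glitch or framing error
                continue
            byte = 0
            for k in range(8):
                byte |= line[mid + (k + 1) * cycles_per_bit] << k
            out.append(byte)
            i = stop                                 # resume inside the stop bit
        else:
            i += 1
    return out


if __name__ == "__main__":
    sample = [0, 1, 1, 2, 3, 5, 8]                   # illustrative bytes only
    wave = encode_8n1(sample, cycles_per_bit=16)
    print(decode_8n1(wave, cycles_per_bit=16))
```

Sampling at mid-bit gives the most margin against edge jitter, which matters less in simulation than on a bench but keeps the model honest.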

Subagent in parallel built a /stack page from scratch — a plain-language tour of iverilog, Yosys, OpenROAD, Magic, KLayout, Netgen, LibreLane, sky130, Nix, the gds_to_glb pipeline. Linked from the nav. (It got rebuilt completely a few hours later; see the site infrastructure entry.)

Receipts

Project page: /projects/06_fsm_cpu/.