A 12-op ALU with an 8-register file and a flag register
(Z N C V) — externally the chip still looks like a
combinational box driven by an “instruction” each cycle, but the regs
and flags hold their values across cycles. This is the building block
project 06 will bolt a state machine on top of to make a tiny CPU.
Clock target dropped from P02/P04’s 100 MHz to 40 MHz (25 ns period). 1,607 cells; 68 flops handle the regfile + ALU result staging + flag register. Max-slew has warnings in the extreme corner only.
The first attempt at this project targeted 100 MHz like P02/P04 and failed setup at the slow PVT corner with −2.41 ns of slack. The critical path runs read-port mux → 12-way ALU op mux → flag-bit logic → writeback in one cycle, about 20 ns at the slow corner. 50 MHz still failed, with −0.37 ns of slack. 40 MHz closes cleanly with +2.49 ns of slack at the slow corner. P06 will reintroduce pipelining and crank the clock back up: when you split decode and execute into separate cycles, each cycle’s combinational path can be roughly half as long.
Architecture
What’s new vs. P04
- Register file. regs[0..7] of 8-bit each. R0 is hardcoded to zero (reads as 0, writes ignored), the same convention RISC ISAs such as MIPS and RISC-V use.
- Flag register. Z N C V captured into a 4-bit register when flag_we is high. ADC/SBC read C from this register, so the flags persist across instructions.
- Two read ports + one write port. The ra/rb/rd triple is the common “2R1W” shape that CPU register files are built around.
- Signed vs unsigned shifts. SHR zero-extends, SAR sign-extends. The V flag is set on signed overflow for arithmetic ops.
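The shift, overflow, and persistent-carry behavior in the list above can be checked in isolation with a few lines of simulation-only SystemVerilog. This is a minimal sketch, not the project's RTL: the module name `flags_shift_sketch` and its signal names are hypothetical, and the 16-bit addition is just one assumed use of a carry flag that persists across instructions.

```systemverilog
// Standalone sketch (hypothetical names; not the project's RTL).
// It replays three behaviors from the list above on fixed operands.
module flags_shift_sketch;
  logic [7:0] a, b, sum;
  logic [7:0] shr_y, sar_y;
  logic       v;
  logic [8:0] lo_w, hi_w;

  initial begin
    // Signed vs unsigned right shift of 0x81.
    a     = 8'h81;
    shr_y = {1'b0, a[7:1]};   // SHR: top bit filled with 0
    sar_y = {a[7], a[7:1]};   // SAR: top bit copies the sign bit
    $display("SHR 0x81 = 0x%02h, SAR 0x81 = 0x%02h", shr_y, sar_y);

    // Signed overflow on ADD: operands share a sign, result does not.
    a   = 8'h7F;
    b   = 8'h01;
    sum = a + b;
    v   = (a[7] == b[7]) && (sum[7] != a[7]);
    $display("0x7F + 0x01 = 0x%02h, V=%0d", sum, v);

    // Persistent carry: 0x01FF + 0x0001 as ADD on the low bytes, then
    // ADC on the high bytes using the carry saved from the first step.
    lo_w = {1'b0, 8'hFF} + {1'b0, 8'h01};
    hi_w = {1'b0, 8'h01} + {1'b0, 8'h00} + {8'h00, lo_w[8]};
    $display("0x01FF + 0x0001 = 0x%02h%02h", hi_w[7:0], lo_w[7:0]);
    $finish;
  end
endmodule
```

Run in any SystemVerilog simulator, this should report 0x40 and 0xC0 for the two shifts, 0x80 with V=1 for the overflow case, and 0x0200 for the chained add.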
Op encoding
| op | mnemonic | result |
|---|---|---|
| 0000 | ADD | rd = a + b |
| 0001 | SUB | rd = a - b |
| 0010 | AND | rd = a & b |
| 0011 | OR | rd = a \| b |
| 0100 | XOR | rd = a ^ b |
| 0101 | SHL | rd = a << 1 |
| 0110 | SHR | rd = a >> 1 (zero-extend) |
| 0111 | SAR | rd = a >>> 1 (sign-extend) |
| 1000 | MOV | rd = a (passthrough) |
| 1001 | NOT | rd = ~a |
| 1010 | ADC | rd = a + b + carry-in |
| 1011 | SBC | rd = a - b - carry-in |
a is regs[ra]. b is regs[rb] when use_imm = 0, or imm[7:0] when use_imm = 1. Opcodes 1100 through 1111 are reserved and currently behave as MOV.
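One way to keep the encoding above in a single place is a package with an enumerated type, plus a one-line mux for the b operand. This is only a sketch under assumptions: the package name `alu_ops_pkg`, the type name `alu_op_e`, and the module name `operand_b_sel` are hypothetical, and the `OP_*` names simply mirror the localparams the testbench defines. The project's RTL matches on raw 4-bit literals instead.

```systemverilog
// Hypothetical package: names are illustrative, not from the project RTL.
package alu_ops_pkg;
  typedef enum logic [3:0] {
    OP_ADD = 4'b0000,
    OP_SUB = 4'b0001,
    OP_AND = 4'b0010,
    OP_OR  = 4'b0011,
    OP_XOR = 4'b0100,
    OP_SHL = 4'b0101,
    OP_SHR = 4'b0110,
    OP_SAR = 4'b0111,
    OP_MOV = 4'b1000,
    OP_NOT = 4'b1001,
    OP_ADC = 4'b1010,
    OP_SBC = 4'b1011
  } alu_op_e;
endpackage

// Operand-B selection as described above: register value or immediate.
module operand_b_sel (
  input  logic [7:0] b_reg,    // value already read from regs[rb]
  input  logic [7:0] imm,      // immediate from the "instruction"
  input  logic       use_imm,  // 1 = immediate, 0 = register
  output logic [7:0] b_data
);
  assign b_data = use_imm ? imm : b_reg;
endmodule
```

A design or testbench would pull the names in with `import alu_ops_pkg::*;`. Encodings 1100 through 1111 are deliberately left out of the enum, since they are reserved.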
RTL
// Project 05: tiny ALU / datapath.
//
// One step up from project 01's pure-comb ALU: this one has a *register
// file* and *flag register*, so the ALU result is something you can
// store and feed back into the next operation. Externally it still
// looks like a combinational box driven by an "instruction" each cycle
// (no FSM, no fetch, no PC) — but the regs and flags hold their values
// across cycles, which is what every CPU and most peripherals are
// built out of.
//
// Architecture:
//
// ┌─────────────┐ ra ┌──────────┐ a ┌──────┐ result ┌────┐
// │ │────────▶│ regfile │─────▶│ │─────────▶│ │
// │ inputs │ rb │ R0..R7 │ b │ ALU │ │ rd │
// │ │────────▶│ 8 × 8b │─────▶│ │ flags │ we │
// │ (op, ra, │ imm └──────────┘ └──────┘ ┌────▶│ │
// │ rb, rd, │ ──── (mux on use_imm) ───── │ └────┘
// │ imm, we) │ │
// └─────────────┘ flag_we
// ▼
// ┌──────────┐
// │ flags │
// │ Z N C V │
// └──────────┘
//
// Register file:
// - 8 registers × 8 bits. R0 is hardcoded to zero (reads as 0; writes
// are silently ignored). RISC convention; very useful as a "throw
// away" destination and as the "0" operand for MOV-via-ADD.
// - Asynchronous read on `ra` and `rb`; synchronous write to `rd`
// when `we` is high.
//
// ALU ops (4-bit `op`):
// 0000 ADD rd = a + b
// 0001 SUB rd = a - b
// 0010 AND rd = a & b
// 0011 OR rd = a | b
// 0100 XOR rd = a ^ b
// 0101 SHL rd = a << 1
// 0110 SHR rd = a >> 1 (logical)
// 0111 SAR rd = $signed(a) >>> 1
// 1000 MOV rd = a (passthrough, ignores b)
// 1001 NOT rd = ~a
// 1010 ADC rd = a + b + carry-in (carry from current flags reg)
// 1011 SBC rd = a - b - carry-in
// 1100..1111 reserved — treat as MOV
//
// Flags:
// Z result == 0
// N result[7] (sign bit, two's-complement)
// C carry-out (ADD/ADC), borrow-out (SUB/SBC), or shifted-out bit
// (SHL/SHR/SAR) ; 0 for all other ops
// V signed overflow on ADD/ADC/SUB/SBC ; 0 for all other ops
//
// The flags register only updates when `flag_we` is high. ADC/SBC
// read the C flag from the *current* register value, before the
// pending update; i.e., the flags reg holds the value from the last
// flag-writing instruction. (This is the usual carry-chaining behavior
// on classic 8-bit CPUs.)
//
// Observability:
// `obs` mirrors register `ra` — set ra=N to peek register N's value
// without affecting anything. ra is a pure read port, no side effects.
//
// What this project teaches that the earlier ones didn't:
// - A **register file** (the heart of every CPU's microarchitecture).
// - **Flag-register state** that lives across cycles.
// - Multiple read ports + one write port — the standard "2R1W" shape.
// - Explicit handling of *signed* vs *unsigned* arithmetic (SAR, V).
`default_nettype none
module top (
input logic clk,
input logic rst_n,
// ---- "instruction" inputs (drive these each cycle) ----
input logic [3:0] op,
input logic [2:0] ra, // read port A address
input logic [2:0] rb, // read port B address
input logic [2:0] rd, // write port address
input logic [7:0] imm, // immediate (alternative to register B)
input logic use_imm, // 1 = use imm as B operand, 0 = use reg[rb]
input logic we, // write-enable for register file
input logic flag_we, // capture the four flags into the flag register
// ---- live outputs ----
output logic [7:0] result, // combinational ALU result
output logic [7:0] obs, // mirrors register[ra] (peek any reg)
output logic [3:0] flags // {Z, N, C, V} — registered
);
// ---- register file ----
// Eight 8-bit registers. R0 is treated as a constant zero — reads
// bypass the storage, writes are dropped. We declare storage for
// all eight to keep the indexing straightforward.
logic [7:0] regs [0:7];
wire [7:0] a_data = (ra == 3'd0) ? 8'h00 : regs[ra];
wire [7:0] b_reg = (rb == 3'd0) ? 8'h00 : regs[rb];
wire [7:0] b_data = use_imm ? imm : b_reg;
// ---- flag register ----
// {Z, N, C, V}. Updated when flag_we is high.
logic [3:0] flags_q;
// ---- ALU (combinational) ----
// We compute the full set of candidate results in parallel and mux
// by `op`. ADD/SUB widen to 9 bits so the carry/borrow falls out as
// bit 8 — same trick project 01 used for its 4-bit add.
wire [8:0] add_w = {1'b0, a_data} + {1'b0, b_data};
wire [8:0] sub_w = {1'b0, a_data} - {1'b0, b_data};
wire cin = flags_q[1]; // C bit of the current flag reg
wire [8:0] adc_w = {1'b0, a_data} + {1'b0, b_data} + {8'h00, cin};
wire [8:0] sbc_w = {1'b0, a_data} - {1'b0, b_data} - {8'h00, cin};
// Carry/overflow per op. V is 0 for non-arithmetic ops. C is 0 for
// logical ops, MOV and NOT; shifts put the shifted-out bit in C.
logic [7:0] alu_y;
logic c_out;
logic v_out;
always_comb begin
alu_y = 8'h00;
c_out = 1'b0;
v_out = 1'b0;
unique case (op)
4'b0000: begin // ADD
alu_y = add_w[7:0];
c_out = add_w[8];
v_out = (a_data[7] == b_data[7]) && (alu_y[7] != a_data[7]);
end
4'b0001: begin // SUB
alu_y = sub_w[7:0];
c_out = sub_w[8]; // borrow-out (1 = borrowed)
v_out = (a_data[7] != b_data[7]) && (alu_y[7] != a_data[7]);
end
4'b0010: alu_y = a_data & b_data; // AND
4'b0011: alu_y = a_data | b_data; // OR
4'b0100: alu_y = a_data ^ b_data; // XOR
4'b0101: begin // SHL
alu_y = {a_data[6:0], 1'b0};
c_out = a_data[7];
end
4'b0110: begin // SHR (logical)
alu_y = {1'b0, a_data[7:1]};
c_out = a_data[0];
end
4'b0111: begin // SAR (arithmetic right)
alu_y = {a_data[7], a_data[7:1]};
c_out = a_data[0];
end
4'b1000: alu_y = a_data; // MOV (passthrough A)
4'b1001: alu_y = ~a_data; // NOT
4'b1010: begin // ADC
alu_y = adc_w[7:0];
c_out = adc_w[8];
v_out = (a_data[7] == b_data[7]) && (alu_y[7] != a_data[7]);
end
4'b1011: begin // SBC
alu_y = sbc_w[7:0];
c_out = sbc_w[8];
v_out = (a_data[7] != b_data[7]) && (alu_y[7] != a_data[7]);
end
default: alu_y = a_data; // reserved → MOV
endcase
end
wire z_out = (alu_y == 8'h00);
wire n_out = alu_y[7];
// ---- sequential storage ----
integer i;
always_ff @(posedge clk or negedge rst_n) begin
if (!rst_n) begin
for (i = 0; i < 8; i = i + 1) regs[i] <= 8'h00;
flags_q <= 4'h0;
end else begin
if (we && rd != 3'd0) regs[rd] <= alu_y;
if (flag_we) flags_q <= {z_out, n_out, c_out, v_out};
end
end
// ---- outputs ----
assign result = alu_y;
assign obs = a_data;
assign flags = flags_q;
endmodule
`default_nettype wire
Testbench
The verifying TB exercises the ALU ops using both register-register
and register-immediate operand forms, checks flag bits on representative
cases (carry-out, signed overflow, zero, negative), and verifies the
gating behavior of we / flag_we.
// Project 05 testbench — verifying TB for the tiny ALU/datapath.
//
// We drive an "instruction" each cycle — set op/ra/rb/rd/imm/use_imm/
// we/flag_we, advance the clock, then check `result`, `obs`, `flags`,
// and (if a write occurred) the destination register on the next
// cycle. The ops are exercised using register-register and
// register-immediate forms; flag bits are checked on representative
// cases for the arithmetic, logical, and shift ops.
//
// Style: a small set of helper tasks pretend we have an instruction
// set, then a single initial block walks through the operations in
// labeled sections. Expected values are explicit constants, with the
// arithmetic worked out in the comment next to each check.
`timescale 1ns/1ps
`default_nettype none
module tb;
// ---- 100 MHz simulation clock (10 ns period) ----
logic clk = 0;
always #5 clk = ~clk;
// ---- DUT I/O ----
logic rst_n;
logic [3:0] op;
logic [2:0] ra;
logic [2:0] rb;
logic [2:0] rd;
logic [7:0] imm;
logic use_imm;
logic we;
logic flag_we;
logic [7:0] result;
logic [7:0] obs;
logic [3:0] flags;
top dut (
.clk (clk),
.rst_n (rst_n),
.op (op),
.ra (ra),
.rb (rb),
.rd (rd),
.imm (imm),
.use_imm (use_imm),
.we (we),
.flag_we (flag_we),
.result (result),
.obs (obs),
.flags (flags)
);
// ALU op constants for readability.
localparam logic [3:0] OP_ADD = 4'b0000;
localparam logic [3:0] OP_SUB = 4'b0001;
localparam logic [3:0] OP_AND = 4'b0010;
localparam logic [3:0] OP_OR = 4'b0011;
localparam logic [3:0] OP_XOR = 4'b0100;
localparam logic [3:0] OP_SHL = 4'b0101;
localparam logic [3:0] OP_SHR = 4'b0110;
localparam logic [3:0] OP_SAR = 4'b0111;
localparam logic [3:0] OP_MOV = 4'b1000;
localparam logic [3:0] OP_NOT = 4'b1001;
localparam logic [3:0] OP_ADC = 4'b1010;
localparam logic [3:0] OP_SBC = 4'b1011;
int errors = 0;
// ---- helpers --------------------------------------------------------
// One-cycle "step": present inputs on the falling edge, wait for the
// next rising edge, then wait 1 ns so registered state has updated
// before the caller checks the outputs.
task automatic step(
input logic [3:0] op_i,
input logic [2:0] ra_i,
input logic [2:0] rb_i,
input logic [2:0] rd_i,
input logic [7:0] imm_i,
input logic use_imm_i,
input logic we_i,
input logic flag_we_i
);
begin
@(negedge clk);
op = op_i;
ra = ra_i;
rb = rb_i;
rd = rd_i;
imm = imm_i;
use_imm = use_imm_i;
we = we_i;
flag_we = flag_we_i;
@(posedge clk);
// small settle so result/flags reflect the post-edge state
#1;
end
endtask
// Loadi: rd = imm. Implemented as ADD R0 + imm (R0 reads as zero, so
// the result is just imm). Asserts we; doesn't touch flags.
// Note: using MOV here wouldn't work — MOV is a-passthrough, not b.
task automatic loadi(input logic [2:0] rd_i, input logic [7:0] val);
step(OP_ADD, 3'd0, 3'd0, rd_i, val, 1'b1, 1'b1, 1'b0);
endtask
// Peek register N onto `obs` without writing or computing flags.
task automatic peek(input logic [2:0] r);
step(OP_MOV, r, 3'd0, 3'd0, 8'h00, 1'b0, 1'b0, 1'b0);
endtask
// Issue an ALU op writing rd (and optionally updating flags), with
// the b operand coming from register rb_i.
task automatic alu_rr(
input logic [3:0] op_i,
input logic [2:0] rd_i,
input logic [2:0] ra_i,
input logic [2:0] rb_i,
input logic cap_flags
);
step(op_i, ra_i, rb_i, rd_i, 8'h00, 1'b0, 1'b1, cap_flags);
endtask
// Same, but with an immediate b operand.
task automatic alu_ri(
input logic [3:0] op_i,
input logic [2:0] rd_i,
input logic [2:0] ra_i,
input logic [7:0] imm_i,
input logic cap_flags
);
step(op_i, ra_i, 3'd0, rd_i, imm_i, 1'b1, 1'b1, cap_flags);
endtask
// Read register r and assert its value matches expected. Uses peek()
// to put it on `obs`, then checks. Doesn't disturb anything else.
task automatic check_reg(input logic [2:0] r, input logic [7:0] exp,
input string label);
begin
peek(r);
if (obs !== exp) begin
$display("FAIL [%s] R%0d: got 0x%02h, expected 0x%02h",
label, r, obs, exp);
errors = errors + 1;
end
end
endtask
task automatic check_flags(input logic [3:0] exp, input string label);
begin
if (flags !== exp) begin
$display("FAIL [%s] flags: got %b (ZNCV), expected %b",
label, flags, exp);
errors = errors + 1;
end
end
endtask
// ---- the actual tests ----------------------------------------------
initial begin
$dumpfile("tb.vcd");
$dumpvars(0, tb);
// sane defaults
op = OP_MOV;
ra = 3'd0;
rb = 3'd0;
rd = 3'd0;
imm = 8'h00;
use_imm = 1'b0;
we = 1'b0;
flag_we = 1'b0;
rst_n = 1'b0;
// Hold reset for a few cycles, then release.
repeat (3) @(posedge clk);
@(negedge clk); rst_n = 1'b1;
// After reset, every register should be zero.
check_reg(3'd1, 8'h00, "post-reset R1");
check_reg(3'd7, 8'h00, "post-reset R7");
check_flags(4'b0000, "post-reset flags");
// R0 must always read zero, even after a write.
loadi(3'd0, 8'hFF);
check_reg(3'd0, 8'h00, "R0 stays zero after write");
// ---- LOADI / MOV ----
loadi(3'd1, 8'h0F);
loadi(3'd2, 8'h11);
loadi(3'd3, 8'h80);
check_reg(3'd1, 8'h0F, "loadi R1");
check_reg(3'd2, 8'h11, "loadi R2");
check_reg(3'd3, 8'h80, "loadi R3");
// ---- ADD ----
// R4 = R1 + R2 = 0x0F + 0x11 = 0x20
alu_rr(OP_ADD, 3'd4, 3'd1, 3'd2, 1'b1);
check_reg(3'd4, 8'h20, "ADD R4=R1+R2");
// Z=0 N=0 C=0 V=0
check_flags(4'b0000, "ADD R4 flags");
// ADD with carry-out: 0xFF + 0x01 = 0x100 → result 0x00, C=1, Z=1
loadi(3'd5, 8'hFF);
alu_ri(OP_ADD, 3'd6, 3'd5, 8'h01, 1'b1);
check_reg(3'd6, 8'h00, "ADD R6=R5+1 wraps");
check_flags(4'b1010, "ADD R5+1 flags Z C"); // {Z N C V} = {1,0,1,0}
// Signed overflow: 0x7F + 0x01 = 0x80, V=1, N=1
loadi(3'd5, 8'h7F);
alu_ri(OP_ADD, 3'd6, 3'd5, 8'h01, 1'b1);
check_reg(3'd6, 8'h80, "ADD 0x7F+1 = 0x80");
check_flags(4'b0101, "ADD 0x7F+1 flags N V"); // Z=0 N=1 C=0 V=1
// ---- SUB ----
// R4 = R2 - R1 = 0x11 - 0x0F = 0x02
loadi(3'd1, 8'h0F);
loadi(3'd2, 8'h11);
alu_rr(OP_SUB, 3'd4, 3'd2, 3'd1, 1'b1);
check_reg(3'd4, 8'h02, "SUB R4=R2-R1");
check_flags(4'b0000, "SUB no-borrow flags");
// SUB with borrow: 0x00 - 0x01 = 0xFF, C=1 (borrow), N=1
alu_ri(OP_SUB, 3'd4, 3'd0, 8'h01, 1'b1);
check_reg(3'd4, 8'hFF, "SUB R4=0-1 wraps to 0xFF");
check_flags(4'b0110, "SUB 0-1 flags N C"); // Z=0 N=1 C=1 V=0
// SUB equal: result == 0, Z=1
loadi(3'd1, 8'h42);
alu_rr(OP_SUB, 3'd4, 3'd1, 3'd1, 1'b1);
check_reg(3'd4, 8'h00, "SUB R-R=0");
check_flags(4'b1000, "SUB R-R flags Z");
// ---- AND / OR / XOR ----
loadi(3'd1, 8'hF0);
loadi(3'd2, 8'h0F);
alu_rr(OP_AND, 3'd3, 3'd1, 3'd2, 1'b1);
check_reg(3'd3, 8'h00, "AND F0 & 0F = 0");
check_flags(4'b1000, "AND zero flag");
alu_rr(OP_OR, 3'd3, 3'd1, 3'd2, 1'b1);
check_reg(3'd3, 8'hFF, "OR F0|0F = FF");
check_flags(4'b0100, "OR negative flag");
alu_ri(OP_XOR, 3'd3, 3'd1, 8'hAA, 1'b1);
check_reg(3'd3, 8'h5A, "XOR F0^AA = 5A");
// ---- SHL / SHR / SAR ----
loadi(3'd1, 8'h81);
alu_rr(OP_SHL, 3'd2, 3'd1, 3'd0, 1'b1);
check_reg(3'd2, 8'h02, "SHL 0x81 = 0x02");
check_flags(4'b0010, "SHL out-bit → C"); // C=top bit before shift
alu_rr(OP_SHR, 3'd2, 3'd1, 3'd0, 1'b1);
check_reg(3'd2, 8'h40, "SHR 0x81 = 0x40");
check_flags(4'b0010, "SHR out-bit → C"); // C=bottom bit before
alu_rr(OP_SAR, 3'd2, 3'd1, 3'd0, 1'b1);
check_reg(3'd2, 8'hC0, "SAR 0x81 = 0xC0");
check_flags(4'b0110, "SAR keeps sign, N=1 C=1");
// ---- NOT ----
loadi(3'd1, 8'h55);
alu_rr(OP_NOT, 3'd2, 3'd1, 3'd0, 1'b1);
check_reg(3'd2, 8'hAA, "NOT 0x55 = 0xAA");
check_flags(4'b0100, "NOT 0x55 flag N");
// ---- ADC / SBC ----
// First do an ADD that sets C=1, then ADC reads it.
loadi(3'd1, 8'hFF);
alu_ri(OP_ADD, 3'd2, 3'd1, 8'h01, 1'b1); // sets C=1, R2=0
check_reg(3'd2, 8'h00, "setup C=1: 0xFF+1 wraps");
// R3 = R0 + R0 + C = 0 + 0 + 1 = 1
alu_rr(OP_ADC, 3'd3, 3'd0, 3'd0, 1'b1);
check_reg(3'd3, 8'h01, "ADC R3=0+0+C");
// After ADC, flags should reflect ADC: Z=0 N=0 C=0 V=0
check_flags(4'b0000, "ADC flags fresh");
// Set C=1 again then SBC: R4 = R0 - R0 - 1 = 0xFF, borrow out C=1
loadi(3'd1, 8'hFF);
alu_ri(OP_ADD, 3'd2, 3'd1, 8'h01, 1'b1); // C=1
alu_rr(OP_SBC, 3'd4, 3'd0, 3'd0, 1'b1);
check_reg(3'd4, 8'hFF, "SBC R4=0-0-C=0xFF");
check_flags(4'b0110, "SBC borrow flags");
// ---- write-enable gating ----
// we=0 must NOT update the destination register.
loadi(3'd1, 8'h11);
step(OP_ADD, 3'd1, 3'd0, 3'd1, 8'h22, 1'b1, 1'b0, 1'b0);
check_reg(3'd1, 8'h11, "we=0 leaves R1 alone");
// flag_we=0 must NOT update flags.
// First force flags to a known value (Z=1 after R-R=0).
loadi(3'd1, 8'h42);
alu_rr(OP_SUB, 3'd2, 3'd1, 3'd1, 1'b1);
check_flags(4'b1000, "flags pre-test = Z");
// Now do something that would change flags, but with flag_we=0.
step(OP_ADD, 3'd1, 3'd0, 3'd2, 8'hFF, 1'b1, 1'b1, 1'b0);
check_flags(4'b1000, "flag_we=0 holds flags");
// ---- summary ----
if (errors == 0) $display("PASS: tiny ALU datapath, all checks ok.");
else $display("FAIL: %0d errors", errors);
$finish;
end
// safety net
initial begin
#1_000_000;
$display("FAIL: testbench timed out");
$finish;
end
endmodule
`default_nettype wire
The demo TB runs a register-trace “program” (the first eight
Fibonacci numbers, then a couple of bitwise ops, then a 0xFF + 1
to show carry-out):
// Project 05 demo testbench — runs a small "program" through the ALU
// datapath and prints each step in a register-trace style. Like
// `make demo` for the earlier projects: it doesn't PASS/FAIL anything,
// it just shows the chip doing something.
//
// The "program" computes the first eight Fibonacci numbers using the
// register file as scratch space, then does a couple of bitwise ops
// to demonstrate flag behavior:
//
// R1 = 1 (loadi)
// R2 = 1 (loadi)
// R3 = R1 + R2 = 2
// R4 = R2 + R3 = 3
// R5 = R3 + R4 = 5
// R6 = R4 + R5 = 8
// R7 = R5 + R6 = 13
//
// -- bitwise --
// R1 = 0xCA, R2 = 0xFE, R3 = R1 & R2, R4 = R1 | R2, R5 = R1 ^ R2
//
// Style choice: keep it pure plain SV, no $display formatting tricks
// beyond the printf-ish `%h`/`%b`.
`timescale 1ns/1ps
`default_nettype none
module tb_demo;
logic clk = 0;
always #5 clk = ~clk;
logic rst_n;
logic [3:0] op;
logic [2:0] ra, rb, rd;
logic [7:0] imm;
logic use_imm, we, flag_we;
logic [7:0] result, obs;
logic [3:0] flags;
top dut (
.clk(clk), .rst_n(rst_n),
.op(op), .ra(ra), .rb(rb), .rd(rd),
.imm(imm), .use_imm(use_imm), .we(we), .flag_we(flag_we),
.result(result), .obs(obs), .flags(flags)
);
localparam logic [3:0] OP_ADD=4'b0000, OP_SUB=4'b0001, OP_AND=4'b0010;
localparam logic [3:0] OP_OR =4'b0011, OP_XOR=4'b0100, OP_MOV=4'b1000;
task automatic step(
input logic [3:0] op_i, input logic [2:0] ra_i,
input logic [2:0] rb_i, input logic [2:0] rd_i,
input logic [7:0] imm_i, input logic use_imm_i,
input logic we_i, input logic flag_we_i
);
begin
@(negedge clk);
op = op_i; ra = ra_i; rb = rb_i; rd = rd_i;
imm = imm_i; use_imm = use_imm_i; we = we_i; flag_we = flag_we_i;
@(posedge clk);
#1;
end
endtask
// ADD R0 + imm puts imm into rd; using MOV here wouldn't work since
// MOV is a-passthrough and we want the immediate.
task automatic loadi(input logic [2:0] r, input logic [7:0] v);
step(OP_ADD, 3'd0, 3'd0, r, v, 1'b1, 1'b1, 1'b0);
endtask
task automatic peek(input logic [2:0] r);
step(OP_MOV, r, 3'd0, 3'd0, 8'h00, 1'b0, 1'b0, 1'b0);
endtask
task automatic peek_print(input logic [2:0] r, input string label);
begin
peek(r);
$display("[alu] %s R%0d = 0x%02h (%0d)", label, r, obs, obs);
end
endtask
task automatic do_add_rr(input logic [2:0] rd_i,
input logic [2:0] ra_i,
input logic [2:0] rb_i,
input string note);
begin
step(OP_ADD, ra_i, rb_i, rd_i, 8'h00, 1'b0, 1'b1, 1'b1);
$display("[alu] R%0d = R%0d + R%0d → 0x%02h (Z=%0d N=%0d C=%0d V=%0d) %s",
rd_i, ra_i, rb_i, result,
flags[3], flags[2], flags[1], flags[0], note);
end
endtask
task automatic do_op_rr(input logic [3:0] op_i,
input string opname,
input logic [2:0] rd_i,
input logic [2:0] ra_i,
input logic [2:0] rb_i);
begin
step(op_i, ra_i, rb_i, rd_i, 8'h00, 1'b0, 1'b1, 1'b1);
$display("[alu] R%0d = R%0d %s R%0d → 0x%02h (Z=%0d N=%0d)",
rd_i, ra_i, opname, rb_i, result, flags[3], flags[2]);
end
endtask
initial begin
$dumpfile("tb_demo.vcd");
$dumpvars(0, tb_demo);
op = 0; ra = 0; rb = 0; rd = 0; imm = 0;
use_imm = 0; we = 0; flag_we = 0;
rst_n = 0;
repeat (3) @(posedge clk);
@(negedge clk); rst_n = 1;
$display("[alu]");
$display("[alu] -- librelane-playground / project 05 / tiny ALU datapath --");
$display("[alu] ops: ADD SUB AND OR XOR SHL SHR SAR MOV NOT ADC SBC");
$display("[alu] regfile: 8 × 8b, R0 hardwired to 0");
$display("[alu]");
$display("[alu] fibonacci(8) using R1..R7:");
loadi(3'd1, 8'h01); peek_print(3'd1, "loadi");
loadi(3'd2, 8'h01); peek_print(3'd2, "loadi");
do_add_rr(3'd3, 3'd1, 3'd2, "");
do_add_rr(3'd4, 3'd2, 3'd3, "");
do_add_rr(3'd5, 3'd3, 3'd4, "");
do_add_rr(3'd6, 3'd4, 3'd5, "");
do_add_rr(3'd7, 3'd5, 3'd6, "");
$display("[alu]");
$display("[alu] bitwise on 0xCA and 0xFE:");
loadi(3'd1, 8'hCA);
loadi(3'd2, 8'hFE);
do_op_rr(OP_AND, "&", 3'd3, 3'd1, 3'd2);
do_op_rr(OP_OR, "|", 3'd4, 3'd1, 3'd2);
do_op_rr(OP_XOR, "^", 3'd5, 3'd1, 3'd2);
$display("[alu]");
$display("[alu] carry-out demo: 0xFF + 0x01:");
loadi(3'd1, 8'hFF);
step(OP_ADD, 3'd1, 3'd0, 3'd2, 8'h01, 1'b1, 1'b1, 1'b1);
$display("[alu] R2 = R1 + 1 → 0x%02h (Z=%0d N=%0d C=%0d V=%0d)",
result, flags[3], flags[2], flags[1], flags[0]);
$display("[alu]");
$finish;
end
endmodule
`default_nettype wire
See also
- Project 04 → previous step.
- Project README