Project No. 15 of 147 on the ladder

RV32I official arch-test smoke

Introduces: official RISC-V arch-test harness, self-check signatures, SRAM-backed memory bus

harden state: last run 2026-04-30
cells: 54,646 non-filler
slack: 14.09 ns setup
area: 4420000 μm² (die) / 4329320 μm² (core)
signoff:
  • DRC: PARTIAL
  • LVS: PASS
  • antenna: PASS

P15 is the first rung where the word “official” finally means something concrete. We run an upstream riscv-arch-test source file through the official framework macros, load the resulting ELF image into our RTL simulator, and get a real PASS.

Status: hardened with a macro DRC waiver. The official rv32i/I/I-nop-00.S from upstream riscv-arch-test revision a7c9930 builds in RVTEST_SELFCHECK mode and passes on the P15 RTL under Icarus Verilog. The current harden top uses four 2 KiB OpenRAM SRAM macros for 8 KiB of total memory. LibreLane run RUN_2026-04-30_10-05-37 produced a final GDS and passed LVS, antenna, setup timing, and hold timing. Raw Magic DRC is FAIL because Magic reports 22294402 DRC errors in the OpenRAM macros; KLayout DRC was NOT RUN. We treat the SRAM as trusted hard IP, matching the P08 macro policy, and move on.

The target

The first acceptance target is intentionally tiny:

| field | value |
|---|---|
| Official source | rv32i/I/I-nop-00.S |
| Upstream revision | a7c9930 |
| Build mode | RVTEST_SELFCHECK |
| Simulator | Icarus Verilog |
| Core-only result | PASS |
| SRAM-top result | PASS |
| Latest harden run | PASS with trusted OpenRAM macro DRC waiver |

Run it:

make -C projects/15_rv32i_arch_test/test

The expected final lines include both acceptance paths:

PASS: P15 official rv32i/I/I-nop-00.S self-check accepted after 761 clocks
PASS: P15 SRAM-backed top acceptance smoke complete.
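
If the test run is scripted, the two acceptance lines can be checked mechanically. The following is a minimal sketch, assuming the make output has been captured as text; the helper name `acceptance_ok` is hypothetical and not part of the project.

```python
# Hypothetical helper: checks captured `make` output for both acceptance lines.
# The prefixes are copied from the expected output above; the clock count is
# not pinned, so only the text before it is matched.
EXPECTED_PASS_PREFIXES = (
    "PASS: P15 official rv32i/I/I-nop-00.S self-check accepted after",
    "PASS: P15 SRAM-backed top acceptance smoke complete.",
)


def acceptance_ok(log_text: str) -> bool:
    """Return True when both PASS lines appear and no FAIL line appears."""
    lines = [line.strip() for line in log_text.splitlines()]
    has_fail = any(line.startswith("FAIL:") for line in lines)
    has_all = all(
        any(line.startswith(prefix) for line in lines)
        for prefix in EXPECTED_PASS_PREFIXES
    )
    return has_all and not has_fail
```

Matching on prefixes rather than whole lines keeps the check stable if the clock count changes between revisions.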

Why P14 could not do this

P14’s local ISA smoke test was useful, but the official framework needs more than a 32-word TT demo core can offer. Even I-nop-00.S pulls in hundreds of instructions of setup, uses all 32 integer registers, and needs a signature/data region. P14 had 16 usable registers, 32 instruction slots, 8 data words, and no signature export path.

P15 changes the shape:

| thing | P14 | P15 |
|---|---|---|
| Registers | RV32E-style x0..x15 | RV32I x0..x31 |
| Memory | 32 instruction words + 8 data words | CPU memory bus, 8 KiB SRAM target |
| Loads/stores | LW, SW only | byte, halfword, and word loads/stores |
| Official test state | 39 built, 0 runnable | 1 built, 1 run, 1 PASS |

The first backend attempt answered the obvious question the hard way: the official 8 KiB memory cannot live inside this learning core as flops. RUN_2026-04-29_15-49-08 was stopped as PARTIAL after Yosys expanded the memory into a huge register/mux problem.
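
A rough count shows why the flop-based memory was a poor fit. This is an idealised estimate and not a Yosys report: it assumes one flop per stored bit and a plain binary tree of 2:1 muxes for a single 32-bit read port, and it ignores write decoding entirely.

```python
# Idealised size estimate for an 8 KiB memory built from flops.
# Assumptions (not taken from any synthesis log): one flop per bit, and one
# read port built as a binary tree of 2:1 muxes per output bit.
MEM_BYTES = 8 * 1024   # memory size stated in the text
WORD_BITS = 32         # bus data width


def flop_memory_estimate(mem_bytes: int, word_bits: int = WORD_BITS) -> dict:
    storage_flops = mem_bytes * 8
    words = storage_flops // word_bits
    # Selecting 1 of N words with 2:1 muxes needs N - 1 muxes per output bit.
    read_mux2 = word_bits * (words - 1)
    return {"storage_flops": storage_flops, "words": words, "read_mux2": read_mux2}


if __name__ == "__main__":
    print(flop_memory_estimate(MEM_BYTES))
```

Under those assumptions the storage alone is 65,536 flops, which already exceeds the 54,646 non-filler cells reported for the finished macro-based design.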

The bus-shell checkpoint then completed with only a 16-byte halt-loop placeholder. That was useful, and RUN_2026-04-30_06-20-57 has clean DRC/LVS/antenna/timing, but it did not carry the 8 KiB memory we wanted.

The current top carries that memory with four 2 KiB OpenRAM macros:

| check | result |
|---|---|
| Run directory | projects/15_rv32i_arch_test/librelane/runs/RUN_2026-04-30_10-05-37 |
| Final GDS | projects/15_rv32i_arch_test/librelane/runs/RUN_2026-04-30_10-05-37/final/gds/top.gds |
| Metrics | projects/15_rv32i_arch_test/librelane/runs/RUN_2026-04-30_10-05-37/final/metrics.json |
| Route DRC | PASS (0) |
| Magic DRC | FAIL (22294402) |
| KLayout DRC | NOT RUN |
| Macro DRC disposition | PARTIAL: trusted OpenRAM macro waiver |
| LVS | PASS (0) |
| Antenna | PASS (0) |
| Setup / hold | PASS / PASS |
| Max slew / cap checker | PARTIAL: 4209 slew and 156 cap warnings |

That is the honest line: the SRAM-backed chip is physically routed and connected. The standard-cell integration is clean enough to move on; raw macro-internal DRC is waived as trusted SRAM IP.

The SRAM bus

The CPU still speaks one byte-addressed request/response bus. The harden top now attaches an 8 KiB SRAM device to that bus.

projects/15_rv32i_arch_test/src/top.sv system-verilog · L19-190
  wire        mem_valid;
  wire        mem_we;
  wire [1:0]  mem_size;
  wire [31:0] mem_addr;
  wire [31:0] mem_wdata;
  wire [3:0]  mem_wstrb;
  wire [31:0] mem_rdata;
  wire        mem_ready;
  wire        mem_error;

  p15_rv32i_arch_core u_core (
    .clk        (clk),
    .rst_n      (rst_n),
    .mem_valid  (mem_valid),
    .mem_we     (mem_we),
    .mem_size   (mem_size),
    .mem_addr   (mem_addr),
    .mem_wdata  (mem_wdata),
    .mem_wstrb  (mem_wstrb),
    .mem_rdata  (mem_rdata),
    .mem_ready  (mem_ready),
    .mem_error  (mem_error),
    .pc_out     (pc_out),
    .x5_out     (x5_out),
    .halted     (halted),
    .illegal    (illegal)
  );

  p15_sram8k_bus_memory u_mem (
    .clk       (clk),
    .rst_n     (rst_n),
    .valid     (mem_valid),
    .we        (mem_we),
    .size      (mem_size),
    .addr      (mem_addr),
    .wdata     (mem_wdata),
    .wstrb     (mem_wstrb),
    .rdata     (mem_rdata),
    .ready     (mem_ready),
    .error     (mem_error)
  );

endmodule


module p15_sram8k_bus_memory (
    input  logic        clk,
    input  logic        rst_n,
    input  logic        valid,
    input  logic        we,
    input  logic [1:0]  size,
    input  logic [31:0] addr,
    input  logic [31:0] wdata,
    input  logic [3:0]  wstrb,
    output logic [31:0] rdata,
    output logic        ready,
    output logic        error
);

  localparam int MEM_BYTES = 8192;

  typedef enum logic [1:0] {
    M_IDLE = 2'd0,
    M_WAIT = 2'd1,
    M_RESP = 2'd2
  } mem_state_t;

  mem_state_t state;

  logic [1:0]  req_size;
  logic [1:0]  req_offset;
  logic [1:0]  req_bank;
  logic        req_we;
  logic        req_error;
  logic [31:0] rdata_q;
  logic        error_q;

  logic [31:0] last_addr;
  always_comb begin
    case (size)
      2'd0:    last_addr = addr;
      2'd1:    last_addr = addr + 32'd1;
      default: last_addr = addr + 32'd3;
    endcase
  end

  wire addr_in_range = (addr < MEM_BYTES) && (last_addr < MEM_BYTES);
  wire access_aligned = (size == 2'd0) ||
                        (size == 2'd1 && addr[0] == 1'b0) ||
                        (size == 2'd2 && addr[1:0] == 2'b00);
  wire req_error_now = valid && (!addr_in_range || !access_aligned);
  wire accept = valid && (state == M_IDLE);

  wire [1:0] bank_now = addr[12:11];
  wire [8:0] word_addr_now = addr[10:2];
  wire [1:0] byte_offset_now = addr[1:0];

  function automatic logic [3:0] lane_mask(
    input logic [1:0] access_size,
    input logic [1:0] byte_offset
  );
    begin
      case (access_size)
        2'd0:    lane_mask = 4'b0001 << byte_offset;
        2'd1:    lane_mask = 4'b0011 << byte_offset;
        default: lane_mask = 4'b1111;
      endcase
    end
  endfunction

  function automatic logic [31:0] aligned_wdata(
    input logic [1:0] access_size,
    input logic [1:0] byte_offset,
    input logic [31:0] raw_wdata
  );
    begin
      aligned_wdata = 32'h0;
      case (access_size)
        2'd0:    aligned_wdata = {24'h0, raw_wdata[7:0]} << (8 * byte_offset);
        2'd1:    aligned_wdata = {16'h0, raw_wdata[15:0]} << (8 * byte_offset);
        default: aligned_wdata = raw_wdata;
      endcase
    end
  endfunction

  function automatic logic [31:0] packed_rdata(
    input logic [1:0] access_size,
    input logic [1:0] byte_offset,
    input logic [31:0] raw_rdata
  );
    begin
      packed_rdata = 32'h0;
      case (access_size)
        2'd0: begin
          case (byte_offset)
            2'd0:    packed_rdata[7:0] = raw_rdata[7:0];
            2'd1:    packed_rdata[7:0] = raw_rdata[15:8];
            2'd2:    packed_rdata[7:0] = raw_rdata[23:16];
            default: packed_rdata[7:0] = raw_rdata[31:24];
          endcase
        end
        2'd1: begin
          if (byte_offset[1]) packed_rdata[15:0] = raw_rdata[31:16];
          else                packed_rdata[15:0] = raw_rdata[15:0];
        end
        default: packed_rdata = raw_rdata;
      endcase
    end
  endfunction

  wire [3:0]  macro_wmask = lane_mask(size, byte_offset_now);
  wire [31:0] macro_wdata = aligned_wdata(size, byte_offset_now, wdata);

  wire bank0_sel = accept && !req_error_now && bank_now == 2'd0;
  wire bank1_sel = accept && !req_error_now && bank_now == 2'd1;
  wire bank2_sel = accept && !req_error_now && bank_now == 2'd2;
  wire bank3_sel = accept && !req_error_now && bank_now == 2'd3;

  wire [31:0] bank0_rdata;
  wire [31:0] bank1_rdata;
  wire [31:0] bank2_rdata;
  wire [31:0] bank3_rdata;

  p15_sram2k_bank u_bank0 (
    .clk   (clk),
    .re    (bank0_sel && !we),
    .we    (bank0_sel && we),
    .wmask (macro_wmask),
    .addr  (word_addr_now),
    .wdata (macro_wdata),
    .rdata (bank0_rdata)
  );
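
The address decode and write-lane logic in the listing above can be cross-checked with a small software model. The sketch below mirrors only the combinational parts (range check, alignment check, bank and word selection, `lane_mask`, `aligned_wdata`) and ignores the `M_IDLE`/`M_WAIT`/`M_RESP` handshake; the function name `decode_request` is hypothetical.

```python
# Software model of the combinational decode in p15_sram8k_bus_memory.
# size encoding follows mem_size: 0 = byte, 1 = halfword, 2 = word.
MEM_BYTES = 8192


def decode_request(addr: int, size: int, wdata: int = 0) -> dict:
    addr &= 0xFFFFFFFF
    span = {0: 1, 1: 2}.get(size, 4)            # bytes touched by the access
    last_addr = (addr + span - 1) & 0xFFFFFFFF  # 32-bit wrap, like the RTL
    in_range = addr < MEM_BYTES and last_addr < MEM_BYTES
    aligned = (
        size == 0
        or (size == 1 and (addr & 0x1) == 0)
        or (size == 2 and (addr & 0x3) == 0)
    )
    offset = addr & 0x3
    if size == 0:
        wmask = (0b0001 << offset) & 0xF
        lane_data = ((wdata & 0xFF) << (8 * offset)) & 0xFFFFFFFF
    elif size == 1:
        wmask = (0b0011 << offset) & 0xF
        lane_data = ((wdata & 0xFFFF) << (8 * offset)) & 0xFFFFFFFF
    else:
        wmask = 0b1111
        lane_data = wdata & 0xFFFFFFFF
    return {
        "error": not (in_range and aligned),
        "bank": (addr >> 11) & 0x3,   # addr[12:11]: one of four 2 KiB banks
        "word": (addr >> 2) & 0x1FF,  # addr[10:2]: one of 512 words per bank
        "offset": offset,             # addr[1:0]: byte lane within the word
        "wmask": wmask,
        "wdata": lane_data,
    }
```

For example, a byte store of 0xAB to address 0x1803 selects bank 3, word 0, with only the top byte lane enabled, and a word access at 0x2000 is flagged as an error because it lies past the 8 KiB range.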

Each bank is one OpenRAM macro:

projects/15_rv32i_arch_test/src/top.sv system-verilog · L290-320
module p15_sram2k_bank (
    input  logic        clk,
    input  logic        re,
    input  logic        we,
    input  logic [3:0]  wmask,
    input  logic [8:0]  addr,
    input  logic [31:0] wdata,
    output logic [31:0] rdata
);

  wire csb0 = ~(re | we);
  wire web0 = ~we;
  wire [31:0] unused_dout1;

  /* verilator lint_off PINMISSING */
  sky130_sram_2kbyte_1rw1r_32x512_8 u_macro (
    .clk0   (clk),
    .csb0   (csb0),
    .web0   (web0),
    .wmask0 (wmask),
    .addr0  (addr),
    .din0   (wdata),
    .dout0  (rdata),
    .clk1   (clk),
    .csb1   (1'b1),
    .addr1  (9'h000),
    .dout1  (unused_dout1)
  );
  /* verilator lint_on PINMISSING */

  wire _unused = &{1'b0, unused_dout1};
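
The wrapper's only logic is the conversion from the active-high `re`/`we` strobes to the macro's active-low `csb0` and `web0` pins. A short sketch of that mapping, written as a truth-table helper (the name `macro_controls` is hypothetical):

```python
# Mirrors `csb0 = ~(re | we)` and `web0 = ~we` from p15_sram2k_bank.
from typing import Tuple


def macro_controls(re: bool, we: bool) -> Tuple[int, int]:
    """Return (csb0, web0); both pins are active low."""
    csb0 = 0 if (re or we) else 1  # chip selected on any read or write
    web0 = 0 if we else 1          # write enabled only when we is set
    return csb0, web0


if __name__ == "__main__":
    for re, we in [(False, False), (True, False), (False, True)]:
        print(re, we, macro_controls(re, we))
```

With neither strobe set the port is deselected (`csb0 = 1`), which is also how the listing parks the unused second read port.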

The physical SRAM is not initialized by reset. The SRAM-top testbench preloads the macro models only for simulation:

projects/15_rv32i_arch_test/test/tb_sram_top.sv system-verilog · L73-112
  initial begin
    $dumpfile("tb_sram_top.vcd");
    $dumpvars(0, tb_sram_top);

    for (int i = 0; i < MEM_BYTES; i++) image[i] = 8'h00;
    $readmemh(MEM_HEX, image);

    preload_bank(0, 0);
    preload_bank(1, 2048);
    preload_bank(2, 4096);
    preload_bank(3, 6144);

    rst_n = 1'b0;
    repeat (6) @(posedge clk);
    @(negedge clk); rst_n = 1'b1;

    begin
      int n;
      n = 0;
      while (!halted && n < 200_000) begin
        @(posedge clk);
        n = n + 1;
      end

      if (!halted) begin
        $display("FAIL: P15 SRAM top I-nop-00.S timed out at pc=0x%08h x5=0x%08h", pc, x5);
        errors = errors + 1;
      end else if (illegal) begin
        $display("FAIL: P15 SRAM top I-nop-00.S hit unsupported instruction or memory access at pc=0x%08h x5=0x%08h", pc, x5);
        errors = errors + 1;
      end else if (x5 !== 32'd1) begin
        $display("FAIL: P15 SRAM top I-nop-00.S halted with x5=0x%08h, expected PASS code 1", x5);
        errors = errors + 1;
      end else begin
        $display("PASS: P15 SRAM top I-nop-00.S self-check accepted after %0d clocks", n);
      end
    end

    if (errors == 0) $display("PASS: P15 SRAM-backed top acceptance smoke complete.");
    else             $display("FAIL: P15 SRAM-backed top acceptance smoke saw %0d errors.", errors);
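
The body of `preload_bank` is not shown in the excerpt. As an illustration of what such a preload has to do, the sketch below splits a byte image into four banks of 512 little-endian 32-bit words, using the same base offsets as the calls above (0, 2048, 4096, 6144). It is an assumption about the layout, based on the bus mapping byte offset 0 to bits [7:0], and not a copy of the testbench task.

```python
# Hypothetical image splitter: byte image -> 4 banks x 512 words.
# Little-endian packing is assumed (byte at the lowest address lands in
# bits [7:0]), consistent with the lane handling in the bus module.
MEM_BYTES = 8192
BANK_BYTES = 2048
WORDS_PER_BANK = BANK_BYTES // 4


def split_image(image: bytes) -> list:
    if len(image) > MEM_BYTES:
        raise ValueError("image does not fit in 8 KiB")
    padded = image.ljust(MEM_BYTES, b"\x00")  # zero-fill, like the image[] init
    banks = []
    for bank in range(MEM_BYTES // BANK_BYTES):
        base = bank * BANK_BYTES
        words = [
            int.from_bytes(padded[base + 4 * w: base + 4 * w + 4], "little")
            for w in range(WORDS_PER_BANK)
        ]
        banks.append(words)
    return banks
```

A byte at image offset 2048 therefore becomes the low byte of word 0 in bank 1, matching `addr[12:11]` as the bank select.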

The LibreLane config pins those four macros into a larger floorplan:

projects/15_rv32i_arch_test/librelane/config.yaml yaml · L21-66
FP_SIZING:    absolute
DIE_AREA:     [0, 0, 2600, 1700]
CORE_AREA:    [10, 10, 2590, 1690]
FP_CORE_UTIL: 25

VDD_NETS:
  - vccd1
GND_NETS:
  - vssd1

MACROS:
  sky130_sram_2kbyte_1rw1r_32x512_8:
    gds:
      - pdk_dir::libs.ref/sky130_sram_macros/gds/sky130_sram_2kbyte_1rw1r_32x512_8.gds
    lef:
      - pdk_dir::libs.ref/sky130_sram_macros/lef/sky130_sram_2kbyte_1rw1r_32x512_8.lef
    nl:
      - pdk_dir::libs.ref/sky130_sram_macros/verilog/sky130_sram_2kbyte_1rw1r_32x512_8.v
    lib:
      "*":
        - pdk_dir::libs.ref/sky130_sram_macros/lib/sky130_sram_2kbyte_1rw1r_32x512_8_TT_1p8V_25C.lib
    instances:
      u_mem.u_bank0.u_macro:
        location: [120, 120]
        orientation: N
      u_mem.u_bank1.u_macro:
        location: [1750, 120]
        orientation: N
      u_mem.u_bank2.u_macro:
        location: [120, 1020]
        orientation: N
      u_mem.u_bank3.u_macro:
        location: [1750, 1020]
        orientation: N

PDN_MACRO_CONNECTIONS:
  - "u_mem.u_bank0.u_macro vccd1 vssd1 vccd1 vssd1"
  - "u_mem.u_bank1.u_macro vccd1 vssd1 vccd1 vssd1"
  - "u_mem.u_bank2.u_macro vccd1 vssd1 vccd1 vssd1"
  - "u_mem.u_bank3.u_macro vccd1 vssd1 vccd1 vssd1"

# The bundled OpenRAM macros have known DRC deck mismatches. Match P08's
# macro-integration setup: complete layout generation, LVS, antenna, STA,
# and metrics, but do not fail the run on macro-internal DRC.
ERROR_ON_MAGIC_DRC: false
RUN_KLAYOUT_DRC: false
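
A quick geometric sanity check of the macro placement can be done outside the flow. The sketch below takes the die, core, and origin coordinates from the config above. The macro footprint of 683.1 × 416.54 µm is an assumption that should be confirmed against the SIZE statement in the macro LEF, and the check treats each `location` as the lower-left corner for orientation N.

```python
# Floorplan sanity check. Coordinates come from config.yaml; the macro
# footprint is an ASSUMPTION to verify against the LEF before relying on it.
from itertools import combinations

DIE = (0.0, 0.0, 2600.0, 1700.0)
CORE = (10.0, 10.0, 2590.0, 1690.0)
MACRO_W, MACRO_H = 683.1, 416.54  # assumed footprint in micrometres
ORIGINS = {
    "u_bank0": (120.0, 120.0),
    "u_bank1": (1750.0, 120.0),
    "u_bank2": (120.0, 1020.0),
    "u_bank3": (1750.0, 1020.0),
}


def rect(origin):
    x, y = origin
    return (x, y, x + MACRO_W, y + MACRO_H)


def inside(inner, outer):
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def overlaps(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def check_floorplan():
    rects = {name: rect(o) for name, o in ORIGINS.items()}
    all_inside = all(inside(r, CORE) for r in rects.values())
    any_overlap = any(overlaps(a, b) for a, b in combinations(rects.values(), 2))
    return all_inside, any_overlap
```

The die rectangle gives 2600 × 1700 = 4,420,000 µm², which matches the die area reported for the run.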

ISA scope

Supported in this RTL step: LUI, AUIPC, JAL, JALR, BEQ, BNE, BLT, BGE, BLTU, BGEU, LB, LH, LW, LBU, LHU, SB, SH, SW, ADDI, SLTI, SLTIU, XORI, ORI, ANDI, SLLI, SRLI, SRAI, ADD, SUB, SLL, SLT, SLTU, XOR, SRL, SRA, OR, AND, and FENCE as a no-op.

Unsupported: traps, exceptions, interrupts, CSRs, ECALL, EBREAK, misalignment trap handling, FENCE.I, multiply/divide, atomics, compressed instructions, privilege modes, and any official test other than the one listed above as actually run.
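
The supported/unsupported split can be approximated by looking at the major opcode field. The sketch below is a coarse classifier for illustration only. It is not the core's decoder, it does not validate every funct3/funct7 combination, and the name `classify` is hypothetical. The opcode values are the standard RV32I base encodings.

```python
# Coarse scope check by major opcode (bits [6:0]), with a few sub-checks.
SUPPORTED_MAJOR = {
    0x37: "LUI", 0x17: "AUIPC", 0x6F: "JAL", 0x67: "JALR",
    0x63: "BRANCH", 0x03: "LOAD", 0x23: "STORE",
    0x13: "OP-IMM", 0x33: "OP",
}


def classify(instr: int):
    """Return (supported, reason) for a 32-bit instruction word."""
    if (instr & 0x3) != 0x3:
        return False, "compressed encoding"
    opcode = instr & 0x7F
    funct3 = (instr >> 12) & 0x7
    funct7 = (instr >> 25) & 0x7F
    if opcode == 0x0F:  # MISC-MEM
        if funct3 == 0:
            return True, "FENCE (treated as no-op)"
        return False, "FENCE.I"
    if opcode == 0x73:
        return False, "SYSTEM (ECALL/EBREAK/CSR)"
    if opcode == 0x2F:
        return False, "atomics"
    if opcode == 0x33 and funct7 == 0x01:
        return False, "multiply/divide"
    if opcode in SUPPORTED_MAJOR:
        return True, SUPPORTED_MAJOR[opcode]
    return False, "unknown major opcode"
```

For instance, the canonical NOP (0x00000013, which is ADDI x0, x0, 0) lands in OP-IMM and is in scope, while ECALL (0x00000073) is rejected as SYSTEM.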

What this proves

It does not prove RV32I compliance. It proves one official unprivileged integer source file can run to PASS on our RTL, and the same image can run through the top-level SRAM bus in simulation.

On the physical side, it proves the 8 KiB SRAM-backed shape can route, close setup/hold, pass antenna, and pass LVS. It does not prove raw full-chip DRC PASS; it uses the same trusted OpenRAM macro DRC waiver policy as P08.