Arch-test · librelane-playground

People say “we ran the compliance suite” like it’s a button. It isn’t. riscv-arch-test is closer to a contract: a pile of test sources, a trusted reference model, and a set of macros every device under test (DUT) is expected to implement. The framework only works if you bring the runner.

This page walks through what the contract is, what generates the “correct” answers, and how our existing harness already implements the DUT side — something we did not always describe accurately on the site.

§ What riscv-arch-test is

The upstream repo lives at riscv-non-isa/riscv-arch-test. What’s inside:

Test sources. Hundreds of small assembly programs, organized by base ISA and extension: rv32i_m/I/, rv32i_m/M/, rv32i_m/Zicsr/, rv32i_m/privilege/, and so on. Each program targets one instruction or one architectural behavior.
Reference outputs. For each test source, a corresponding *.reference_output file containing the exact bytes the test is expected to write into a “signature region” of memory — generated ahead of time by running the test on the formal model.
Macros. A header (arch_test.h) of test-author macros like RVTEST_CASE, TEST_RR_OP, TEST_LOAD, used to generate test bodies in a standardized way.
A DUT contract. A file every target must provide called rvmodel_macros.h (older versions called it model_test.h), which fills in the missing macros: RVMODEL_BOOT, RVMODEL_HALT_PASS / _FAIL, RVMODEL_DATA_BEGIN, RVMODEL_DATA_END. The framework intentionally doesn’t know how your CPU starts up, halts, or reports its signature — that’s the integrator’s job.
A target config directory. A test-config YAML, a linker script, and a sail.json (or equivalent) sitting alongside rvmodel_macros.h. Modern versions keep these under config/cores/<vendor>/<core>/. The framework’s act generator reads the config and produces self-checking ELFs targeted at the DUT.

What it isn’t: a runner, a simulator, an FPGA bitstream, or a self-contained app. There is no riscv-arch-test command that “runs the tests on your chip.” You build the runner.

§ The Sail role

Every *.reference_output file in the suite is generated by running the corresponding test on Sail RISC-V, the formal model adopted by RISC-V International (formerly the RISC-V Foundation) as the official specification of the ISA. Sail is a domain-specific language for ISA semantics; the RISC-V Sail spec is effectively the executable version of the architecture manual.

The flow is:

1. The test author writes the assembly source using the standard macros.
2. They run that source on the Sail model, which dumps the signature region.
3. That signature is checked into the repo as *.reference_output.
4. Anyone who later runs the same test on a real DUT must produce bit-identical bytes in their signature region, or the test fails.

This is why the suite is so strict. The reference isn’t “what the author thinks should happen”; it’s “what the formal spec computes, recorded as bytes.” A DUT that disagrees on a single byte fails the test, even if all of its other visible state matched.
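To make the strictness concrete, here is a minimal sketch of a bit-exact signature comparison. It assumes signatures are stored as text with one 32-bit hex word per line; the function name and the sample words are hypothetical, chosen only for illustration.

```python
def compare_signatures(dut_text: str, ref_text: str) -> list[str]:
    """Return human-readable mismatches; an empty list means the test passes."""
    # Assumption: one hex word per line. Case and surrounding whitespace are ignored.
    dut = [ln.strip().lower() for ln in dut_text.splitlines() if ln.strip()]
    ref = [ln.strip().lower() for ln in ref_text.splitlines() if ln.strip()]

    problems = []
    if len(dut) != len(ref):
        problems.append(f"length: DUT has {len(dut)} words, reference has {len(ref)}")
    for index, (got, want) in enumerate(zip(dut, ref)):
        if got != want:
            problems.append(f"word {index}: DUT {got} != reference {want}")
    return problems


# Hypothetical sample data, not taken from any real test.
reference = "00000005\nfffffffe\n6f5ca309\n"
matching = "00000005\nFFFFFFFE\n6f5ca309\n"
one_byte_off = "00000005\nfffffffe\n6f5ca30a\n"

print(compare_signatures(matching, reference))      # no mismatches
print(compare_signatures(one_byte_off, reference))  # one mismatch, so the test fails
```

There is no tolerance and no notion of "close enough": the second signature differs in a single byte and the comparison reports a failure.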

§ The signature loop

Every test follows the same shape:

[Figure: the arch-test loop. Every test produces the same kind of artifact and is judged the same way.]

The framework’s job stops at “you provide a way to compile and run each ELF, and to produce a signature file that looks like the reference.” Everything else — your simulator, your testbench, your trap handler, your halt mechanism — is yours to wire up.

§ The DUT contract, in detail

A working integration is three small files plus your existing CPU runner:

rvmodel_macros.h — the macros the framework needs, filled in for your CPU:

RVMODEL_BOOT: assembly that runs at reset before each test. It sets up mtvec, disables interrupts, and jumps to the test entry.
RVMODEL_HALT_PASS / RVMODEL_HALT_FAIL: assembly that runs when a test reports it’s done. For us, this writes 1 (pass) or 3 (fail) to tohost, sets x5 to 1 (pass) or 31 (fail), then drops into the halt loop our DUT detects (jal x0, 0).
RVMODEL_DATA_BEGIN / RVMODEL_DATA_END: labels that bracket the signature region in the linker script. The test’s signature-update macros (the RVTEST_SIGUPD family in arch_test.h) store results into this region; whatever’s between the two labels at halt time is the dumped signature (used by the canonical signature-diff verification mode).
RVMODEL_IO_INIT / RVMODEL_IO_WRITE_STR / RVMODEL_IO_ASSERT_*: optional debug reporting. We treat them as no-ops.

A linker script (link.ld) — places .text.init at the DUT’s reset address, lays out the rest of the code, data, and .tohost section. For us, text starts at 0x00000000 to match the external memory model.

A test-config YAML — points the framework at the right compiler, reference-model executable, and DUT plugin directory. We pass ours via --act4-dir to the runner script.

Optionally, a signature comparator — a tiny Python or shell script that reads the dumped memory region and diffs against *.reference_output. We don’t use this today; the in-ELF self-check covers our test population.
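As a rough illustration of the dump half of such a comparator, the sketch below turns the bytes between the two signature labels into text lines. It assumes a flat little-endian memory image and a one-word-per-line output format; `dump_signature`, the addresses, and the image contents are hypothetical and not taken from our harness.

```python
import struct


def dump_signature(memory: bytes, mem_base: int, sig_begin: int, sig_end: int) -> list[str]:
    """Format the bytes in [sig_begin, sig_end) as 32-bit little-endian hex words."""
    if sig_begin < mem_base or sig_end > mem_base + len(memory):
        raise ValueError("signature region lies outside the memory image")
    if sig_end < sig_begin or (sig_end - sig_begin) % 4 != 0:
        raise ValueError("signature region must be a whole number of 32-bit words")

    region = memory[sig_begin - mem_base : sig_end - mem_base]
    words = struct.unpack(f"<{len(region) // 4}I", region)
    return [f"{word:08x}" for word in words]


# Hypothetical 32-byte image at base 0x0; the signature occupies 0x10..0x18.
image = bytes(16) + bytes([0x05, 0x00, 0x00, 0x00, 0xFE, 0xFF, 0xFF, 0xFF]) + bytes(8)
lines = dump_signature(image, mem_base=0x0, sig_begin=0x10, sig_end=0x18)
print("\n".join(lines))
```

In a real flow the two addresses would come from the ELF symbols that RVMODEL_DATA_BEGIN and RVMODEL_DATA_END define, and the resulting lines would be diffed against the matching *.reference_output file.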

§ What we do today (and have been doing since P15)

Our compliance harness already runs the upstream framework with a custom DUT plugin. The naming gets confusing because we call the generated ELFs ACT4 ELFs (after the upstream act generator), so it sounds like a parallel project — it isn’t.

| Our piece | What it actually is |
| --- | --- |
| `scripts/p17_act4_batch.py` | the runner that invokes the upstream `act` generator and then runs each ELF on our DUT |
| `projects/38_arch_test_official/arch_test/rvmodel_macros.h` | the DUT plugin’s halt + IO macros, formerly buried under `projects/26_rv32i_act4_probe/act4/` |
| `projects/38_arch_test_official/arch_test/link.ld` | the DUT plugin’s linker script, placing text/data/.tohost at `0x00000000` |
| `tb_external_mem.sv` + `top.sv` | the DUT |
| upstream `tests/rv32i_m/<EXT>/` | unmodified upstream test sources |
| upstream `config/sail/sail-rv32-max/sail.json` | the Sail reference config we patch into a P17-compatible memory map |
| Sail at `/tmp/sail-riscv-0.10/` | the reference model; signatures match its execution |

The halt convention is a small custom touch. Our DUT recognises jal x0, 0 (machine code 0x0000006f) as a halt loop. The plugin’s RVMODEL_HALT_PASS writes 1 to tohost (so Sail also stops), sets x5 = 1 (so the testbench can report PASS without diffing a signature), then drops into the halt loop.
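The machine code quoted above can be checked by hand. The following sketch encodes a JAL instruction according to the standard RISC-V J-type layout and confirms that jal x0, 0 is 0x0000006f; the helper names `encode_jal` and `is_halt_loop` are hypothetical and only mirror what a testbench would check in hardware.

```python
def encode_jal(rd: int, offset: int) -> int:
    """Encode a RISC-V JAL instruction (J-type) as a 32-bit word."""
    if offset % 2 != 0 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("JAL offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF  # 21-bit two's-complement immediate
    return (
        ((imm >> 20) & 0x1) << 31      # imm[20]
        | ((imm >> 1) & 0x3FF) << 21   # imm[10:1]
        | ((imm >> 11) & 0x1) << 20    # imm[11]
        | ((imm >> 12) & 0xFF) << 12   # imm[19:12]
        | (rd & 0x1F) << 7             # destination register
        | 0x6F                         # JAL opcode
    )


HALT_WORD = encode_jal(rd=0, offset=0)  # jal x0, 0: a jump to itself


def is_halt_loop(instruction_word: int) -> bool:
    """True if the fetched word is the self-jump used as the halt marker."""
    return instruction_word == HALT_WORD


print(f"{HALT_WORD:#010x}")
print(is_halt_loop(0x0000006F), is_halt_loop(0x00000013))
```

Because the offset is zero and the link register is x0, every immediate and register field is zero and only the opcode bits remain, which is why the word is so easy to recognise.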

The reason results still say scoped ACT4/Sail is not “we wrote our own runner” — it’s that the test population we run is a curated subset (rv32i/I, rv32i/M, rv32i/Zicsr, rv32i/Zifencei), and the in-ELF self-check is one of two acceptable verification modes. The canonical one — diffing a dumped signature region against the framework’s *.reference_output — is something we have started infrastructure for (P38 ships a tb_arch_test.sv signature-dump testbench), but that path is not yet the default.

§ What P38 actually changes

The earlier framing on this page suggested that “switching to the upstream framework” was a future rung. That was wrong. The accurate picture:

We have always been on the upstream framework. The DUT plugin moved from a buried directory to a named one in P38.
A single aggregate target, make arch_test, now runs all scoped batches together and writes one SWEEP.md. Same numbers as P37 (39 + 8 + 6 + 1), reported from one place.
P39 added a fifth batch (rv32i/Zicntr, 2 tests at upstream rev a7c9930); the recorded aggregate is now PASS=56 FAIL=0 NOT RUN=0.
The tb_arch_test.sv signature-dump testbench is wired up but not yet used by the default flow — it is infrastructure for any future rung that wants the canonical signature-diff verification mode (e.g. tests where the in-ELF self-check is not available).
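The aggregate figure is just the sum of the per-batch counts. The sketch below reproduces that arithmetic; the `BatchResult` type is hypothetical, and the mapping of counts to batch names is an assumption based on the order in which the batches are listed on this page, not something read from SWEEP.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchResult:
    name: str
    passed: int
    failed: int = 0
    not_run: int = 0


def summary_line(batches: list[BatchResult]) -> str:
    """Collapse per-batch counts into one aggregate line."""
    passed = sum(b.passed for b in batches)
    failed = sum(b.failed for b in batches)
    not_run = sum(b.not_run for b in batches)
    return f"PASS={passed} FAIL={failed} NOT RUN={not_run}"


# Assumed name-to-count mapping (39 + 8 + 6 + 1, then the 2 Zicntr tests from P39).
p38_batches = [
    BatchResult("rv32i/I", 39),
    BatchResult("rv32i/M", 8),
    BatchResult("rv32i/Zicsr", 6),
    BatchResult("rv32i/Zifencei", 1),
]
p39_batches = p38_batches + [BatchResult("rv32i/Zicntr", 2)]

print(summary_line(p38_batches))
print(summary_line(p39_batches))
```

The four P38 batches sum to 54 passing tests, and the two Zicntr tests bring the total to the recorded 56.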

§ What the next credibility step would be

The remaining work to push this from “scoped” to something stronger:

Widen the test population. Run rv32i/privilege, rv32i/ExceptionsTraps, and the per-instruction Sm-class sub-suites we currently exclude. (rv32i/Zicntr was on this list until P39 added it.)
Switch verification to the canonical signature-diff mode. Use tb_arch_test.sv to dump RVMODEL_DATA_BEGIN / _END regions and diff against *.reference_output. This catches classes of bugs the in-ELF self-check ignores by design (e.g. sign-extended write of the right value to the wrong address).
Run a published profile. Today we run the sail-RVI20U32 / sail-rv32-max configs; running a named profile like RVA20U32 (when our extensions reach it) is what compliance language actually means.

None of those are part of P38. They are reasons to keep saying scoped ACT4/Sail on the result tables.