People say “we ran the compliance suite” like it’s a button. It isn’t.
riscv-arch-test is closer to a contract: a pile of test sources, a
trusted reference model, and a set of macros every CPU under test
(DUT) is expected to implement. The framework only works
if you bring the runner.
This page walks through what the contract is, what generates the “correct” answers, and how our existing harness already implements the DUT side — something we did not always describe accurately on the site.
§ What riscv-arch-test is
The upstream repo lives at riscv-non-isa/riscv-arch-test. What’s
inside:
- Test sources. Hundreds of small assembly programs, organized by base ISA and extension: rv32i_m/I/, rv32i_m/M/, rv32i_m/Zicsr/, rv32i_m/privilege/, and so on. Each program targets one instruction or one architectural behavior.
- Reference outputs. For each test source, a corresponding *.reference_output file containing the exact bytes the test is expected to write into a “signature region” of memory, generated ahead of time by running the test on the formal model.
- Macros. A header (arch_test.h) of test-author macros like RVTEST_CASE, TEST_RR_OP, TEST_LOAD, used to generate test bodies in a standardized way.
- A DUT contract. A file every target must provide called rvmodel_macros.h (older versions called it model_test.h), which fills in the missing macros: RVMODEL_BOOT, RVMODEL_HALT_PASS/_FAIL, RVMODEL_DATA_BEGIN, RVMODEL_DATA_END. The framework intentionally doesn’t know how your CPU starts up, halts, or reports its signature; that’s the integrator’s job.
- A target config directory. A test-config YAML, a linker script, and a sail.json (or equivalent) sat alongside rvmodel_macros.h. Modern versions live under config/cores/<vendor>/<core>/. The framework’s act generator reads the config and produces self-checking ELFs targeted at the DUT.
What it isn’t: a runner, a simulator, an FPGA bitstream, or a
self-contained app. There is no riscv-arch-test command that “runs
the tests on your chip.” You build the runner.
§ The Sail role
Every *.reference_output file in the suite is generated by running
the corresponding test on Sail RISC-V, the formal model maintained
under RISC-V International (formerly the RISC-V Foundation). Sail is
a domain-specific language for ISA semantics; the RISC-V Sail spec
is effectively the executable version of the architecture manual.
The flow is:
- Test author writes the assembly source using the standard macros.
- They run that source on the Sail model, which dumps the signature region.
- That signature is checked into the repo as *.reference_output.
- Anyone who later runs the same test on a real DUT must produce bit-identical bytes in their signature region, or the test fails.
This is why the suite is so strict. The reference isn’t “what the author thinks should happen”; it’s “what the formal spec computes, recorded as bytes.” A DUT that disagrees on a single byte fails the test, even if every visible state otherwise matched.
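To make the strictness concrete, here is a minimal sketch of a byte-exact comparison. It assumes a signature dump is plain text with one hexadecimal word per line, which is an assumption about the file format rather than something this page specifies. `parse_signature` and `first_mismatch` are hypothetical helper names, not part of the framework.

```python
from typing import List, Optional, Tuple


def parse_signature(text: str) -> List[int]:
    """Parse a signature dump: one hex word per non-empty line (assumed format)."""
    return [int(line.strip(), 16) for line in text.splitlines() if line.strip()]


def first_mismatch(
    dut: List[int], ref: List[int]
) -> Optional[Tuple[int, Optional[int], Optional[int]]]:
    """Return (index, dut_word, ref_word) for the first difference, or None if identical.

    A missing word on either side (length mismatch) is reported as None for that side.
    """
    for i in range(max(len(dut), len(ref))):
        d = dut[i] if i < len(dut) else None
        r = ref[i] if i < len(ref) else None
        if d != r:
            return (i, d, r)
    return None


# Illustrative data only: the DUT differs from the reference by one bit in the last word.
ref = parse_signature("deadbeef\n00000001\nffffffff\n")
dut = parse_signature("deadbeef\n00000001\nfffffffe\n")

mismatch = first_mismatch(dut, ref)
print("PASS" if mismatch is None else f"FAIL at word {mismatch[0]}")
```

A single differing bit is enough to fail the test here, which mirrors the rule described above: there is no tolerance and no notion of "close enough".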
§ The signature loop
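Every test follows the same shape: build the test into an ELF, run it on the DUT, dump the signature region, and compare that dump with the reference output.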
The framework’s job stops at “you provide a way to compile and run each ELF, and to produce a signature file that looks like the reference.” Everything else — your simulator, your testbench, your trap handler, your halt mechanism — is yours to wire up.
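The division of labour can be sketched as a loop whose moving parts are supplied by the integrator. This is a hedged illustration, not our runner: `run_suite`, `run_on_dut`, and `load_reference` are hypothetical names, the test names are made up for the example, and the two dictionaries stand in for a real simulator and real reference files.

```python
from typing import Callable, Dict, Iterable, List


def run_suite(
    tests: Iterable[str],
    run_on_dut: Callable[[str], List[int]],
    load_reference: Callable[[str], List[int]],
) -> Dict[str, bool]:
    """Run each test on the DUT and compare its signature with the reference."""
    results: Dict[str, bool] = {}
    for name in tests:
        # The framework defines what to compare; how the DUT runs is up to the caller.
        results[name] = run_on_dut(name) == load_reference(name)
    return results


# Stand-ins so the sketch runs without a simulator (both are hypothetical).
reference = {"add-01": [0x3, 0x7], "sub-01": [0x1]}
fake_dut = {"add-01": [0x3, 0x7], "sub-01": [0x2]}  # sub-01 disagrees on purpose

results = run_suite(reference, fake_dut.__getitem__, reference.__getitem__)
for name, ok in results.items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```

Passing the DUT runner in as a callable keeps the comparison logic independent of whether the DUT is an RTL simulation, an emulator, or hardware.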
§ The DUT contract, in detail
A working integration is three small files plus your existing CPU runner:
rvmodel_macros.h — the macros the framework needs, filled in
for your CPU:
- RVMODEL_BOOT: assembly that runs at reset before each test. Set up mtvec, disable interrupts, jump to test entry.
- RVMODEL_HALT_PASS / RVMODEL_HALT_FAIL: assembly that runs when a test reports it’s done. For us, this writes 1/3 to tohost, sets x5 = 1/31, then drops into the halt loop our DUT detects (jal x0, 0).
- RVMODEL_DATA_BEGIN / RVMODEL_DATA_END: labels that bracket the signature region. The test’s signature-update macros (from arch_test.h, not the RVMODEL_IO_* debug macros) write into this region; whatever’s between the two labels at halt time is the dumped signature (used by the canonical signature-diff verification mode).
- RVMODEL_IO_INIT / RVMODEL_IO_WRITE_STR / RVMODEL_IO_ASSERT_*: optional debug reporting. We treat them as no-ops.
A linker script (link.ld) — places .text.init at the DUT’s
reset address, lays out the rest of the code, data, and .tohost
section. For us, text starts at 0x00000000 to match the external
memory model.
A test-config YAML — points the framework at the right compiler,
reference-model executable, and DUT plugin directory. We pass ours via
--act4-dir to the runner script.
Optionally, a signature comparator — a tiny Python or shell script
that reads the dumped memory region and diffs against
*.reference_output. We don’t use this today; the in-ELF self-check
covers our test population.
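For illustration, the dump side of such a comparator could look like the sketch below. It assumes the testbench can hand over a flat memory image as bytes, that the addresses of the begin and end labels are known (for example from the ELF symbol table), and that the reference is one 8-digit hex word per line. All three are assumptions, and `dump_signature` is a hypothetical name, not a script we ship.

```python
import struct
from typing import List


def dump_signature(memory: bytes, base: int, begin: int, end: int) -> List[str]:
    """Return the bytes in [begin, end) as little-endian 32-bit words, one hex string each.

    `base` is the address that memory[0] corresponds to.
    """
    size = end - begin
    if size < 0 or size % 4 != 0:
        raise ValueError("signature region must be a whole number of 32-bit words")
    if begin < base or end - base > len(memory):
        raise ValueError("signature region lies outside the memory image")
    region = memory[begin - base:end - base]
    return [f"{word:08x}" for (word,) in struct.iter_unpack("<I", region)]


# Toy memory image: 16 bytes of padding, two signature words, 8 bytes of padding.
memory = bytes(16) + struct.pack("<II", 0xDEADBEEF, 0x00000001) + bytes(8)
dumped = dump_signature(memory, base=0x0, begin=0x10, end=0x18)

# Illustrative reference content; a real comparator would read the reference file.
reference_lines = ["deadbeef", "00000001"]
print("PASS" if dumped == reference_lines else "FAIL")
```

Once the region is rendered in the same textual form as the reference, the diff itself reduces to a list comparison.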
§ What we do today (and have been doing since P15)
Our compliance harness already runs the upstream framework with a
custom DUT plugin. The naming gets confusing because we call the
generated ELFs ACT4 ELFs (after the upstream act
generator), so it sounds like a parallel project — it isn’t.
| our piece | what it actually is |
|---|---|
| scripts/p17_act4_batch.py | the runner that invokes the upstream act generator and then runs each ELF on our DUT |
| projects/38_arch_test_official/arch_test/rvmodel_macros.h | the DUT plugin’s halt + IO macros, formerly buried under projects/26_rv32i_act4_probe/act4/ |
| projects/38_arch_test_official/arch_test/link.ld | the DUT plugin’s linker script, placing text/data/.tohost at 0x00000000 |
| tb_external_mem.sv + top.sv | the DUT |
| upstream tests/rv32i_m/<EXT>/ | unmodified upstream test sources |
| upstream config/sail/sail-rv32-max/sail.json | the Sail reference config we patch into a P17-compatible memory map |
| Sail at /tmp/sail-riscv-0.10/ | the reference model; signatures match its execution |
The halt convention is a small custom touch. Our DUT recognises
jal x0, 0 (machine code 0x0000006f) as a halt loop. The plugin’s
RVMODEL_HALT_PASS writes 1 to tohost (so Sail also stops),
sets x5 = 1 (so the testbench can report PASS without diffing a
signature), then drops into the halt loop.
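The machine code quoted above can be checked by hand. The sketch below builds a JAL instruction from the standard J-type field layout (imm[20|10:1|11|19:12], rd, opcode 0x6f) and maps the tohost values this page mentions to outcomes. `encode_jal`, `is_halt_loop`, and `decode_tohost` are hypothetical helper names for illustration; they are not part of our testbench.

```python
def encode_jal(rd: int, offset: int) -> int:
    """Encode a RISC-V JAL instruction for a byte offset relative to the instruction."""
    if not 0 <= rd < 32:
        raise ValueError("rd must be a register index in 0..31")
    if offset % 2 != 0 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("JAL offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF  # two's-complement view of the 21-bit immediate
    return (
        ((imm >> 20) & 0x1) << 31      # imm[20]
        | ((imm >> 1) & 0x3FF) << 21   # imm[10:1]
        | ((imm >> 11) & 0x1) << 20    # imm[11]
        | ((imm >> 12) & 0xFF) << 12   # imm[19:12]
        | rd << 7
        | 0x6F                         # JAL opcode
    )


def is_halt_loop(word: int) -> bool:
    """True if the instruction word is `jal x0, 0`, a jump to itself."""
    return word == encode_jal(rd=0, offset=0)


def decode_tohost(value: int) -> str:
    """Map the tohost values described on this page: 1 means pass, 3 means fail."""
    return {1: "PASS", 3: "FAIL"}.get(value, "UNKNOWN")


print(hex(encode_jal(0, 0)), is_halt_loop(0x0000006F), decode_tohost(1))
```

Because rd is x0 and the offset is zero, every field except the opcode is zero, which is why the whole word collapses to the opcode value.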
Results still say scoped ACT4/Sail, but not because “we wrote our
own runner.” The reason is that the test population we run is a
curated subset (rv32i/I, rv32i/M, rv32i/Zicsr, rv32i/Zifencei), and
that the in-ELF self-check is only one of two acceptable verification
modes. We have started infrastructure for the canonical one, diffing
a dumped signature region against the framework’s *.reference_output
(P38 ships a tb_arch_test.sv signature-dump testbench), but that path
is not yet the default.
§ What P38 actually changes
The earlier framing on this page suggested that “switching to the upstream framework” was a future rung. That was wrong. The accurate picture:
- We have always been on the upstream framework. The DUT plugin moved from a buried directory to a named one in P38.
- One make arch_test aggregate target now runs all scoped batches together and writes a single SWEEP.md. Same numbers as P37 (39 + 8 + 6 + 1), reported from one place.
- P39 added a fifth batch (rv32i/Zicntr, 2 tests at upstream rev a7c9930); the recorded aggregate is now PASS=56 FAIL=0 NOT RUN=0.
- The tb_arch_test.sv signature-dump testbench is wired up but not yet used by the default flow; it is infrastructure for any future rung that wants the canonical signature-diff verification mode (e.g. tests where the in-ELF self-check is not available).
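As a quick arithmetic check on the numbers quoted in this list, the per-batch counts do add up to the recorded aggregate. The counts are taken from the text; which count belongs to which extension is not stated here, so the sketch leaves them unlabeled.

```python
# Per-batch test counts quoted for P37 (four batches) and the batch added in P39.
p37_batches = [39, 8, 6, 1]
p39_added = 2

p37_total = sum(p37_batches)
total = p37_total + p39_added

print(f"P37 total: {p37_total}")
print(f"PASS={total} FAIL=0 NOT RUN=0")
```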
§ What the next credibility step would be
The remaining work to push this from “scoped” to something stronger:
- Widen the test population. Run rv32i/privilege, rv32i/Zicntr (after P39), rv32i/ExceptionsTraps, and the per-instruction Sm-class sub-suites we currently exclude.
- Switch verification to the canonical signature-diff mode. Use tb_arch_test.sv to dump RVMODEL_DATA_BEGIN/_END regions and diff against *.reference_output. This catches classes of bugs the in-ELF self-check ignores by design (e.g. sign-extended write of the right value to the wrong address).
- Run a published profile. Today we run the sail-RVI20U32/sail-rv32-max configs; running a named profile like RVA20U32 (when our extensions reach it) is what compliance language actually means.
None of those are part of P38. They are reasons to keep saying scoped ACT4/Sail on the result tables.