No. 59 / project of 147 on the ladder

Linux 6.12.85 boots on a chip we built from scratch

introduces — Sv32 instruction-fetch translation; the chip boots an unmodified mainline Linux kernel

harden state: last run 2026-05-03
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P59 is the rung where the chip stops being “almost a Linux box” and starts being one. It adds Sv32 instruction-fetch translation: about 50 lines of Verilog that route every instruction fetch through the page-table walker the load/store path was already using. With that in place, a stock Linux 6.12.85 LTS kernel image boots on the chip without modification: it prints its version banner, parses its device tree, sets up memory zones, initialises SLUB, configures the timer, and reaches “Run /init”.

Status: RTL pass. Real Linux version banner over UART; kernel runs all the way through architecture init, memory zone setup, clocksource registration, and userspace exec handoff.

The boot trace

This is the actual UART output, captured byte-by-byte from the chip’s testbench. The lines up to “mret to kernel...” are our M-mode firmware (stage-0); everything from “Linux version” onward is the kernel’s own printk going through SBI v0.1 putchar to the same UART. No edits, no truncation.

P58 stage-0 firmware
  PT_BASE     = 0x00010000
  DTB_BASE    = 0x00100000
  KERNEL_BASE = 0x00400000
  page tables built
  satp        = 0x80000010
  mret to kernel...

Linux version 6.12.85 (jadams@solomon) (riscv64-unknown-linux-gnu-gcc (GCC) 15.2.0, GNU ld (GNU Binutils) 2.46) #8 Sun May  3 18:03:15 CDT 2026
OF: fdt: Ignoring memory range 0x0 - 0x400000
Machine model: P58 Linux test platform
SBI specification v0.1 detected
earlycon: sbi0 at I/O port 0x0 (options '')
printk: legacy bootconsole [sbi0] enabled
printk: debug: skip boot console de-registration.
OF: reserved mem: Reserved memory: No reserved-memory node in the DT
Zone ranges:
  Normal   [mem 0x0000000000400000-0x0000000000ffffff]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000400000-0x0000000000ffffff]
Initmem setup node 0 [mem 0x0000000000400000-0x0000000000ffffff]
riscv: base ISA extensions aim
riscv: ELF capabilities aim
pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
pcpu-alloc: [0] 0
Kernel command line: earlycon=sbi console=hvc0 keep_bootcon earlyprintk loglevel=8 panic=-1
Unknown kernel command line parameters "earlyprintk", will be passed to user space.
Dentry cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
Inode-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
Built 1 zonelists, mobility grouping off.  Total pages: 3072
mem auto-init: stack:all(zero), heap alloc:off, heap free:off
SLUB: HWalign=64, Order=0-1, MinObjects=0, CPUs=1, Nodes=1
NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
riscv-intc: 32 local interrupts mapped
clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x5c40939b5, max_idle_ns: 440795202646 ns
sched_clock: 64 bits at 25MHz, resolution 40ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 50.00 BogoMIPS (lpj=100000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
ASID allocator using 9 bits (512 entries)
Memory: 9544K/12288K available (1406K kernel code, 471K rwdata, 172K rodata, 125K init, 176K bss, 2524K reserved, 0K cma-reserved)
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
clocksource: Switched to clocksource riscv_clocksource
workingset: timestamp_bits=30 max_order=12 bucket_order=0
printk: legacy console [hvc0] enabled
clk: Disabling unused clocks
Warning: unable to open an initial console.
Freeing unused kernel image (initmem) memory: 124K
Kernel memory protection not selected by kernel config.
Run /init as init process
  with arguments:
    /init
    earlyprintk
  with environment:
    HOME=/
    TERM=linux

What’s actually happening

A short tour of what each section means:

P58 stage-0 firmware … mret to kernel. Our M-mode boot blob runs from PA 0x0. It builds an Sv32 page table at 0x10000, identity-maps RAM and MMIO, sets medeleg = 0xb1ff to delegate every standard exception except ECALL from S-mode to S-mode (bit 9 stays clear, so the kernel’s own ecalls still trap to M-mode), sets mstatus.MPP = S, points mepc at the kernel entry, and mrets. It also installs itself as the SBI v0.1 firmware: any subsequent ecall from the kernel lands in our trap handler, which dispatches SBI_CONSOLE_PUTCHAR, SBI_SET_TIMER, SBI_SHUTDOWN, etc.
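The two magic numbers in that paragraph can be checked by hand. The sketch below is not the firmware's actual code; the helper names satp_sv32 and delegated are made up for illustration, and only the bit layouts (Sv32 satp, medeleg indexed by exception cause) come from the RISC-V privileged spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sv32 satp layout: MODE[31], ASID[30:22], PPN[21:0]. */
#define SATP_MODE_SV32 (1u << 31)

/* Build a satp value from the root page table's physical address.
 * Hypothetical helper: the real stage-0 may compute this differently. */
uint32_t satp_sv32(uint32_t root_pa, uint32_t asid)
{
    assert((root_pa & 0xfffu) == 0);          /* root table is 4 KiB aligned */
    return SATP_MODE_SV32 | ((asid & 0x1ffu) << 22) | (root_pa >> 12);
}

/* True if exception `cause` is handed to S-mode under this medeleg value. */
bool delegated(uint32_t medeleg, unsigned cause)
{
    return (medeleg >> cause) & 1u;
}
```

With PT_BASE = 0x00010000 and ASID 0, satp_sv32 gives 0x80000010, the value the firmware prints. For medeleg = 0xb1ff, causes 0 through 8 and the three page faults (12, 13, 15) are delegated, while cause 9 (ECALL from S-mode) and cause 11 (ECALL from M-mode) are not.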

Linux version 6.12.85 …. Linus’s printk("Linux version "...) in init/version.c. The string flows through early_printk → hvc_console_print → sbi_console_putchar → an ecall → our M-mode firmware → *MMIO_UART_DATA = c → the UART → the testbench. Every byte you see traversed all of that.
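The firmware end of that chain can be modelled on a host machine. This is a minimal sketch, not the contents of app/main.c: the globals uart_log, mtimecmp and halted stand in for real hardware, and kernel_puts is a made-up stand-in for the kernel side. The extension IDs and the error code are the ones the SBI specification assigns to the legacy v0.1 calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy SBI v0.1 extension IDs, passed in a7. */
enum { SBI_SET_TIMER = 0, SBI_CONSOLE_PUTCHAR = 1, SBI_SHUTDOWN = 8 };
#define SBI_ERR_NOT_SUPPORTED (-2)

/* Hypothetical stand-ins for the UART, the CLINT compare register and power-off. */
char     uart_log[64];
size_t   uart_len;
uint64_t mtimecmp;
int      halted;

/* Handle one ecall from S-mode; the return value goes back in a0. */
long sbi_dispatch(unsigned long a7, unsigned long a0, unsigned long a1)
{
    switch (a7) {
    case SBI_CONSOLE_PUTCHAR:
        if (uart_len < sizeof uart_log)
            uart_log[uart_len++] = (char)a0;  /* firmware: *MMIO_UART_DATA = c */
        return 0;
    case SBI_SET_TIMER:
        /* On RV32 the 64-bit deadline arrives as a0 (low) and a1 (high). */
        mtimecmp = ((uint64_t)(uint32_t)a1 << 32) | (uint32_t)a0;
        return 0;
    case SBI_SHUTDOWN:
        halted = 1;
        return 0;
    default:
        return SBI_ERR_NOT_SUPPORTED;         /* anything we did not build */
    }
}

/* Kernel-side view: one ecall per character. */
void kernel_puts(const char *s)
{
    for (; *s; s++) {
        long ret = sbi_dispatch(SBI_CONSOLE_PUTCHAR, (unsigned char)*s, 0);
        assert(ret == 0);
        (void)ret;
    }
}
```

Calling kernel_puts("Linux version ") leaves those bytes in uart_log in order. An unrecognised extension ID, such as a probe for a newer SBI version, falls through to the default case and gets SBI_ERR_NOT_SUPPORTED.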

OF: fdt: Ignoring memory range 0x0 - 0x400000. Linux read our hand-built device tree blob at 0x100000, found the memory@0 node describing 16 MiB at PA 0, and noticed that the kernel itself was loaded at 0x400000, so it dropped the lower 4 MiB (where stage-0 and the DTB live) from the memory it manages.

SBI specification v0.1 detected. The kernel probed our firmware via the standard SBI handshake. We do not implement the v0.2+ Base extension, so the kernel falls back to v0.1, which is exactly what we built.

riscv: base ISA extensions aim. Our DTS advertises rv32ima_zicsr_zifencei. Linux parsed it, kept i, m, and a (all of which the chip implements), and found no c (we don’t have it). The banner lists the detected single-letter extensions in alphabetical order, which is why it reads “aim” rather than “ima”: A (atomics), I (base integer), M (multiply/divide).

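Memory: 9544K/12288K available. The kernel manages 12 MiB of RAM (16 MiB minus the 4 MiB below its load address, which holds stage-0 and the DTB). The image accounts for 1406 KB of kernel code, 471 KB rwdata, 172 KB rodata, 125 KB initmem, and 176 KB BSS. 2524 KB in total is reserved; that figure includes the image itself plus page tables and other early allocations. 9544 KB is free at this point in boot.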
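The 12288K figure and the “Total pages: 3072” line earlier in the trace follow from the memory map by simple arithmetic. A small sketch, with made-up helper names, assuming 4 KiB pages:

```c
#include <assert.h>
#include <stdint.h>

#define KIB       1024u
#define MIB       (1024u * KIB)
#define PAGE_SIZE (4u * KIB)

/* RAM the kernel manages, in KiB: from its load address to the end of RAM. */
uint32_t managed_kib(uint32_t ram_bytes, uint32_t kernel_base)
{
    assert(kernel_base <= ram_bytes);
    return (ram_bytes - kernel_base) / KIB;
}

/* Number of 4 KiB pages in `kib` KiB. */
uint32_t pages_for_kib(uint32_t kib)
{
    return kib / (PAGE_SIZE / KIB);
}
```

managed_kib(16 * MIB, 0x00400000) is 12288, and pages_for_kib(12288) is 3072, matching both lines of the trace.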

Switched to clocksource riscv_clocksource. Linux now reads time from the RISC-V time CSR (which our chip implements as an mtime alias inside the CLINT region at 0x02000000, following the SiFive convention).

Run /init as init process. End of architecture init. Linux is about to execve() PID 1.

Why this rung needed an RTL change

P58 had everything else: M↔S privilege handoff, full SBI v0.1 dispatch, Sv32 walker for load/store, hardware A/D updates, CLINT timer, identity-mapped page tables. But the kernel still parked inside relocate_enable_mmu — the exact moment it tries to transition from physical-PC to virtual-PC.

The mechanism Linux uses for that transition is deliberate:

  1. Set stvec to a virtual address (the next instruction after the satp swap).
  2. Write satp = trampoline_pg_dir.
  3. Expect an instruction-fetch page fault on the next fetch (because the trampoline doesn’t map the physical PC).
  4. The fault redirects PC to stvec, which is mapped in the trampoline. Now PC is virtual.

Step 3 requires the hardware to walk the page table on instruction fetch. P58 didn’t — it used mem_addr = pc directly, bypassing the MMU for fetches.

P59 adds ~50 lines that route fetches through the same S_PTW1 / S_PTW0 walker the load/store path uses, with three differences: permission check on X (not R/W), faults raise MCAUSE_INSTR_PAGE_FAULT (cause 12), and the A/D update covers only A (D is store-only). On success the translated PA lands in fetch_pa_q and the fetch issues against that address. There’s a single-instruction “TLB” (fetch_xlated_q) that holds the translation across the fetch, then clears.
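For readers who think in software, here is a rough C model of what the fetch path now does. It is a sketch of the Sv32 walk as the privileged spec describes it, not a transcription of src/top.sv: the flat mem array, the fetch_result struct and the map_megapage helper are all hypothetical, physical addresses are truncated to 32 bits, and the U bit and privilege checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sv32 PTE flag bits. */
#define PTE_V (1u << 0)
#define PTE_R (1u << 1)
#define PTE_W (1u << 2)
#define PTE_X (1u << 3)
#define PTE_A (1u << 6)
#define CAUSE_INSTR_PAGE_FAULT 12u

/* Hypothetical flat memory: 256 KiB of word-addressed RAM starting at PA 0. */
#define MEM_WORDS (64u * 1024u)
uint32_t mem[MEM_WORDS];

typedef struct {
    bool     fault;   /* true: raise `cause` instead of fetching */
    uint32_t cause;
    uint32_t pa;      /* valid only when fault is false */
} fetch_result;

/* Install a 4 MiB leaf PTE in the root table (helper for experiments). */
void map_megapage(uint32_t root_pa, uint32_t va, uint32_t pa, uint32_t flags)
{
    assert((pa & 0x3fffffu) == 0);
    mem[(root_pa >> 2) + (va >> 22)] = ((pa >> 12) << 10) | flags | PTE_V;
}

/* Translate an instruction-fetch virtual address. */
fetch_result fetch_translate(uint32_t satp, uint32_t va)
{
    fetch_result r = { true, CAUSE_INSTR_PAGE_FAULT, 0 };
    if (!(satp >> 31)) {                      /* MODE = Bare: no translation */
        r.fault = false; r.cause = 0; r.pa = va;
        return r;
    }
    uint32_t table = (satp & 0xfffffu) << 12; /* root PPN -> table PA */
    for (int level = 1; level >= 0; level--) {
        uint32_t vpn = (va >> (12 + 10 * level)) & 0x3ffu;
        uint32_t idx = (table >> 2) + vpn;    /* word index of the PTE */
        if (idx >= MEM_WORDS)
            return r;                         /* model: out-of-range table faults */
        uint32_t pte = mem[idx];
        if (!(pte & PTE_V) || (!(pte & PTE_R) && (pte & PTE_W)))
            return r;                         /* invalid or reserved encoding */
        if (pte & (PTE_R | PTE_X)) {          /* leaf PTE */
            uint32_t ppn = pte >> 10;
            if (!(pte & PTE_X))
                return r;                     /* fetch checks X, not R/W */
            if (level == 1 && (ppn & 0x3ffu))
                return r;                     /* misaligned 4 MiB superpage */
            mem[idx] = pte | PTE_A;           /* set A only; D is for stores */
            uint32_t off_mask = level ? 0x3fffffu : 0xfffu;
            r.fault = false; r.cause = 0;
            r.pa = ((ppn << 12) & ~off_mask) | (va & off_mask);
            return r;
        }
        table = (pte >> 10) << 12;            /* pointer to the next level */
    }
    return r;                                 /* pointer PTE at level 0 */
}
```

The model also shows why the relocation trick works. Map a kernel virtual megapage (0xC0000000 is used here purely as an illustrative address) to PA 0x00400000 with R and X, leave the physical PC's own address unmapped, and fetch_translate returns a cause-12 fault for the physical PC but a clean translation for the virtual one.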

What it took

The boot was not just an RTL change. The full diff:

Toolchain. Switched from a bare-metal riscv64-elf-gcc toolchain to the multilib Linux toolchain shipped by Nix (pkgsCross.riscv64.buildPackages.gcc), via a flake at the repo root and direnv integration. The bare-metal toolchain refused to build the kernel because its ld was configured without -shared support, which the kernel’s VDSO needs.

Kernel build. Linux 6.12.85 LTS, configured starting from tinyconfig + arch/riscv/configs/32-bit.config, with custom overlays:

  • RISCV_SBI_V01=y, HVC_RISCV_SBI=y, SERIAL_EARLYCON_RISCV_SBI=y (talk to our firmware).
  • RISCV_ISA_C=n (no compressed instructions on the chip).
  • EFI=n (otherwise it selects RISCV_ISA_C back on).
  • PHYS_RAM_BASE_FIXED=y, PHYS_RAM_BASE=0x00400000.

The kernel image is 2.13 MB.

Memory map. The kernel must be loaded at a 4 MiB-aligned physical address (Sv32 setup_vm() enforces BUG_ON((kernel_map.phys_addr % PMD_SIZE) != 0)). Boot blob:

PA          Contents
0x00000000  stage-0 firmware (1 MiB padded)
0x00100000  DTB (1 MiB padded)
0x00400000  Linux Image (kernel entry point)
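The layout constraints are easy to sanity-check in code. The sketch below uses the three base addresses from the table; the helper names are invented, and the 4 MiB figure is the span of one Sv32 level-1 PTE, which is what the kernel's alignment check amounts to on this configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIB           (1024u * 1024u)
#define STAGE0_BASE   0x00000000u
#define DTB_BASE      0x00100000u
#define KERNEL_BASE   0x00400000u
#define SV32_MEGAPAGE (4u * MIB)   /* span of one Sv32 level-1 PTE */

/* The kernel's load address must sit on a 4 MiB boundary. */
bool kernel_base_ok(uint32_t pa)
{
    return pa % SV32_MEGAPAGE == 0;
}

/* True if `size` bytes placed at `base` end at or before `next_base`. */
bool fits_before(uint32_t base, uint32_t size, uint32_t next_base)
{
    assert(base <= next_base);
    return size <= next_base - base;
}
```

KERNEL_BASE passes the alignment check, whereas an address such as 0x00200000 would not. The two 1 MiB padded regions each fit before the next item in the blob.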

Device tree. A 1.1 KB DTS describing 1 hart, RV32IMA + Sv32 MMU, 16 MiB RAM at PA 0, CLINT-shaped timer at 0x02000000, and bootargs = "earlycon=sbi console=hvc0 …".

Where the chip is spending its cycles

P59 is the rung that makes Linux boot, but the profiling infrastructure (PC sampling, walker stats, milestone capture, benchmark.json emit) wasn’t introduced until P61. So this page doesn’t carry its own profile chart — the equivalent data, plus a TLB hit/miss breakdown, lives on the P61 page. The pattern from P61 onward is: every project rung renders its own state/walker/hot-function SVGs, and a cross-project comparison chart accumulates as we add P62-P64.

What’s next

  • P60: userspace. A statically-linked RV32 hello binary (no libc — direct ecall to SYS_write/SYS_exit), packaged as initramfs cpio, embedded in the kernel as /init. See 60_userspace_hello.
  • P61: TLB. A 4-entry, fully-associative TLB in front of the walker. Profiling P60 showed the walker dominated post-relocation kernel cycles; the TLB cuts that down. See 61_tlb.
  • P62 → P63 → P64: pipelining. 2-stage → 3-stage → classic 5-stage MIPS-style pipeline with forwarding. Each rung will produce its own chart pack (state breakdown, hot functions, walker activity) plus a cross-project comparison showing CPI / cycles-to-milestone progression.

Files

  • src/top.sv — chip RTL with ifetch translation
  • app/main.c — stage-0 firmware (M-mode + SBI v0.1 dispatcher)
  • runtime/start.S, runtime/link.ld — stage-0 entry
  • boot/p58_chip.dts, boot/p58_chip.dtb — device tree
  • test/Makefile — boot-blob assembly + sim runner
  • test/tb_freertos_demo.sv — testbench (preload mode bypasses UART loader for multi-MB images; UART buffer 64 KiB)

Harden

NOT RUN. The added state and walker reuse should have near-zero fmax impact, but that hasn’t been measured. We will quantify it when we do the post-pipelining synthesis pass.

What just happened?

A 32-bit RISC-V Linux kernel runs on a chip that started as a git init six days ago. It has no FPU, no caches, a 1-entry “TLB” that lives for one instruction, and was just taught to translate instruction fetches in about 50 lines of Verilog. And it boots an unmodified mainline kernel image.

That’s the fun thing about RISC-V. The privileged spec is small, the boot protocol is small, the SBI ABI is small. If you implement each piece honestly, the software stack — written by other people, for hundreds of other RISC-V chips — just runs.