P42 is the first rung where real RTOS scheduling happens on our chip. Same RTL as P39 + P40 + P41 - this rung is application + testbench + debugging.
Status: RTL pass. FreeRTOS V11.1.0 unmodified, three tasks + idle, expected UART sequence
S a b c d e f g h D observed at 5,100,726 clocks. Two real bugs found and documented.
The demo
Three application tasks plus FreeRTOS’s idle task:
| task | priority | does |
|---|---|---|
| watcher | 3 | vTaskDelay(200 ticks), then writes 'D' |
| producer | 2 | drops 'a'..'h' into a queue every 5 ticks |
| consumer | 1 | pulls from queue, writes each byte to UART |
| idle | 0 | runs vApplicationIdleHook() (one nop) |
main() writes 'S' before calling vTaskStartScheduler(). Expected
UART output:
S a b c d e f g h D
Observed at 5.1M clocks. PASS.
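The ordering in the expected output follows from the tick arithmetic: eight bytes at one per 5 ticks are all sent by about tick 40, well before the watcher's 200-tick delay expires. A small host-side model can check that. This is a sketch, not the demo source. It assumes the producer's first send lands on tick 5, a queue depth of 8, and that the spaces in the expected output are display separators rather than UART bytes. The names model_uart_sequence and QUEUE_DEPTH are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_DEPTH 8u /* hypothetical; the real queue depth is not stated */

/* Host-side model of the demo's tick schedule (not FreeRTOS code).
 * Writes the modeled UART byte sequence into out as a C string and
 * returns the number of bytes written, excluding the terminator. */
size_t model_uart_sequence(char *out, size_t cap)
{
    char queue[QUEUE_DEPTH];
    size_t head = 0, count = 0, n = 0;
    char next = 'a';

    assert(out != NULL && cap > 0);

    /* main() writes 'S' before the scheduler starts. */
    if (n + 1 < cap) out[n++] = 'S';

    for (unsigned tick = 1; tick <= 200; tick++) {
        /* watcher (priority 3): wakes after 200 ticks and writes 'D'. */
        if (tick == 200 && n + 1 < cap) out[n++] = 'D';

        /* producer (priority 2): one byte every 5 ticks, 'a' through 'h'.
         * Assumption: the first send happens on tick 5. */
        if (tick % 5 == 0 && next <= 'h' && count < QUEUE_DEPTH) {
            queue[(head + count) % QUEUE_DEPTH] = next++;
            count++;
        }

        /* consumer (priority 1): drains whatever is queued to the UART. */
        while (count > 0 && n + 1 < cap) {
            out[n++] = queue[head];
            head = (head + 1) % QUEUE_DEPTH;
            count--;
        }
    }

    out[n] = '\0';
    return n;
}
```

The model ignores context-switch cost and exact wake-up latency; it only shows that the byte order does not depend on those details as long as the producer finishes before tick 200.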
What this validates end-to-end
The result above proves a chain that has been growing across the last five rungs:
| rung | layer added | proven by P42 |
|---|---|---|
| P37 | mscratch + fence.i | (CSR ops in tasks.c) |
| P38 | DUT plugin packaging | (no regression) |
| P39 | Zicntr counters | (used by FreeRTOS run-time stats path) |
| P40 | full trap frame | xPortStartFirstTask + mret-into-task work |
| P40 | irq_save/irq_restore | matches FreeRTOS’s portmacro expectations |
| P41 | FreeRTOS port glue | links + sets up the timer interrupt path |
| P42 | scheduler runs | tasks switch, queues block, ticks fire correctly |
Two real bugs surfaced
P42’s bring-up exposed two non-trivial bugs that the smaller rungs never touched.
Testbench: iverilog hangs on unpacked 256 KiB byte arrays
The first cut of tb_freertos_demo.sv had logic [7:0] mem [0:262143] with always @* blocks reading multi-byte values out of
it. iverilog’s elaborator hung at 99% CPU for 11+ minutes trying to
compute sensitivity lists.
Fix: reuse the P17-style packed-word memory module (logic [31:0] mem [0:WORDS-1]) which iverilog handles without blowup.
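The byte-lane arithmetic behind a packed-word memory can be sketched as a C model. This is not the testbench source, which is SystemVerilog; it only illustrates how byte accesses map onto 32-bit words, assuming little-endian lane order (the RISC-V default). MEM_WORDS and the mem_* helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 65536u /* 256 KiB / 4 bytes per word */

static uint32_t mem[MEM_WORDS]; /* models logic [31:0] mem [0:WORDS-1] */

/* Read one byte: select the word, then shift the wanted lane down. */
uint8_t mem_read_byte(uint32_t addr)
{
    assert(addr < MEM_WORDS * 4u);
    return (uint8_t)((mem[addr >> 2] >> (8u * (addr & 3u))) & 0xffu);
}

/* Write one byte: clear the lane, then merge the new value in. */
void mem_write_byte(uint32_t addr, uint8_t value)
{
    assert(addr < MEM_WORDS * 4u);
    uint32_t shift = 8u * (addr & 3u);
    uint32_t word = mem[addr >> 2] & ~(0xffu << shift);
    mem[addr >> 2] = word | ((uint32_t)value << shift);
}

/* Aligned word access is a single array index, no byte reassembly. */
uint32_t mem_read_word(uint32_t addr)
{
    assert((addr & 3u) == 0 && addr < MEM_WORDS * 4u);
    return mem[addr >> 2];
}

void mem_write_word(uint32_t addr, uint32_t value)
{
    assert((addr & 3u) == 0 && addr < MEM_WORDS * 4u);
    mem[addr >> 2] = value;
}
```

The design point is that a word fetch touches one array element instead of four, which is the access pattern the packed layout gives the simulator.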
gcc's j . epilogue collides with our halt sentinel
This is the interesting one. Three otherwise-correct things combine into a surprise:
- Our DUT halts on 0x0000006f (jal x0, 0). Convention from the directed runtimes - they end with that instruction.
- gcc -O2 emits j . at the end of for (;;) functions where the loop body has no reachable side effects. It’s a safety net for “noreturn”.
- FreeRTOS’s portYIELD() is __asm volatile ("ecall") with no "memory" clobber, so gcc treats memory loads as loop-invariant across taskYIELD().
Together: gcc compiles prvIdleTask so that it loads the ready-list
count once, branches into the yield if the count is > 1, and otherwise
falls out of the loop body into the trailing j . instruction. With our
priority layout the count drops to 1 immediately, so execution falls
through to j ., which our DUT treats as a halt.
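The collision is exact, not approximate: j . is the assembler pseudo-instruction for jal x0, 0, so it produces the very word the DUT watches for. A minimal sketch of the J-type encoding shows this. The helper names encode_jal and is_halt_sentinel are hypothetical, and the sentinel check only models the DUT convention described above.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a RISC-V JAL instruction (J-type).
 * rd is the link register, offset is the byte offset from the PC.
 * Immediate bits are scattered as imm[20|10:1|11|19:12] in bits 31..12. */
uint32_t encode_jal(unsigned rd, int32_t offset)
{
    uint32_t imm = (uint32_t)offset;

    assert(rd < 32);
    assert((imm & 1u) == 0);                            /* 2-byte aligned */
    assert(offset >= -(1 << 20) && offset < (1 << 20)); /* 21-bit signed range */

    return (((imm >> 20) & 0x1u)   << 31) |
           (((imm >> 1)  & 0x3ffu) << 21) |
           (((imm >> 11) & 0x1u)   << 20) |
           (((imm >> 12) & 0xffu)  << 12) |
           ((uint32_t)rd << 7) |
           0x6fu; /* JAL opcode */
}

/* Model of the DUT's halt convention: stop on jal x0, 0. */
int is_halt_sentinel(uint32_t insn)
{
    return insn == 0x0000006fu;
}
```

With rd = x0 and offset 0 every immediate and register field is zero, leaving only the opcode 0x6f. A jump to itself is therefore the shortest encoding a compiler can emit for an infinite loop, which is why it shows up without anyone asking for it.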
The fix in P42 is software-side: configUSE_IDLE_HOOK = 1 plus a
trivial vApplicationIdleHook(void) { __asm__ volatile("nop"); }.
The function call in the for-loop body prevents the collapse-to-j .
pattern. The chip stays running until the watcher’s deliberate halt.
The right long-term fix is a harder halt sentinel in the DUT (e.g.
ebreak + a magic CSR write) so gcc’s j . doesn’t trip it. That
is a future rung; P42 stays software-only and documents the
collision.
Harden result
NOT RUN. RTL is verbatim P39 except for mimpid. Hardening would
produce a result essentially identical to P39’s; the next rung (P43
per the roadmap) is the one that takes this same image
through a real harden and produces a “FreeRTOS on hardware” GDS.
What just happened?
We have a real RTOS demo running on a chip we hardened ourselves.
The infrastructure work in P40 + P41 paid off here - none of the
trap-frame, IRQ, or port adapter pieces needed changes. What did
need fixing turned out to be the boundary between gcc’s optimizer
and our DUT’s halt convention. That’s a useful real-world
discovery: chip halt sentinels need to be encodings the toolchain
doesn’t naturally generate, and jal x0, 0 doesn’t qualify.
The next ladder rung is P43: harden the same image and produce GDS. After that, FreeRTOS on this chip stops being a simulation claim and becomes a hardware claim.