Project No. 117 of 147 on the ladder

Speculative target buffer

introduces: guarded one-entry speculative target buffer; userspace fault result; promote/discard counters

harden state: last run 2026-05-06
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P117 adds the one-entry speculative target-buffer state that P116 said was missing, then deliberately leaves speculative issue guarded off in the passing RTL. The live buffer-on run was informative but not correct: Linux reached /init, then BusyBox faulted on a store to badaddr=0x00000000.
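
The guard, fill, promote and discard behaviour described above can be illustrated with a small behavioural model. This is a sketch only, not the P117 RTL: the class name, field names and method names are hypothetical, and the assumption that a held entry is promoted on a target match and discarded otherwise is an illustration of what the promote/discard counters could mean.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpecTargetBuffer:
    """Hypothetical model of a guarded one-entry speculative target buffer."""
    issue_enabled: bool = False              # guard; False mirrors the passing run
    pending_pc: Optional[int] = None         # speculative request in flight
    entry: Optional[Tuple[int, int]] = None  # (target_pc, data), not yet architectural
    candidates: int = 0
    issues: int = 0
    fills: int = 0
    promotes: int = 0
    discards: int = 0

    def candidate(self, target_pc: int) -> bool:
        """Count a steering candidate; issue only if unguarded and the buffer is free."""
        self.candidates += 1
        busy = self.pending_pc is not None or self.entry is not None
        if not self.issue_enabled or busy:
            return False
        self.pending_pc = target_pc
        self.issues += 1
        return True

    def fill(self, data: int) -> None:
        """Park returned data in the buffer only, never in architectural state."""
        if self.pending_pc is None:
            raise RuntimeError("fill without an outstanding speculative issue")
        self.entry = (self.pending_pc, data)
        self.pending_pc = None
        self.fills += 1

    def resolve(self, actual_pc: int) -> Optional[int]:
        """Promote the entry if the real target matches, otherwise discard it."""
        if self.entry is None:
            return None
        pc, data = self.entry
        self.entry = None
        if pc == actual_pc:
            self.promotes += 1
            return data
        self.discards += 1
        return None


# With the guard off, candidates are still counted but nothing is issued.
guarded = SpecTargetBuffer()
guarded.candidate(0x1000)
print(guarded.candidates, guarded.issues, guarded.fills)  # prints: 1 0 0
```

In this model a guarded run can only ever report zero issues, fills, promotes and discards, which is the pattern the passing P117 counters show.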

check                                                                result
Verilator build                                                      PASS
BusyBox shell workload reaches P117-FILE-OK with issue guarded off   PASS
Live speculative target-buffer issue                                 FAIL
Buffer counters visible in JSON                                      PASS
Hardened layout                                                      NOT RUN

counter                         value
userspace steering candidates   215,013
steering issues                 0
buffer fills                    0
buffer promotes                 0
buffer discards                 0
blocked unaligned               119,771
blocked TLB/permission          62
FTQ fills                       54,059,898
FTQ consumes                    54,059,898
FTQ flushes                     0
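
The counter values above lend themselves to a quick sanity check for a guarded run. The sketch below is illustrative: the dictionary keys are hypothetical (the real JSON field names are not given here), and the two rules it applies, that every issue-side counter is zero while the guard is off and that FTQ fills equal FTQ consumes when there are no flushes, are assumptions drawn from the values reported in this run.

```python
# Values copied from the counter table; key names are hypothetical.
counters = {
    "userspace_steering_candidates": 215_013,
    "steering_issues": 0,
    "buffer_fills": 0,
    "buffer_promotes": 0,
    "buffer_discards": 0,
    "blocked_unaligned": 119_771,
    "blocked_tlb_permission": 62,
    "ftq_fills": 54_059_898,
    "ftq_consumes": 54_059_898,
    "ftq_flushes": 0,
}


def check_guarded_run(c):
    """Return a list of problems; an empty list means the run looks guarded."""
    problems = []
    # Assumption: with issue guarded off, nothing may be issued, filled,
    # promoted or discarded.
    for key in ("steering_issues", "buffer_fills",
                "buffer_promotes", "buffer_discards"):
        if c[key] != 0:
            problems.append(f"{key} is {c[key]}, expected 0 with issue guarded off")
    # Assumption: with zero flushes, every FTQ fill is eventually consumed.
    if c["ftq_flushes"] == 0 and c["ftq_fills"] != c["ftq_consumes"]:
        problems.append("FTQ fills and consumes differ although there were no flushes")
    return problems


print(check_guarded_run(counters))  # prints: []
```

A live buffer-on run would be expected to fail the first rule by design, so a check like this only makes sense for the guarded configuration.
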

metric                P116         P117
shell window cycles   65,806,663   64,329,783
S_FETCH cycles        7,643,195    7,615,856
S_MEM cycles          27,889,071   27,670,716

This is not a speedup claim for speculation. The P117 passing result is guarded, so the small shell-window improvement reflects normal run-to-run variation and initramfs-label movement, not evidence that the target buffer helped. The real result is the bug boundary: speculative target data cannot use the current prefetch/I-cache fill side effects until promotion and discard are isolated from architectural instruction delivery.
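
To put a number on how small the P116 to P117 movement is, the deltas can be computed directly from the three reported metrics. This is plain arithmetic on the published figures; the helper name `relative_change` is only illustrative.

```python
# (P116, P117) cycle counts copied from the metric table.
rows = {
    "shell window cycles": (65_806_663, 64_329_783),
    "S_FETCH cycles": (7_643_195, 7_615_856),
    "S_MEM cycles": (27_889_071, 27_670_716),
}


def relative_change(before, after):
    """Percentage change from before to after; negative means fewer cycles."""
    return (after - before) / before * 100.0


for name, (p116, p117) in rows.items():
    delta = p117 - p116
    print(f"{name}: {delta:+,} cycles ({relative_change(p116, p117):+.2f}%)")
# shell window cycles: -1,476,880 cycles (-2.24%)
# S_FETCH cycles: -27,339 cycles (-0.36%)
# S_MEM cycles: -218,355 cycles (-0.78%)
```

The shell window shrinks by about 2.2%, and the fetch and memory states by well under 1%, all with zero steering issues recorded.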

shell phases (label: P117 shell workload; cycles: 218,191,245; CPI: 2.53)
  1. kernel banner to /init: 116,717,853 cycles (53.7%)
  2. /init to shell banner: 1,070,638 cycles (0.5%)
  3. shell banner to first command: 35,444,904 cycles (16.3%)
  4. echo command: 1,649 cycles (0%)
  5. uname -a: 2,390,063 cycles (1.1%)
  6. ls /bin /usr/share: 31,829,610 cycles (14.6%)
  7. cat sample file: 2,664,614 cycles (1.2%)
  8. touch/write/cat/rm /tmp file: 11,463,819 cycles (5.3%)
  9. 8x ash loop with file I/O: 15,979,348 cycles (7.3%)
  10. final marker: 680 cycles (0%)
state breakdown (label: P117 guarded target-buffer workload; cycles: 218,191,245; CPI: 2.53)
  1. fetch: 7,615,856 cycles (3.5%)
  2. execute: 86,186,071 cycles (39.5%)
  3. mem: 27,949,094 cycles (12.8%)
  4. walker: 2,675,010 cycles (1.2%)
  5. writeback: 86,161,477 cycles (39.5%)
  6. mul/div: 7,602,021 cycles (3.5%)
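
The state-breakdown percentages can be reproduced from the cycle counts and the workload total. The sketch below is arithmetic on the figures listed above; the variable names are illustrative, and reading the leftover as cycles not attributed to any listed state is an assumption.

```python
# Cycle counts per state, copied from the state breakdown.
states = {
    "fetch": 7_615_856,
    "execute": 86_186_071,
    "mem": 27_949_094,
    "walker": 2_675_010,
    "writeback": 86_161_477,
    "mul/div": 7_602_021,
}
total_cycles = 218_191_245  # workload total from the breakdown header

# Share of the workload total spent in each state, to one decimal place.
shares = {name: round(100.0 * cycles / total_cycles, 1)
          for name, cycles in states.items()}

# Cycles in the header total that no listed state accounts for.
unattributed = total_cycles - sum(states.values())

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"unattributed cycles: {unattributed:,}")
```

The six listed states account for all but 1,716 of the 218,191,245 workload cycles.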