Project No. 143 of 147 on the ladder

Prefetch consumer repair classifier

introduces: per-class I-cache repair attribution; prefetch consumer usefulness audit; P144 execute-prefetch throttle target

harden state
last run: 2026-05-06
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P143 keeps P142’s repair policy and adds attribution. Every background-repaired I-cache word now records which foreground fill class created it, and the harness reports, per class, the first and repeat hits those words receive from later fetches.
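The attribution idea can be modelled in a few lines of Python. This is a minimal sketch, not the project's RTL or harness code: the class name `RepairAttribution`, its method names, and the rule that every hit after the first counts as a repeat hit are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class FillClass(Enum):
    # Class names follow the rows of the P143 report.
    DEMAND_FETCH = "demand fetch"
    EXECUTE_PREFETCH = "execute prefetch"
    LOAD_PREFETCH = "load prefetch"
    WRITEBACK_PREFETCH = "writeback prefetch"
    STEER_PREFETCH = "steer prefetch"
    AUX_PREFETCH = "aux prefetch"
    UNKNOWN = "unknown"


class RepairAttribution:
    """Hypothetical software model of per-class repair attribution."""

    def __init__(self):
        self.owner = {}  # word address -> [creating class, hits so far]
        self.fills = Counter()
        self.first_hits = Counter()
        self.repeat_hits = Counter()

    def on_repair_fill(self, addr, cls):
        # A background repair wrote this word; remember which class caused it.
        self.owner[addr] = [cls, 0]
        self.fills[cls] += 1

    def on_fetch_hit(self, addr):
        entry = self.owner.get(addr)
        if entry is None:
            return  # the word was not created by a repair
        cls, hits = entry
        if hits == 0:
            self.first_hits[cls] += 1
        else:
            # Assumption: every hit after the first is a repeat hit.
            self.repeat_hits[cls] += 1
        entry[1] = hits + 1

    def on_evict(self, addr):
        self.owner.pop(addr, None)


audit = RepairAttribution()
audit.on_repair_fill(0x1000, FillClass.EXECUTE_PREFETCH)
audit.on_fetch_hit(0x1000)  # first hit
audit.on_fetch_hit(0x1000)  # repeat hit
audit.on_fetch_hit(0x2000)  # ignored: not a repaired word
```

Keeping the creating class next to each repaired word is what lets the totals be split by class afterwards; a hardware version would presumably need a few tag bits per word for the same purpose.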

check                                          result
Verilator build                                PASS
BusyBox shell workload reaches P143-FILE-OK    PASS
Classifier totals match global counters        PASS
Aux-load queue full drops                      PASS
Aux response errors/cancels                    PASS
Hardened layout                                NOT RUN

metric                  P142           P143
post-load cycles        218,856,863    219,787,362
shell window cycles     63,926,691     64,756,820
retired instructions    85,799,045     86,097,554
CPI                     2.5508         2.5528
S_FETCH cycles          7,598,133      7,607,357
S_MEM cycles            29,199,830     29,346,505

P143 is an audit PASS, not a speed PASS. The shell window is 830,129 cycles (about 1.3%) slower than P142, but the new counters answer the policy question: which fill classes' repair words are actually fetched later.
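The slowdown and CPI figures can be rechecked from the reported counters. The sketch below assumes that the CPI row is post-load cycles divided by retired instructions, which reproduces the published values; the variable names are illustrative only.

```python
# Counters copied from the P142/P143 metric table.
P142 = {"post_load": 218_856_863, "shell": 63_926_691, "retired": 85_799_045}
P143 = {"post_load": 219_787_362, "shell": 64_756_820, "retired": 86_097_554}

# Shell-window regression in cycles and as a percentage of P142.
shell_delta = P143["shell"] - P142["shell"]
shell_slowdown_pct = 100 * shell_delta / P142["shell"]


def cpi(run):
    # Assumption: CPI = post-load cycles / retired instructions.
    return run["post_load"] / run["retired"]


print(shell_delta)                   # 830129
print(round(shell_slowdown_pct, 2))  # 1.3
print(round(cpi(P142), 4), round(cpi(P143), 4))  # 2.5508 2.5528
```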

class                 repair fills    first hits    repeat hits    first ratio    first+repeat ratio
demand fetch          913,159         428,227       279,295        46.90%         77.48%
execute prefetch      39,099,312      811,814       293,944        2.08%          2.83%
load prefetch         2,481,880       131,692       13,985         5.31%          5.87%
writeback prefetch    390,678         166,954       52,189         42.73%         56.09%
steer prefetch        0               0             0              0.00%          0.00%
aux prefetch          109,323         36,401        24,161         33.30%         55.40%
unknown               0               0             0              0.00%          0.00%
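The two ratio columns follow directly from the raw counts: first ratio is first hits over repair fills, and first+repeat ratio adds the repeat hits to the numerator. A short sketch that recomputes them and the total fill count (the dictionary and function names are illustrative):

```python
# (repair fills, first hits, repeat hits) per class, from the P143 table.
ROWS = {
    "demand fetch": (913_159, 428_227, 279_295),
    "execute prefetch": (39_099_312, 811_814, 293_944),
    "load prefetch": (2_481_880, 131_692, 13_985),
    "writeback prefetch": (390_678, 166_954, 52_189),
    "steer prefetch": (0, 0, 0),
    "aux prefetch": (109_323, 36_401, 24_161),
    "unknown": (0, 0, 0),
}


def ratios(fills, first, repeat):
    """Return (first ratio, first+repeat ratio) in percent."""
    if fills == 0:
        return 0.0, 0.0  # classes with no fills report 0.00%
    return 100 * first / fills, 100 * (first + repeat) / fills


total_fills = sum(fills for fills, _, _ in ROWS.values())
execute_share = 100 * ROWS["execute prefetch"][0] / total_fills

for name, row in ROWS.items():
    first_pct, both_pct = ratios(*row)
    print(f"{name:20s} {first_pct:6.2f}% {both_pct:6.2f}%")
print(total_fills, f"{execute_share:.1f}%")  # 42994352 90.9%
```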

The answer is blunt: execute-prefetch repair creates 39.10M of 42.99M repair fills (about 91%) and has the weakest payback, with a first+repeat ratio of only 2.83%. Demand, writeback-prefetch, and aux-prefetch repairs have high ratios, between 55% and 77%; load-prefetch repair is far smaller in volume (2.48M fills) and only modestly useful at 5.87%.

P144 should stop giving execute-prefetch fills a second repair word by default, while keeping the higher-payback classes.
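As a rough illustration of what that default would mean on the P143 workload, the sketch below applies a hypothetical per-class policy to the P143 counters. The set `REPAIR_BY_DEFAULT`, the function names, and the treatment of the steer and unknown classes (left out here because they recorded no fills) are assumptions, and the estimate is static: it ignores any second-order effect that fewer repair fills would have on cache contents.

```python
# (repair fills, first hits, repeat hits) per class, from the P143 table.
P143_COUNTERS = {
    "demand fetch": (913_159, 428_227, 279_295),
    "execute prefetch": (39_099_312, 811_814, 293_944),
    "load prefetch": (2_481_880, 131_692, 13_985),
    "writeback prefetch": (390_678, 166_954, 52_189),
    "steer prefetch": (0, 0, 0),
    "aux prefetch": (109_323, 36_401, 24_161),
    "unknown": (0, 0, 0),
}

# Hypothetical P144 default: every class with recorded payback except
# execute prefetch keeps its repair word.
REPAIR_BY_DEFAULT = {
    "demand fetch",
    "load prefetch",
    "writeback prefetch",
    "aux prefetch",
}


def should_repair(fill_class):
    return fill_class in REPAIR_BY_DEFAULT


def static_estimate(counters):
    """Return (fills kept, fills dropped, hits forfeited) under the policy."""
    kept = dropped = forfeited = 0
    for name, (fills, first, repeat) in counters.items():
        if should_repair(name):
            kept += fills
        else:
            dropped += fills
            forfeited += first + repeat
    return kept, dropped, forfeited


kept, dropped, forfeited = static_estimate(P143_COUNTERS)
print(kept, dropped, forfeited)  # 3895040 39099312 1105758
```

On these counters the policy would drop roughly 39.1M repair fills at the cost of at most about 1.1M later hits, which is the trade the P143 audit points to.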

shell phases (P143 shell workload; 219,787,362 cycles; CPI 2.55)
  1. kernel banner to /init: 117,353,435 cycles (53.6%)
  2. /init to shell banner: 1,091,269 cycles (0.5%)
  3. shell banner to first command: 35,956,986 cycles (16.4%)
  4. echo command: 1,649 cycles (0%)
  5. uname -a: 2,609,410 cycles (1.2%)
  6. ls /bin /usr/share: 31,580,524 cycles (14.4%)
  7. cat sample file: 3,054,146 cycles (1.4%)
  8. touch/write/cat/rm /tmp file: 10,989,474 cycles (5%)
  9. 8x ash loop with file I/O: 16,520,937 cycles (7.5%)
  10. final marker: 680 cycles (0%)
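The phase list also shows where the shell window figure comes from: summing phases 4 through 10 (the first command through the final marker) gives exactly the reported 64,756,820 shell window cycles. That the window is defined this way is an inference from the arithmetic, not something the report states; the sketch below only checks the sum.

```python
# Cycle counts per phase, copied from the P143 shell phases list.
PHASES = [
    ("kernel banner to /init", 117_353_435),
    ("/init to shell banner", 1_091_269),
    ("shell banner to first command", 35_956_986),
    ("echo command", 1_649),
    ("uname -a", 2_609_410),
    ("ls /bin /usr/share", 31_580_524),
    ("cat sample file", 3_054_146),
    ("touch/write/cat/rm /tmp file", 10_989_474),
    ("8x ash loop with file I/O", 16_520_937),
    ("final marker", 680),
]

# Phases 4..10 are list indices 3..9.
shell_window = sum(cycles for _, cycles in PHASES[3:])
all_phases = sum(cycles for _, cycles in PHASES)

print(shell_window)  # 64756820
print(all_phases)    # 219158510
```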
state breakdown (P143 prefetch consumer repair classifier workload; 219,787,362 cycles; CPI 2.55)
  1. fetch: 7,607,357 cycles (3.5%)
  2. execute: 86,122,222 cycles (39.2%)
  3. mem: 29,625,509 cycles (13.5%)
  4. walker: 2,681,536 cycles (1.2%)
  5. writeback: 86,097,554 cycles (39.2%)
  6. mul/div: 7,651,468 cycles (3.5%)
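The state shares can be recomputed against the 219,787,362-cycle total, which also shows how many cycles the six listed states leave unaccounted for. A minimal sketch with illustrative names:

```python
TOTAL_CYCLES = 219_787_362

# Cycles per pipeline state, copied from the P143 state breakdown.
STATES = {
    "fetch": 7_607_357,
    "execute": 86_122_222,
    "mem": 29_625_509,
    "walker": 2_681_536,
    "writeback": 86_097_554,
    "mul/div": 7_651_468,
}

shares = {name: round(100 * c / TOTAL_CYCLES, 1) for name, c in STATES.items()}
accounted = sum(STATES.values())
residual = TOTAL_CYCLES - accounted  # cycles not in any listed state

print(shares)
print(accounted, residual)  # 219785646 1716
```

The residual of 1,716 cycles is small enough that the six states cover essentially the whole run.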