Project No. 140 of 147 on the ladder

Repair-aware I-cache arbitration

introduces: one-word I-cache background repair budget; repair policy stop counters; repair bandwidth tuning

harden state
last run: 2026-05-06
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P140 turns P139’s audit into a policy experiment. Every foreground I-cache fill may still start a background repair, but each repair descriptor now gets a budget of one adjacent word and stops once that budget is spent.
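The budget rule can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the project's RTL: the line geometry (`WORDS_PER_LINE`), the wrap-around to the next word, and the rule that an already-valid word ends the repair without spending budget are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

WORDS_PER_LINE = 8  # hypothetical line geometry, not taken from the project


@dataclass
class RepairDescriptor:
    """Toy model of a background repair descriptor with a word budget."""
    line_valid: list   # per-word valid bits of the target line (mutated in place)
    next_word: int     # word index the repair will try next
    budget: int = 1    # P140-style policy: one adjacent word, then stop
    fills: int = 0
    skips: int = 0
    stopped_on_budget: bool = False

    def step(self) -> bool:
        """Make one repair attempt. Returns False once the descriptor is done."""
        if self.budget == 0:
            self.stopped_on_budget = True
            return False
        word = self.next_word % WORDS_PER_LINE
        if self.line_valid[word]:
            # Assumption: an already-valid word ends the repair without a fill.
            self.skips += 1
            return False
        self.line_valid[word] = True
        self.fills += 1
        self.budget -= 1
        self.next_word = word + 1
        return True


def foreground_fill(line_valid, word, budget=1):
    """Fill the demanded word, then run background repair until it stops."""
    line_valid[word] = True
    desc = RepairDescriptor(line_valid, next_word=word + 1, budget=budget)
    while desc.step():
        pass
    return desc


line = [False] * WORDS_PER_LINE
desc = foreground_fill(line, word=3)
print(desc.fills, desc.stopped_on_budget, sum(line))  # 1 True 2
```

With a budget of one, a miss on word 3 leaves exactly two valid words in the line: the demanded word and its neighbor. Raising `budget` in this model shows how quickly a larger budget fills the rest of the line.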

| check | result |
| --- | --- |
| Verilator build | PASS |
| BusyBox shell workload reaches P140-FILE-OK | PASS |
| Aux-load queue full drops | PASS |
| Aux response errors/cancels | PASS |
| Hardened layout | NOT RUN |

| metric | P134 | P139 | P140 |
| --- | --- | --- | --- |
| post-load cycles | 218,247,567 | 220,650,157 | 220,083,644 |
| shell window cycles | 64,221,642 | 65,708,764 | 65,035,481 |
| retired instructions | 86,139,760 | 86,400,649 | 86,226,717 |
| CPI | 2.5336 | 2.5538 | 2.5524 |
| S_FETCH cycles | 7,617,695 | 7,633,229 | 7,623,290 |
| S_MEM cycles | 27,799,343 | 29,450,021 | 29,353,267 |

P140 is a previous-rung speed PASS: its shell window is 673,283 cycles shorter than P139's, and its post-load cycle count is 566,513 lower. It is still a speed FAIL versus P134 and P138.
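The pass/fail verdict follows from the cycle counts in the metric table. The sketch below reproduces that arithmetic; the `delta` helper is a hypothetical name introduced here, and P138 is left out because its counts are not listed on this page.

```python
# Cycle counts copied from the metric table.
post_load = {"P134": 218_247_567, "P139": 220_650_157, "P140": 220_083_644}
shell_window = {"P134": 64_221_642, "P139": 65_708_764, "P140": 65_035_481}


def delta(metric, newer, older):
    """Cycles saved by `newer` relative to `older` (negative means slower)."""
    return metric[older] - metric[newer]


print(delta(shell_window, "P140", "P139"))  # 673283
print(delta(post_load, "P140", "P139"))     # 566513
print(delta(shell_window, "P140", "P134"))  # -813839
print(delta(post_load, "P140", "P134"))     # -1836077
```

Both metrics agree on direction: P140 beats P139 and trails P134.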

| repair-usefulness counter | P139 | P140 |
| --- | --- | --- |
| background repair word fills | 52,847,191 | 32,047,006 |
| first later fetch hits | 1,926,219 | 1,319,114 |
| repeat later fetch hits | 829,627 | 552,003 |
| first-hit usefulness ratio | 3.64% | 4.12% |
| first + repeat fetch-hit ratio | 5.22% | 5.84% |

| P140 policy bucket | count |
| --- | --- |
| repair starts | 58,754,125 |
| budget stops | 32,047,006 |
| already-valid word skips | 149,554 |
| budget at end | 0 |

The one-word budget cuts repair fills by 20.80M, about 39.4%, and the usefulness ratio rises. That is the good part.
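The fill reduction and the first-hit usefulness ratios can be rechecked from the counters above. This is a small sketch; `usefulness` is a hypothetical helper that assumes the ratio is simply first later fetch hits divided by background repair word fills.

```python
# Counters copied from the repair-usefulness table.
fills = {"P139": 52_847_191, "P140": 32_047_006}
first_hits = {"P139": 1_926_219, "P140": 1_319_114}


def usefulness(rung):
    """First-hit usefulness in percent: first later fetch hits per repair fill."""
    return round(100 * first_hits[rung] / fills[rung], 2)


saved = fills["P139"] - fills["P140"]
print(saved)                                      # 20800185
print(round(100 * saved / fills["P139"], 1))      # 39.4
print(usefulness("P139"), usefulness("P140"))     # 3.64 4.12
print(first_hits["P139"] - first_hits["P140"])    # 607105
```

The last line is worth noting: the ratio improves, but the absolute number of first hits produced by repair drops by about 0.61M.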

The bad part is absolute instruction locality. I-cache hits fall from 43.0M to 29.1M and miss refills rise from 43.2M to 48.8M. P140 stops wasting as much repair bandwidth, but it also starves useful line fill.

The next rung should keep the value-aware direction but loosen the fixed budget: grant a second repair word only when the line has evidence of fetch reuse or the frontend is instruction-starved.
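One possible shape for that looser budget is sketched below. Everything in it is hypothetical: the inputs `line_fetch_hits` and `frontend_starved`, the `reuse_threshold` parameter, and the function name are illustrative stand-ins, not signals or counters from the design.

```python
def repair_budget(line_fetch_hits: int, frontend_starved: bool,
                  reuse_threshold: int = 1) -> int:
    """Return how many repair words a new descriptor may fill.

    Hypothetical policy: keep the one-word base budget, and grant a second
    word only when the line has shown fetch reuse or the frontend is
    waiting on instructions.
    """
    base = 1
    has_reuse = line_fetch_hits >= reuse_threshold
    if has_reuse or frontend_starved:
        return base + 1
    return base


print(repair_budget(0, False))  # 1
print(repair_budget(2, False))  # 2
print(repair_budget(0, True))   # 2
```

How reuse evidence would be tracked per line, and what counts as instruction-starved, are left open here; those choices would decide whether the extra word recovers the lost I-cache hits.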

shell phases (P140 shell workload, 220,083,644 cycles, CPI 2.55)
  1. kernel banner to /init: 117,336,084 cycles (53.5%)
  2. /init to shell banner: 1,096,714 cycles (0.5%)
  3. shell banner to first command: 35,986,532 cycles (16.4%)
  4. echo command: 1,649 cycles (0%)
  5. uname -a: 2,448,396 cycles (1.1%)
  6. ls /bin /usr/share: 31,867,764 cycles (14.5%)
  7. cat sample file: 2,881,587 cycles (1.3%)
  8. touch/write/cat/rm /tmp file: 11,472,669 cycles (5.2%)
  9. 8x ash loop with file I/O: 16,362,736 cycles (7.5%)
  10. final marker: 680 cycles (0%)

state breakdown (P140 repair-aware I-cache arbitration workload, 220,083,644 cycles, CPI 2.55)
  1. fetch: 7,623,290 cycles (3.5%)
  2. execute: 86,251,519 cycles (39.2%)
  3. mem: 29,632,483 cycles (13.5%)
  4. walker: 2,683,038 cycles (1.2%)
  5. writeback: 86,226,717 cycles (39.2%)
  6. mul/div: 7,664,881 cycles (3.5%)
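The state percentages can be reproduced from the per-state cycle counts. The sketch below assumes each share is the state's cycles divided by the 220,083,644 workload cycles; `state_shares` is a hypothetical helper name.

```python
# Per-state cycle counts copied from the state breakdown.
TOTAL_CYCLES = 220_083_644
states = {
    "fetch": 7_623_290,
    "execute": 86_251_519,
    "mem": 29_632_483,
    "walker": 2_683_038,
    "writeback": 86_226_717,
    "mul/div": 7_664_881,
}


def state_shares(cycles_by_state, total):
    """Percent of total cycles spent in each state, rounded to one decimal."""
    return {name: round(100 * c / total, 1) for name, c in cycles_by_state.items()}


for name, share in state_shares(states, TOTAL_CYCLES).items():
    print(f"{name:10s} {share:5.1f}%")

# Cycles not attributed to any listed state.
print(TOTAL_CYCLES - sum(states.values()))  # 1716
```

The six listed states account for all but 1,716 of the workload cycles.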