Project No. 137 of 147 on the ladder

Bounded memory arbitration

introduces: one-preempt I-cache arbitration bound; aux-load preemption deferral counter; tiny shell-window recovery from P136

harden state, last run: 2026-05-06
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P137 keeps P136’s I-cache-background preemption mechanism, but bounds it. After one otherwise-safe aux-load issue preempts I-cache background fill, the next otherwise-safe I-cache-only candidate is deferred so instruction-line repair gets a chance to compete again.
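
The bound described above can be pictured as a one-bit state machine. The following is a minimal behavioral sketch in Python, not the RTL: the names `OnePreemptBound`, `owed`, and `Decision` are hypothetical, and the rule that the bit also clears when I-cache background fill is idle is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ISSUE = "issue"                  # both backgrounds quiet: issue normally
    ISSUE_PREEMPT = "issue_preempt"  # issue and preempt I-cache background fill
    DEFER = "defer"                  # held back by the one-preempt bound
    BLOCKED = "blocked"              # D-cache background busy (alone or with I-cache)


@dataclass
class OnePreemptBound:
    # True after a preemption, until I-cache background fill gets its turn.
    owed: bool = False

    def decide(self, icache_bg_busy: bool, dcache_bg_busy: bool) -> Decision:
        """Classify one otherwise-safe aux-load candidate."""
        if dcache_bg_busy:
            return Decision.BLOCKED
        if not icache_bg_busy:
            # Assumption: nothing is owed once the I-cache fill is idle.
            self.owed = False
            return Decision.ISSUE
        if self.owed:
            # The deferral is what lets instruction-line repair compete again.
            self.owed = False
            return Decision.DEFER
        self.owed = True
        return Decision.ISSUE_PREEMPT
```

With I-cache background fill continuously busy, this sketch alternates between preempting and deferring; any other reset condition would change that ratio.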

check                                          result
Verilator build                                PASS
BusyBox shell workload reaches P137-FILE-OK    PASS
Aux-load queue full drops                      PASS
Aux response errors/cancels                    PASS
Hardened layout                                NOT RUN

metric                  P135           P136           P137
post-load cycles        219,445,401    222,462,201    220,276,307
shell window cycles     65,411,939     67,392,718     65,356,841
retired instructions    86,534,503     86,961,841     86,275,649
CPI                     2.5359         2.5582         2.5532
S_FETCH cycles          7,634,641      7,646,785      7,627,294
S_MEM cycles            27,975,703     29,771,841     29,389,375

This passes the previous-rung shell-window check, but barely: P137’s shell window is 2,035,877 cycles shorter than P136’s and 55,098 cycles shorter than P135’s. It is still slower than P134.
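
The deltas quoted above follow from the shell window row of the metric table, and the CPI row follows from post-load cycles divided by retired instructions. A small Python check of that arithmetic, using only the figures reported on this page (the dictionary and function names are illustrative):

```python
# Figures copied from the metric table for P135, P136, and P137.
shell_window = {"P135": 65_411_939, "P136": 67_392_718, "P137": 65_356_841}
post_load = {"P135": 219_445_401, "P136": 222_462_201, "P137": 220_276_307}
retired = {"P135": 86_534_503, "P136": 86_961_841, "P137": 86_275_649}


def shell_cycles_saved(baseline: str, candidate: str) -> int:
    """Shell-window cycles the candidate rung saves relative to the baseline."""
    return shell_window[baseline] - shell_window[candidate]


def cpi(rung: str) -> float:
    """Cycles per instruction over the post-load region."""
    return post_load[rung] / retired[rung]


if __name__ == "__main__":
    for base in ("P136", "P135"):
        print(f"P137 vs {base}: {shell_cycles_saved(base, 'P137'):,} cycles saved")
    for rung in ("P135", "P136", "P137"):
        print(f"{rung} CPI: {cpi(rung):.4f}")
```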

aux-load counter    P135         P136         P137
candidates          5,418,028    5,182,600    5,126,386
issues              140,540      1,773,934    1,674,017
queue enqueues      140,540      1,773,934    1,674,017
queue dequeues      140,540      1,773,934    1,674,017
queue full drops    0            0            0
aux load fills      140,540      1,773,934    1,674,017

P137 policy bucket                             count
frontend prefetch not safe/useful              1,584,002
frontend-ready candidates                      3,542,384
background quiet candidates / issued           171,158
issued while preempting I-cache background     1,502,859
deferred by I-cache preemption bound           72,887
blocked by D-cache background only             354,195
blocked by both backgrounds                    1,441,285
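
The P137 bucket counts are internally consistent with the aux-load counters. The Python check below states three accounting identities that the reported numbers satisfy; the identities are inferred from the bucket names and the totals, not taken from the RTL, and the variable names are illustrative.

```python
# P137 counts copied from the two tables above.
candidates = 5_126_386
issues = 1_674_017

not_safe_or_useful = 1_584_002
frontend_ready = 3_542_384
issued_background_quiet = 171_158
issued_preempting_icache = 1_502_859
deferred_by_bound = 72_887
blocked_dcache_only = 354_195
blocked_both = 1_441_285

# Every candidate is either rejected up front or frontend-ready.
assert not_safe_or_useful + frontend_ready == candidates

# Every issue happened either with backgrounds quiet or by preempting I-cache fill.
assert issued_background_quiet + issued_preempting_icache == issues

# Every frontend-ready candidate was issued, deferred, or blocked.
assert issues + deferred_by_bound + blocked_dcache_only + blocked_both == frontend_ready

# Share of I-cache-only candidates that the bound actually deferred.
deferral_share = deferred_by_bound / (deferred_by_bound + issued_preempting_icache)
print(f"deferred share of I-cache-only candidates: {deferral_share:.1%}")
```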

The bound is real but weak. It defers 72,887 candidates and still lets 1,502,859 (about 1.50M) I-cache-background preemptions through, so only about 4.6% of the candidates that contend with I-cache background alone are deferred. The right next move is a debt or age arbiter: preemptions should accumulate debt, I-cache background service should pay it down, and aux-load issue should stop while instruction repair is too far behind.
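
One way to picture the proposed debt arbiter is the Python sketch below. It is a behavioral illustration only: `DebtArbiter`, `debt_limit`, and the one-unit charge and repayment are hypothetical choices, and nothing here has been implemented or measured on this ladder.

```python
from dataclasses import dataclass


@dataclass
class DebtArbiter:
    debt_limit: int = 4  # hypothetical threshold for "too far behind"
    debt: int = 0        # preemptions not yet repaid by I-cache background service

    def on_icache_bg_service(self) -> None:
        """I-cache background fill made progress: pay down one unit of debt."""
        if self.debt > 0:
            self.debt -= 1

    def try_issue_aux_load(self, icache_bg_busy: bool) -> bool:
        """Return True if an otherwise-safe aux load may issue this cycle."""
        if not icache_bg_busy:
            return True   # nothing is preempted, so no debt is charged
        if self.debt >= self.debt_limit:
            return False  # instruction repair is too far behind: stop issuing
        self.debt += 1    # each preemption accumulates debt
        return True
```

Unlike the one-preempt bound, the limit here is tunable, so the arbiter can tolerate short bursts of preemption while still guaranteeing that I-cache background fill is serviced before aux-load issue resumes.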

shell phases: P137 shell workload, 220,276,307 cycles, CPI 2.55
  1. kernel banner to /init: 117,320,222 cycles (53.4%)
  2. /init to shell banner: 1,095,830 cycles (0.5%)
  3. shell banner to first command: 35,874,580 cycles (16.3%)
  4. echo command: 1,649 cycles (0%)
  5. uname -a: 2,251,245 cycles (1%)
  6. ls /bin /usr/share: 32,015,316 cycles (14.6%)
  7. cat sample file: 2,913,207 cycles (1.3%)
  8. touch/write/cat/rm /tmp file: 11,917,408 cycles (5.4%)
  9. 8x ash loop with file I/O: 16,257,336 cycles (7.4%)
  10. final marker: 680 cycles (0%)
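
As a consistency check on the phase list, phases 4 through 10 sum exactly to the P137 shell window cycle count reported in the metric table, and the percentages match a denominator equal to the sum of the ten listed phases. The Python snippet below reproduces that arithmetic from the numbers on this page; reading the shell window as "phases 4 to 10" is an inference from the totals, not something the page states.

```python
# Phase cycle counts in list order (phase 1 is index 0).
phases = [
    117_320_222,  # 1. kernel banner to /init
    1_095_830,    # 2. /init to shell banner
    35_874_580,   # 3. shell banner to first command
    1_649,        # 4. echo command
    2_251_245,    # 5. uname -a
    32_015_316,   # 6. ls /bin /usr/share
    2_913_207,    # 7. cat sample file
    11_917_408,   # 8. touch/write/cat/rm /tmp file
    16_257_336,   # 9. 8x ash loop with file I/O
    680,          # 10. final marker
]

phase_total = sum(phases)            # denominator that matches the listed percentages
shell_window_sum = sum(phases[3:])   # phases 4 through 10

print(f"sum of all phases:   {phase_total:,}")
print(f"sum of phases 4..10: {shell_window_sum:,}")
print(f"phase 1 share:       {100 * phases[0] / phase_total:.1f}%")
```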
state breakdown: P137 bounded memory arbitration workload, 220,276,307 cycles, CPI 2.55
  1. fetch: 7,627,294 cycles (3.5%)
  2. execute: 86,300,521 cycles (39.2%)
  3. mem: 29,668,737 cycles (13.5%)
  4. walker: 2,689,373 cycles (1.2%)
  5. writeback: 86,275,649 cycles (39.2%)
  6. mul/div: 7,713,017 cycles (3.5%)