No. 138 of 147 projects on the ladder

Debt memory arbitration

introduces — I-cache repair debt counter; debt-limited aux-load preemption; debt paydown instrumentation

harden state (last run 2026-05-06)
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P138 replaces P137’s one-preempt/one-defer burst cap with explicit I-cache repair debt. Preempting active I-cache background fill adds debt. A real I-cache background fill response pays debt down. While debt is at the limit, the otherwise-safe aux-load preemption defers.
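A minimal behavioral sketch of the debt rule described above, written in Python rather than RTL. The class name, method names, and the limit value are hypothetical: the write-up does not state the actual limit or counter width, and the separately reported saturation counter is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class ICacheRepairDebt:
    """Behavioral model of an I-cache repair debt counter (illustrative only)."""

    limit: int = 4      # hypothetical value; the real limit is not given in the source
    debt: int = 0       # outstanding preemptions not yet repaid by a fill response
    deferred: int = 0   # aux-load preemptions deferred because debt was at the limit
    paydowns: int = 0   # fill responses that reduced the debt

    def try_preempt(self) -> bool:
        """Aux load asks to preempt an active I-cache background fill."""
        if self.debt >= self.limit:
            self.deferred += 1   # at the limit: defer instead of preempting
            return False
        self.debt += 1           # preempting adds one unit of debt
        return True

    def fill_response(self) -> None:
        """A real I-cache background fill response pays one unit of debt down."""
        if self.debt > 0:
            self.debt -= 1
            self.paydowns += 1


model = ICacheRepairDebt(limit=2)
granted = [model.try_preempt() for _ in range(3)]  # third request hits the limit
model.fill_response()                              # one response frees one slot
granted_after_paydown = model.try_preempt()
```

With a limit of 2, the third back-to-back preemption is deferred, and a single fill response is enough to allow the next one. In this model a run that ends with zero debt has exactly as many paydowns as preemptions, which is the pattern the P138 counters show.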

check                                         result
Verilator build                               PASS
BusyBox shell workload reaches P138-FILE-OK   PASS
Aux-load queue full drops                     PASS
Aux response errors/cancels                   PASS
Hardened layout                               NOT RUN

metric                 P134          P137          P138
post-load cycles       218,247,567   220,276,307   219,659,138
shell window cycles    64,221,642    65,356,841    64,637,761
retired instructions   86,139,760    86,275,649    86,082,907
CPI                    2.5336        2.5532        2.5517
S_FETCH cycles         7,617,695     7,627,294     7,617,168
S_MEM cycles           27,799,343    29,389,375    29,317,924

P138 is a previous-rung speed PASS: measured in shell window cycles, it is 719,080 cycles faster than P137 and 774,178 cycles faster than P135. It is still 416,119 cycles slower than P134.
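The P137 and P134 deltas quoted above match differences in the shell window cycles row, not the post-load cycles row. The short check below reproduces them from the table figures. It also recomputes CPI as post-load cycles divided by retired instructions; that formula is an inference that happens to reproduce the table, not something the write-up states. P135 is not in the table, so its delta cannot be rechecked here.

```python
# Figures copied from the metric table above.
post_load = {"P134": 218_247_567, "P137": 220_276_307, "P138": 219_659_138}
shell_window = {"P134": 64_221_642, "P137": 65_356_841, "P138": 64_637_761}
retired = {"P134": 86_139_760, "P137": 86_275_649, "P138": 86_082_907}

# Positive delta: P138 used fewer shell-window cycles than the other rung.
delta_vs = {
    rung: shell_window[rung] - shell_window["P138"] for rung in ("P134", "P137")
}

# Assumed definition: CPI = post-load cycles / retired instructions.
cpi = {rung: round(post_load[rung] / retired[rung], 4) for rung in post_load}

print(delta_vs)
print(cpi)
```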

aux-load counter   P134        P137        P138
candidates         5,372,186   5,126,386   5,226,972
issues             139,881     1,674,017   1,675,134
queue enqueues     139,881     1,674,017   1,675,134
queue dequeues     139,881     1,674,017   1,675,134
queue full drops   0           0           0
aux load fills     139,881     1,674,017   1,675,134

P138 policy bucket                           count
frontend prefetch not safe/useful            1,577,186
frontend-ready candidates                    3,649,786
background quiet candidates / issued         171,151
issued while preempting I-cache background   1,503,983
deferred by I-cache debt limit               67,597
I-cache debt paydowns                        1,503,983
I-cache debt saturations                     0
debt at end                                  0
blocked by D-cache background only           299,981
blocked by both backgrounds                  1,607,074
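The P138 policy buckets reconcile with the aux-load counters, which the following sketch checks. The way the buckets are grouped (which ones partition the candidates, which ones partition the frontend-ready set) is inferred from the sums working out, not from a stated definition, and the variable names are illustrative.

```python
# P138 figures copied from the two tables above.
candidates = 5_226_972
issues = 1_675_134

not_safe_or_useful = 1_577_186
frontend_ready = 3_649_786

issued_quiet = 171_151            # background quiet candidates / issued
issued_preempting = 1_503_983     # issued while preempting I-cache background
deferred_by_debt = 67_597
blocked_dcache_only = 299_981
blocked_both = 1_607_074

debt_paydowns = 1_503_983
debt_at_end = 0

# Inferred partition of all candidates.
candidates_sum = not_safe_or_useful + frontend_ready

# Inferred partition of the frontend-ready candidates.
frontend_ready_sum = (
    issued_quiet + issued_preempting + deferred_by_debt
    + blocked_dcache_only + blocked_both
)

issued_sum = issued_quiet + issued_preempting

# Debt added by preemptions minus debt paid down should equal the end balance.
debt_balance = issued_preempting - debt_paydowns

# Share of preemption attempts that the debt limit deferred.
deferred_fraction = deferred_by_debt / (issued_preempting + deferred_by_debt)
print(f"deferred by debt limit: {deferred_fraction:.1%} of preemption attempts")
```

Under this reading, the debt limit deferred only a small share of preemption attempts, and every preemption was eventually repaid by a fill response.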

The debt policy is better than the burst sketch, but not enough to beat the conservative P134 gate. That points the next rung at cost attribution: which protected or interrupted I-cache background fills are actually useful to fetch soon afterward?

shell phases: P138 shell workload, 219,659,138 cycles, CPI 2.55
  1. kernel banner to /init: 117,349,518 cycles (53.6%)
  2. /init to shell banner: 1,079,598 cycles (0.5%)
  3. shell banner to first command: 35,963,409 cycles (16.4%)
  4. echo command: 1,649 cycles (0%)
  5. uname -a: 2,451,847 cycles (1.1%)
  6. ls /bin /usr/share: 31,973,711 cycles (14.6%)
  7. cat sample file: 3,224,177 cycles (1.5%)
  8. touch/write/cat/rm /tmp file: 10,586,426 cycles (4.8%)
  9. 8x ash loop with file I/O: 16,399,271 cycles (7.5%)
  10. final marker: 680 cycles (0%)

state breakdown: P138 debt memory arbitration workload, 219,659,138 cycles, CPI 2.55
  1. fetch: 7,617,168 cycles (3.5%)
  2. execute: 86,107,701 cycles (39.2%)
  3. mem: 29,596,848 cycles (13.5%)
  4. walker: 2,673,421 cycles (1.2%)
  5. writeback: 86,082,907 cycles (39.2%)
  6. mul/div: 7,579,377 cycles (3.5%)
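
The state percentages can be reproduced by dividing each state's cycle count by the 219,659,138-cycle workload total. The sketch below does that and reports the few cycles not attributed to any listed state. Treating the workload total as the denominator is an assumption that happens to match the published one-decimal figures.

```python
total_cycles = 219_659_138

# Per-state cycle counts copied from the breakdown above.
state_cycles = {
    "fetch": 7_617_168,
    "execute": 86_107_701,
    "mem": 29_596_848,
    "walker": 2_673_421,
    "writeback": 86_082_907,
    "mul/div": 7_579_377,
}

# Percentage share per state, rounded to one decimal as in the listing.
share = {
    state: round(100 * cycles / total_cycles, 1)
    for state, cycles in state_cycles.items()
}

# Cycles in the workload total that no listed state accounts for.
unaccounted = total_cycles - sum(state_cycles.values())

print(share)
print(unaccounted)
```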