P138 replaces P137’s one-preempt/one-defer burst cap with explicit I-cache
repair debt. Preempting an active I-cache background fill adds debt, and a
real I-cache background fill response pays it down. While debt is at the
limit, an otherwise-safe aux-load preemption is deferred.
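The debt rule above can be sketched as a small behavioral model. This is an illustration only, not the RTL: the class name, method names, and the `limit` value are hypothetical (the limit is not stated here), and ignoring a fill response that arrives at zero debt is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ICacheRepairDebt:
    """Behavioral sketch of the I-cache repair debt counter (names hypothetical)."""

    limit: int = 4      # hypothetical value; the real limit is not given in the text
    debt: int = 0
    preempts: int = 0   # aux loads issued while preempting I-cache background fill
    deferrals: int = 0  # aux loads deferred because debt was at the limit
    paydowns: int = 0   # fill responses that reduced the debt

    def request_preempt(self) -> bool:
        """An otherwise-safe aux load wants to preempt an active I-cache fill."""
        if self.debt >= self.limit:
            self.deferrals += 1
            return False
        self.debt += 1
        self.preempts += 1
        return True

    def fill_response(self) -> None:
        """A real I-cache background fill response arrived."""
        if self.debt > 0:  # assumption: responses at zero debt are not counted
            self.debt -= 1
            self.paydowns += 1


model = ICacheRepairDebt(limit=2)
granted = [model.request_preempt() for _ in range(3)]  # third request hits the limit
model.fill_response()                                  # one paydown reopens the gate
granted.append(model.request_preempt())
```

With `limit=2`, the third request is deferred and the fourth is granted after one paydown, which is the defer-at-limit behavior described above.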
| check | result |
| --- | --- |
| Verilator build | PASS |
| BusyBox shell workload reaches P138-FILE-OK | PASS |
| Aux-load queue full drops | PASS |
| Aux response errors/cancels | PASS |
| Hardened layout | NOT RUN |
| metric | P134 | P137 | P138 |
| --- | --- | --- | --- |
| post-load cycles | 218,247,567 | 220,276,307 | 219,659,138 |
| shell window cycles | 64,221,642 | 65,356,841 | 64,637,761 |
| retired instructions | 86,139,760 | 86,275,649 | 86,082,907 |
| CPI | 2.5336 | 2.5532 | 2.5517 |
| S_FETCH cycles | 7,617,695 | 7,627,294 | 7,617,168 |
| S_MEM cycles | 27,799,343 | 29,389,375 | 29,317,924 |
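The CPI values appear to be post-load cycles divided by retired instructions. That definition is inferred from the reported numbers rather than stated, so the sketch below is a consistency check under that assumption, with hypothetical names.

```python
# (post-load cycles, retired instructions) per rung, copied from the metrics above.
RUNS = {
    "P134": (218_247_567, 86_139_760),
    "P137": (220_276_307, 86_275_649),
    "P138": (219_659_138, 86_082_907),
}


def cpi(cycles: int, retired: int) -> float:
    """Cycles per instruction, assuming CPI = post-load cycles / retired instructions."""
    return cycles / retired


CPI = {name: round(cpi(cycles, retired), 4) for name, (cycles, retired) in RUNS.items()}

for name, value in CPI.items():
    print(f"{name}: {value:.4f}")
```

Under this assumption the script reproduces the reported 2.5336, 2.5532, and 2.5517.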
P138 is a previous-rung speed PASS: in shell window cycles it is 719,080
cycles faster than P137 and 774,178 cycles faster than P135. It is still
416,119 cycles slower than P134.
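The P137 and P134 deltas match the shell window cycles row, not the post-load row. A short check, with hypothetical names, makes the arithmetic explicit. P135 is left out because its cycle count is not listed here.

```python
# Shell window cycles per rung, copied from the metrics above.
SHELL_WINDOW_CYCLES = {
    "P134": 64_221_642,
    "P137": 65_356_841,
    "P138": 64_637_761,
}


def cycles_saved(baseline: str, candidate: str) -> int:
    """Positive when the candidate rung needs fewer shell window cycles."""
    return SHELL_WINDOW_CYCLES[baseline] - SHELL_WINDOW_CYCLES[candidate]


saved_vs_p137 = cycles_saved("P137", "P138")  # 719_080: P138 is faster
saved_vs_p134 = cycles_saved("P134", "P138")  # -416_119: P138 is slower
```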
| aux-load counter | P134 | P137 | P138 |
| --- | --- | --- | --- |
| candidates | 5,372,186 | 5,126,386 | 5,226,972 |
| issues | 139,881 | 1,674,017 | 1,675,134 |
| queue enqueues | 139,881 | 1,674,017 | 1,675,134 |
| queue dequeues | 139,881 | 1,674,017 | 1,675,134 |
| queue full drops | 0 | 0 | 0 |
| aux load fills | 139,881 | 1,674,017 | 1,675,134 |
| P138 policy bucket | count |
| --- | --- |
| frontend prefetch not safe/useful | 1,577,186 |
| frontend-ready candidates | 3,649,786 |
| background quiet candidates / issued | 171,151 |
| issued while preempting I-cache background | 1,503,983 |
| deferred by I-cache debt limit | 67,597 |
| I-cache debt paydowns | 1,503,983 |
| I-cache debt saturations | 0 |
| debt at end | 0 |
| blocked by D-cache background only | 299,981 |
| blocked by both backgrounds | 1,607,074 |
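The bucket counts can be cross-checked against the P138 aux-load counters. The grouping below (which buckets partition which totals) is inferred from the sums working out, not from a stated definition, and the variable names are hypothetical.

```python
# P138 counts copied from the aux-load counters and policy buckets above.
CANDIDATES = 5_226_972
ISSUES = 1_675_134

NOT_SAFE_OR_USEFUL = 1_577_186
FRONTEND_READY = 3_649_786

QUIET_ISSUED = 171_151
PREEMPT_ISSUED = 1_503_983
DEFERRED_BY_DEBT = 67_597
BLOCKED_DCACHE_ONLY = 299_981
BLOCKED_BOTH = 1_607_074

DEBT_PAYDOWNS = 1_503_983
DEBT_AT_END = 0

# Every candidate is either rejected by the frontend check or frontend-ready.
candidates_sum = NOT_SAFE_OR_USEFUL + FRONTEND_READY

# Every frontend-ready candidate lands in exactly one outcome bucket.
frontend_ready_sum = (
    QUIET_ISSUED
    + PREEMPT_ISSUED
    + DEFERRED_BY_DEBT
    + BLOCKED_DCACHE_ONLY
    + BLOCKED_BOTH
)

# Issued aux loads are the quiet-background and preempting cases combined.
issued_sum = QUIET_ISSUED + PREEMPT_ISSUED

# Debt added by preemptions minus debt paid down should equal the ending debt.
debt_balance = PREEMPT_ISSUED - DEBT_PAYDOWNS

print(candidates_sum == CANDIDATES, frontend_ready_sum == FRONTEND_READY)
print(issued_sum == ISSUES, debt_balance == DEBT_AT_END)
```

All four comparisons hold for the reported counts, so every preemption was eventually repaid and the 67,597 deferrals are the only candidates the debt limit held back.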
The debt policy is better than the P137 burst cap, but not enough to beat
the conservative P134 gate.
That points the next rung at cost attribution: which protected or
interrupted I-cache background fills are actually useful to fetch soon
afterward?