P137’s one-preempt/one-defer rule was a sketch. P138 replaces it with a real I-cache repair debt counter. Preemption adds debt; actual I-cache background fill service pays it down.
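The debt rule above can be sketched in a few lines. This is only an illustration, not the P138 logic: the class name `RepairDebtArbiter`, the `max_debt` limit, the one-unit cost per preemption, and the policy of refusing preemption once the limit is reached are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RepairDebtArbiter:
    """Hypothetical repair debt counter; names and limits are assumed."""

    max_debt: int = 4  # assumed cap; the notes give no value
    debt: int = 0

    def request_preempt(self) -> bool:
        """Ask to preempt an I-cache background fill."""
        if self.debt >= self.max_debt:
            # Assumed policy: at the cap, the background fill is protected.
            return False
        self.debt += 1  # preemption adds debt
        return True

    def fill_serviced(self) -> None:
        """A background fill actually received service: pay down one unit."""
        if self.debt > 0:
            self.debt -= 1


arb = RepairDebtArbiter(max_debt=2)
grants = [arb.request_preempt() for _ in range(3)]  # third request hits the cap
arb.fill_serviced()                                 # real service lowers debt
grant_after_service = arb.request_preempt()         # preemption allowed again
```

Unlike a fixed one-preempt/one-defer alternation, the counter only relaxes when fill service really happens, so a stalled background fill keeps its protection.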
The shell workload passes. The aux-load queue handles 1,675,134 load responses with 0 drops, 0 errors, and 0 cancels.
The result is better than P137: the shell window drops from 65,356,841 to 64,637,761 cycles, a saving of 719,080 cycles (about 1.1%). It also beats P135 by 774,178 cycles.
But P134 is still faster at 64,221,642 cycles, 416,119 cycles ahead of P138. The debt arbiter is a better policy than the burst cap, but the conservative quiet-background gate still wins on this workload. The next step is cost attribution for I-cache repair: when we protect or interrupt a background fill, did that line actually help fetch soon afterward?
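The cycle counts quoted above can be lined up against P138 with a short calculation. The P134, P137, and P138 figures are taken directly from these notes; the P135 total is not reported and is derived here from the stated 774,178-cycle margin, so treat it as a reconstruction.

```python
# Shell-window cycle counts as quoted in the notes.
reported = {"P134": 64_221_642, "P137": 65_356_841, "P138": 64_637_761}

# P135 is only given as a margin over P138, so its total is derived.
p135_margin = 774_178
derived_p135 = reported["P138"] + p135_margin


def delta_vs_p138(cycles: int) -> int:
    """Positive means slower than P138, negative means faster."""
    return cycles - reported["P138"]


deltas = {name: delta_vs_p138(c) for name, c in reported.items()}
deltas["P135 (derived)"] = delta_vs_p138(derived_p135)

# Print fastest first.
for name, d in sorted(deltas.items(), key=lambda kv: kv[1]):
    print(f"{name:>15}: {d:+,} cycles vs P138")
```

The ordering that falls out is P134, P138, P137, P135, which matches the conclusion that the quiet-background gate is still the one to beat.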