P134 is the pivot back to frontend/memory work. P133 proved the backend
dispatch module boundary was coherent, but it was still one-deep and
shadow-only. P134 reopens the aux-load path and only lets it fire when
the main memory port can also prefetch the next instruction word.
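
As a rough behavioral model of that gating rule (not the RTL itself), the check might look like the sketch below. The function name and the boolean inputs are hypothetical, chosen to mirror the block reasons counted for P134; the real signals and their priority order may differ.

```python
def aux_load_may_issue(is_candidate: bool,
                       frontend_prefetch_ok: bool,
                       icache_bg_fill_active: bool,
                       dcache_bg_fill_active: bool) -> bool:
    """Hypothetical model: an aux load fires only when nothing blocks it."""
    if not is_candidate:
        return False
    # The main memory port must also be able to prefetch the next
    # instruction word; otherwise the aux load is held back.
    if not frontend_prefetch_ok:
        return False
    # Assumed: an in-flight background fill in either cache blocks the issue.
    if icache_bg_fill_active or dcache_bg_fill_active:
        return False
    return True


# A candidate with a usable prefetch slot and idle caches may issue.
print(aux_load_may_issue(True, True, False, False))
# The same candidate is blocked while an I-cache background fill is active.
print(aux_load_may_issue(True, True, True, False))
```

Treating every block condition as a hard veto is the conservative reading of "only lets it fire when"; a real design could rank or relax these conditions.
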

| Check | Result |
| --- | --- |
| Verilator build | PASS |
| BusyBox shell workload reaches P134-FILE-OK | PASS |
| Aux-load queue full drops | PASS |
| Aux response errors/cancels | PASS |
| Hardened layout | NOT RUN |


| Metric | P133 | P134 |
| --- | ---: | ---: |
| post-load cycles | 218,556,365 | 218,247,567 |
| shell window cycles | 64,581,917 | 64,221,642 |
| retired instructions | 86,293,679 | 86,139,760 |
| CPI | 2.5327 | 2.5336 |
| S_FETCH cycles | 7,626,643 | 7,617,695 |
| S_MEM cycles | 27,727,937 | 27,799,343 |


| Aux-load counter | P133 | P134 |
| --- | ---: | ---: |
| candidates | 5,410,388 | 5,372,186 |
| issues | 0 | 139,881 |
| queue enqueues | 0 | 139,881 |
| queue dequeues | 0 | 139,881 |
| queue full drops | 0 | 0 |
| aux load fills | 0 | 139,881 |


| P134 block reason | Count |
| --- | ---: |
| frontend prefetch not safe/useful | 1,587,474 |
| D-cache background fill active | 2,817,459 |
| I-cache background fill active | 4,862,648 |

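This is a modest speedup: 360,275 fewer shell-window cycles than P133,
about 0.56%. Post-load CPI is essentially flat (2.5327 vs. 2.5336), since
retired instructions fell along with cycles. The more important result is
that the aux-load mechanism is active again without returning to the
too-eager P111/P112 behavior.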
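
The speedup figure can be rechecked from the reported counters. The snippet below is only a bookkeeping sketch: the dictionary keys are made up for illustration, and it assumes CPI is post-load cycles divided by retired instructions, which is the definition that reproduces the reported values.

```python
# Counter values copied from the P133/P134 metric table.
P133 = {
    "post_load_cycles": 218_556_365,
    "shell_window_cycles": 64_581_917,
    "retired_instructions": 86_293_679,
}
P134 = {
    "post_load_cycles": 218_247_567,
    "shell_window_cycles": 64_221_642,
    "retired_instructions": 86_139_760,
}


def cpi(run):
    """Post-load cycles per retired instruction (assumed CPI definition)."""
    return run["post_load_cycles"] / run["retired_instructions"]


saved_cycles = P133["shell_window_cycles"] - P134["shell_window_cycles"]
saved_pct = 100.0 * saved_cycles / P133["shell_window_cycles"]

print(f"shell-window cycles saved: {saved_cycles:,} ({saved_pct:.2f}%)")
print(f"CPI  P133: {cpi(P133):.4f}  P134: {cpi(P134):.4f}")
```
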
The next memory-side question is now concrete: most remaining blocked
aux-load candidates are blocked by I-cache or D-cache background-fill
activity. P135 should measure whether those background policies are too
strict, too loose, or just correctly protecting instruction service.
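
As a starting point for that measurement, the P134 counters can be cross-checked against each other. The sketch below assumes that every candidate that did not issue was blocked (queue full drops are zero); the variable names are illustrative, and the unit of the block-reason counters (per candidate or per cycle) is not stated in this note.

```python
# P134 aux-load counters and block-reason counts from the tables above.
CANDIDATES = 5_372_186
ISSUES = 139_881
BLOCK_REASONS = {
    "frontend prefetch not safe/useful": 1_587_474,
    "D-cache background fill active": 2_817_459,
    "I-cache background fill active": 4_862_648,
}

# Assumption: candidates that did not issue were blocked (no drops occurred).
blocked = CANDIDATES - ISSUES
issue_rate_pct = 100.0 * ISSUES / CANDIDATES
reason_total = sum(BLOCK_REASONS.values())

print(f"issued: {ISSUES:,} of {CANDIDATES:,} candidates ({issue_rate_pct:.2f}%)")
print(f"blocked candidates: {blocked:,}")
print(f"sum of block-reason counts: {reason_total:,}")

# If the reason counts add up to more than the blocked candidates, the
# reasons cannot be mutually exclusive per-candidate tallies.
if reason_total > blocked:
    print("block reasons overlap or are counted in a different unit")

for reason, count in sorted(BLOCK_REASONS.items(), key=lambda kv: -kv[1]):
    print(f"{count:>10,}  {reason}")
```

Because the reason counts sum to more than the number of blocked candidates, a single candidate can evidently be charged to more than one reason, so per-reason shares should not be read as a clean partition.
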