P118 names the load/store unit (LSU) that we do not really have yet. The
RTL still uses the old execute plus S_MEM path, but the counters now split
that path into address generation, DTLB service, D-cache/store-buffer
service, and lower-memory completion.

| check | result |
| --- | --- |
| Verilator build | PASS |
| BusyBox shell workload reaches P118-FILE-OK | PASS |
| LSU-shape counters emitted | PASS |
| Hardened layout | NOT RUN |


| LSU counter | value |
| --- | ---: |
| address-generation events | 27,899,292 |
| load address-generation events | 15,536,292 |
| store address-generation events | 12,161,761 |
| AMO address-generation events | 201,239 |
| DTLB hits | 27,223,125 |
| DTLB misses | 671,678 |
| S_MEM LSU cycles | 27,758,933 |
| S_MEM D-cache-hit cycles | 4,603,337 |
| S_MEM store-buffer accepts | 1,156,649 |
| S_MEM store-buffer waits | 0 |
| S_MEM aux-load cycles | 0 |

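The counters above can be cross-checked with a few lines of arithmetic. The sketch below is not part of the P118 tooling; the variable names are made up for illustration, and it assumes that DTLB lookups equal hits plus misses, which the counter list does not state. It confirms that load, store, and AMO events add up to the address-generation total, then derives a DTLB miss rate and the share of S_MEM cycles spent on D-cache hits.

```python
# Counter values copied from the P118 LSU counter list.
# Variable names are illustrative, not taken from the RTL or its scripts.
agen_total = 27_899_292
agen_load = 15_536_292
agen_store = 12_161_761
agen_amo = 201_239
dtlb_hits = 27_223_125
dtlb_misses = 671_678
s_mem_lsu_cycles = 27_758_933
s_mem_dcache_hit_cycles = 4_603_337

# Load, store, and AMO events should partition the address-generation total.
agen_sum = agen_load + agen_store + agen_amo
agen_consistent = agen_sum == agen_total

# Assumption: every DTLB lookup is counted as either a hit or a miss.
dtlb_lookups = dtlb_hits + dtlb_misses
dtlb_miss_rate = dtlb_misses / dtlb_lookups

# Address-generation events that do not show up as a DTLB hit or miss.
agen_without_dtlb = agen_total - dtlb_lookups

# Fraction of S_MEM LSU cycles attributed to D-cache hits.
dcache_hit_share = s_mem_dcache_hit_cycles / s_mem_lsu_cycles

print(f"agen partition consistent: {agen_consistent}")
print(f"DTLB miss rate:            {dtlb_miss_rate:.2%}")
print(f"agen events without DTLB:  {agen_without_dtlb:,}")
print(f"D-cache-hit share of S_MEM: {dcache_hit_share:.2%}")
```

The address-generation partition is exact. The hit-plus-miss total falls 4,489 short of the address-generation count; the counters alone do not say why, so the sketch only reports the gap.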

| metric | P117 | P118 |
| --- | ---: | ---: |
| shell window cycles | 64,329,783 | 64,957,904 |
| S_FETCH cycles | 7,615,856 | 7,628,220 |
| S_MEM cycles | 27,670,716 | 27,758,933 |

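To read the P117 to P118 comparison as deltas, a short sketch like the following works. It is an illustration only: the dictionary layout and the `delta` helper are hypothetical, and the numbers are copied from the comparison above.

```python
# (P117, P118) cycle counts copied from the comparison above.
# The dictionary layout and helper name are illustrative only.
counters = {
    "shell window cycles": (64_329_783, 64_957_904),
    "S_FETCH cycles": (7_615_856, 7_628_220),
    "S_MEM cycles": (27_670_716, 27_758_933),
}


def delta(p117: int, p118: int) -> tuple[int, float]:
    """Return the absolute change and the change as a percentage of P117."""
    diff = p118 - p117
    return diff, 100.0 * diff / p117


deltas = {name: delta(*pair) for name, pair in counters.items()}

for name, (diff, pct) in deltas.items():
    print(f"{name:<20} {diff:+12,d} cycles  {pct:+.2f}%")
```

All three counters rise slightly from P117 to P118, with the shell window growing by a little under 1%.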
P118 is a baseline, not a speed rung: the shell window is about 1% longer
than in P117 (64,957,904 vs. 64,329,783 cycles). It says the next
data-side work should separate request records from commit effects before
trying rename, ROB, or true out-of-order issue.