P91 is the cleanup after the P90 negative result.

P90 introduced a 4-word I-cache line, but it blocked in S_IC_FILL before
executing the missed word. That produced more I-cache hits than P89
(12,352,686 vs 6,434,333) but a worse shell window (84,084,195 vs
66,957,620 cycles). P91 keeps the line cache but changes the policy:
execute the critical word immediately and fill the rest of the line later
from a one-entry background descriptor.
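
The difference between the two policies can be sketched with a toy behavioral model. This is not the RTL: `ToyICache`, `MISS_LATENCY`, and the rule that the background descriptor fills one word per cache hit are all assumptions made for illustration, and the model ignores any bus contention between background fills and demand fetches. Only the 4-word line and the one-entry descriptor come from the notes above.

```python
from dataclasses import dataclass, field

WORDS_PER_LINE = 4   # from the notes: a 4-word I-cache line
MISS_LATENCY = 10    # hypothetical cycles to fetch one word from memory


@dataclass
class ToyICache:
    """Word-granular toy cache with a one-entry background fill descriptor."""
    critical_word_first: bool
    valid: set = field(default_factory=set)       # word addresses present
    pending: list = field(default_factory=list)   # words left in the one descriptor
    stall_cycles: int = 0
    hits: int = 0

    def fetch(self, addr: int) -> None:
        if addr in self.valid:
            self.hits += 1
            self._background_step()
            return
        base = addr - addr % WORDS_PER_LINE
        line = [base + i for i in range(WORDS_PER_LINE)]
        if self.critical_word_first:
            # P91-style: stall only for the missed word, then execute it.
            self.stall_cycles += MISS_LATENCY
            self.valid.add(addr)
            # One entry only: a new miss replaces any unfinished descriptor.
            self.pending = [w for w in line if w not in self.valid]
        else:
            # P90-style: block until every missing word of the line is filled.
            missing = [w for w in line if w not in self.valid]
            self.stall_cycles += MISS_LATENCY * len(missing)
            self.valid.update(line)

    def _background_step(self) -> None:
        # Assumption: one pending word is filled per hit, off the critical path.
        if self.pending:
            self.valid.add(self.pending.pop(0))


def run(trace, critical_word_first):
    """Replay a word-address trace and return (stall_cycles, hits)."""
    cache = ToyICache(critical_word_first)
    for addr in trace:
        cache.fetch(addr)
    return cache.stall_cycles, cache.hits


# Hypothetical trace: a short loop on one line, then jumps that touch
# only one or two words of each new line.
TRACE = [0, 0, 0, 0, 1, 2, 3, 8, 16, 17, 24]
print("blocking fill      :", run(TRACE, False))
print("critical word first:", run(TRACE, True))
```

On this trace the blocking policy records one more hit but over three times the stall cycles, which is the same qualitative pattern as P90: more hits do not imply less time when every miss pays for the whole line up front.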

The BusyBox shell profile passed:

| metric | P89 | P90 | P91 |
|---|---|---|---|
| post-load cycles | 222,317,206 | 245,417,593 | 221,327,811 |
| shell window cycles | 66,957,620 | 84,084,195 | 65,985,297 |
| fetch stall cycles | 54,266,192 | 55,950,261 | 55,533,555 |
| I-cache hits | 6,434,333 | 12,352,686 | 8,855,599 |
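
So P91 is a measured win over P90 and a small shell-window win over P89 (65,985,297 vs 66,957,620 cycles, about 1.5% fewer).

It is not a fetch-stall win over P89: fetch stall cycles rose from 54,266,192 to 55,533,555, about 2.3% more.

The fill descriptor was active for 195,614,189 cycles but completed only 846,917 background fills, roughly 231 active cycles per completed fill. That ratio suggests the single descriptor spends most of its active time waiting rather than completing fills, which is why the next rung should be a fetch queue.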
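
The comparisons above are plain arithmetic on the profile numbers and can be rechecked with a short script. The counts are copied from the table and the descriptor figures; the names `PROFILE`, `pct_change`, and `deltas` are made up for this sketch.

```python
# Cycle and hit counts copied from the profile table.
PROFILE = {
    "post-load cycles":    {"P89": 222_317_206, "P90": 245_417_593, "P91": 221_327_811},
    "shell window cycles": {"P89": 66_957_620, "P90": 84_084_195, "P91": 65_985_297},
    "fetch stall cycles":  {"P89": 54_266_192, "P90": 55_950_261, "P91": 55_533_555},
    "I-cache hits":        {"P89": 6_434_333, "P90": 12_352_686, "P91": 8_855_599},
}


def pct_change(new, old):
    """Signed percentage change going from old to new."""
    return 100.0 * (new - old) / old


def deltas(metric):
    """Return P91's percentage change against P89 and against P90."""
    row = PROFILE[metric]
    return pct_change(row["P91"], row["P89"]), pct_change(row["P91"], row["P90"])


for metric in PROFILE:
    vs_p89, vs_p90 = deltas(metric)
    print(f"{metric:20s} P91 vs P89 {vs_p89:+7.2f}%   P91 vs P90 {vs_p90:+7.2f}%")

# Background fill descriptor figures from the notes.
DESCRIPTOR_ACTIVE_CYCLES = 195_614_189
BACKGROUND_FILLS = 846_917
cycles_per_fill = DESCRIPTOR_ACTIVE_CYCLES / BACKGROUND_FILLS
print(f"descriptor active cycles per completed fill: {cycles_per_fill:.1f}")
```

A negative percentage is an improvement for the three cycle metrics and a reduction for I-cache hits, so the signs need to be read per row.
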

Honest status: RTL simulation PASS, BusyBox shell profile PASS, LibreLane hardening NOT RUN.