P147 is the final narrow execute-prefetch repair check promised by P146. It makes one composite predicate active:
`predicted_not_taken && word_not_last && quiet_backend`
When the guard fires, execute-prefetch repair gets a two-word budget. Otherwise it keeps the P144 one-word budget.
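A minimal software sketch of the budget choice described above. The function name `repair_budget_words` and its boolean parameters are hypothetical, chosen to mirror the predicate terms; this is a model for illustration, not the RTL, whose signal names are not shown in this report.

```python
def repair_budget_words(predicted_not_taken: bool,
                        word_not_last: bool,
                        quiet_backend: bool) -> int:
    """Return the execute-prefetch repair budget in words.

    Hypothetical Python model of the P147 guard, not the actual RTL.
    """
    # The composite guard fires only when all three terms hold.
    guard = predicted_not_taken and word_not_last and quiet_backend
    # Two words when the guard fires, otherwise the P144 one-word budget.
    return 2 if guard else 1
```

For example, `repair_budget_words(True, True, False)` falls back to the one-word budget because the backend is not quiet.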
| check | result |
|---|---|
| Verilator build | PASS |
| BusyBox userspace build | PASS |
| Linux image rebuilt with P147 initramfs | PASS |
| BusyBox shell workload reaches P147-FILE-OK | PASS |
| Aux-load queue full drops | PASS |
| Aux response errors/cancels | PASS |
| Hardened layout | NOT RUN |
| metric | P142 | P144 | P146 | P147 |
|---|---|---|---|---|
| post-load cycles | 218,856,863 | 219,161,535 | 220,667,218 | 219,782,307 |
| shell window cycles | 63,926,691 | 64,192,833 | 65,663,039 | 64,812,561 |
| retired instructions | 85,799,045 | 85,905,855 | 86,410,376 | 86,122,202 |
| CPI | 2.5508 | 2.5512 | 2.5537 | 2.5520 |
| S_FETCH cycles | 7,598,133 | 7,601,896 | 7,642,687 | 7,620,860 |
| S_MEM cycles | 29,199,830 | 29,215,063 | 29,435,054 | 29,306,905 |
Measured in shell-window cycles, P147 improves on P146 by 850,478, but still loses to P144 by 619,728 and to P142 by 885,870. That makes it an RTL PASS and a speed FAIL.
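The deltas quoted above can be reproduced from the metric table with a short script. The values are copied from the table; the dictionary and function names are illustrative. The CPI definition used here (post-load cycles divided by retired instructions) is an assumption that appears consistent with the reported CPI row.

```python
# Figures copied from the metric table.
shell_window = {"P142": 63_926_691, "P144": 64_192_833,
                "P146": 65_663_039, "P147": 64_812_561}
post_load = {"P142": 218_856_863, "P144": 219_161_535,
             "P146": 220_667_218, "P147": 219_782_307}
retired = {"P142": 85_799_045, "P144": 85_905_855,
           "P146": 86_410_376, "P147": 86_122_202}


def shell_delta(run: str, baseline: str) -> int:
    """Shell-window cycles of `run` minus `baseline` (positive means slower)."""
    return shell_window[run] - shell_window[baseline]


def cpi(run: str) -> float:
    """Assumed definition: post-load cycles per retired instruction."""
    return post_load[run] / retired[run]


for baseline in ("P146", "P144", "P142"):
    print(f"P147 vs {baseline}: {shell_delta('P147', baseline):+,} cycles")
print(f"P147 CPI: {cpi('P147'):.4f}")
```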
| predicate | opportunities | fills | first hits | repeat hits | first+repeat / fill |
|---|---|---|---|---|---|
| seq_adjacent | 63,419 | 27,079 | 0 | 0 | 0.00% |
| word_not_last | 22,111,120 | 35,052,260 | 758,121 | 259,095 | 2.90% |
| uncompressed_not_last | 21,322,755 | 33,848,167 | 225,827 | 48,353 | 0.81% |
| predicted_not_taken | 20,165,355 | 33,108,902 | 737,761 | 251,896 | 2.99% |
| quiet_backend | 24,924,183 | 32,829,051 | 636,269 | 219,427 | 2.61% |
| strict_composite | 16,516,160 | 29,570,243 | 620,389 | 216,422 | 2.83% |
The active guard still spends too much bandwidth for too little fetch reuse. Against P144 it adds 8.09M background repair fills and 14.64M prefetch second-word grants, but only 191,040 extra first/repeat hits.
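The last column of the predicate table is (first hits + repeat hits) / fills. A small sketch that recomputes it from the table values follows; the `predicates` dictionary and the `reuse_ratio` helper are illustrative names, not part of the measurement tooling.

```python
# (fills, first hits, repeat hits) copied from the predicate table.
predicates = {
    "seq_adjacent": (27_079, 0, 0),
    "word_not_last": (35_052_260, 758_121, 259_095),
    "uncompressed_not_last": (33_848_167, 225_827, 48_353),
    "predicted_not_taken": (33_108_902, 737_761, 251_896),
    "quiet_backend": (32_829_051, 636_269, 219_427),
    "strict_composite": (29_570_243, 620_389, 216_422),
}


def reuse_ratio(fills: int, first_hits: int, repeat_hits: int) -> float:
    """Fraction of fills that were later hit at least once or repeatedly."""
    return (first_hits + repeat_hits) / fills


for name, row in predicates.items():
    print(f"{name:24s} {100 * reuse_ratio(*row):5.2f}%")
```

Every predicate stays below 3%, which is the quantitative basis for the bandwidth-versus-reuse conclusion.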
| class | repair fills | first hits | repeat hits | first+repeat ratio |
|---|---|---|---|---|
| demand fetch | 931,741 | 438,523 | 295,535 | 78.79% |
| execute prefetch | 37,205,331 | 759,054 | 259,109 | 2.74% |
| load prefetch | 2,423,233 | 132,406 | 13,083 | 6.00% |
| writeback prefetch | 396,342 | 169,072 | 52,954 | 56.02% |
| aux prefetch | 109,673 | 36,639 | 24,194 | 55.47% |
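One way to read the per-class table is to compare each class's share of repair fills with its share of hits. The sketch below does that with the table values; the names `classes`, `fill_share`, and `hit_share` are illustrative, and the "share" framing is an added reading aid rather than a metric reported by the run.

```python
# (repair fills, first hits, repeat hits) copied from the per-class table.
classes = {
    "demand fetch": (931_741, 438_523, 295_535),
    "execute prefetch": (37_205_331, 759_054, 259_109),
    "load prefetch": (2_423_233, 132_406, 13_083),
    "writeback prefetch": (396_342, 169_072, 52_954),
    "aux prefetch": (109_673, 36_639, 24_194),
}

total_fills = sum(fills for fills, _, _ in classes.values())
total_hits = sum(first + repeat for _, first, repeat in classes.values())


def fill_share(name: str) -> float:
    """Fraction of all repair fills issued by this class."""
    return classes[name][0] / total_fills


def hit_share(name: str) -> float:
    """Fraction of all first+repeat hits that landed on this class's fills."""
    _, first, repeat = classes[name]
    return (first + repeat) / total_hits


for name in classes:
    print(f"{name:20s} fills {100 * fill_share(name):5.1f}%  "
          f"hits {100 * hit_share(name):5.1f}%")
```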
This closes the execute-prefetch second-word repair thread for now. The next architecture rung should pivot to a different frontend/memory bottleneck.
| phase | cycles | share |
|---|---|---|
| kernel banner to /init | 117,338,268 | 53.5% |
| /init to shell banner | 1,086,082 | 0.5% |
| shell banner to first command | 35,916,577 | 16.4% |
| echo command | 1,649 | 0.0% |
| uname -a | 2,457,747 | 1.1% |
| ls /bin /usr/share | 32,296,246 | 14.7% |
| cat sample file | 3,052,234 | 1.4% |
| touch/write/cat/rm /tmp file | 10,694,437 | 4.9% |
| 8x ash loop with file I/O | 16,309,568 | 7.4% |
| final marker | 680 | 0.0% |
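As a consistency check on the phase breakdown above, the phases from `echo command` through `final marker` sum to the P147 shell-window figure in the metric table. The sketch below shows the arithmetic. Treating `echo command` as the start of the shell window is an assumption inferred from that match, and the shares appear to be taken against the sum of the listed phases; both names and framing here are illustrative.

```python
# (phase, cycles) copied from the phase breakdown.
phases = [
    ("kernel banner to /init", 117_338_268),
    ("/init to shell banner", 1_086_082),
    ("shell banner to first command", 35_916_577),
    ("echo command", 1_649),
    ("uname -a", 2_457_747),
    ("ls /bin /usr/share", 32_296_246),
    ("cat sample file", 3_052_234),
    ("touch/write/cat/rm /tmp file", 10_694_437),
    ("8x ash loop with file I/O", 16_309_568),
    ("final marker", 680),
]

# Assumption: the shell window begins at the first shell command.
SHELL_WINDOW_START = "echo command"


def shell_window_cycles(rows):
    """Sum cycles from the assumed first shell-window phase to the end."""
    names = [name for name, _ in rows]
    start = names.index(SHELL_WINDOW_START)
    return sum(cycles for _, cycles in rows[start:])


total = sum(cycles for _, cycles in phases)
print(f"shell window: {shell_window_cycles(phases):,} cycles")
for name, cycles in phases:
    print(f"{name:32s} {100 * cycles / total:5.1f}%")
```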

| state | cycles | share |
|---|---|---|
| fetch | 7,620,860 | 3.5% |
| execute | 86,147,006 | 39.2% |
| mem | 29,585,503 | 13.5% |
| walker | 2,675,861 | 1.2% |
| writeback | 86,122,202 | 39.2% |
| mul/div | 7,629,159 | 3.5% |