Project No. 113 of 147 on the ladder

Load miss policy

Introduces: aux-load issue gating; policy counters; queue-pressure measurement

Harden state (last run: 2026-05-06)
Signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P113 keeps the P112 queue but gates aux-load issue behind frontend and D-cache pressure checks.

Result

| check | result |
|---|---|
| Verilator build | PASS |
| BusyBox shell workload reaches P113-FILE-OK | PASS |
| Aux-load policy counters | PASS |
| Speedup against P112 | PASS |
| Speedup against P110 | FAIL |
| Hardened layout | NOT RUN |

| metric | P112 queued load aux | P113 load policy |
|---|---|---|
| post-load cycles | 227,966,087 | 218,189,372 |
| shell window cycles | 71,950,542 | 64,307,797 |
| retired instructions | 88,125,232 | 86,158,953 |
| CPI | 2.5868 | 2.5324 |
| S_MEM cycles | 32,192,993 | 27,674,926 |
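
The CPI figures in the metric table can be reproduced from the cycle and instruction counts. The sketch below does that arithmetic and also computes a cycle ratio against P112. It assumes CPI is post-load cycles divided by retired instructions and that "speedup" means the ratio of post-load cycles; the page does not state which metric the speedup check uses, and the names `runs`, `cpi`, and `speedup` are illustrative only.

```python
# Counts copied from the metric table; the dictionary layout is illustrative.
runs = {
    "P112": {"post_load_cycles": 227_966_087, "retired": 88_125_232},
    "P113": {"post_load_cycles": 218_189_372, "retired": 86_158_953},
}


def cpi(run):
    """Cycles per instruction, assuming post-load cycles is the numerator."""
    return run["post_load_cycles"] / run["retired"]


cpi_p112 = cpi(runs["P112"])
cpi_p113 = cpi(runs["P113"])

# Assumed definition: speedup = old cycles / new cycles.
speedup = runs["P112"]["post_load_cycles"] / runs["P113"]["post_load_cycles"]

print(f"P112 CPI {cpi_p112:.4f}")
print(f"P113 CPI {cpi_p113:.4f}")
print(f"P113 vs P112 cycle ratio {speedup:.4f}")
```

Under these assumptions the two CPI values round to the table's 2.5868 and 2.5324, and the cycle ratio comes out a little under 1.045.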

Policy

| counter | value |
|---|---|
| aux-load candidates | 5,394,095 |
| aux-load issues | 0 |
| blocked: no next-PC fetch queue hit | 5,394,095 |
| blocked: D-cache background active | 2,843,612 |

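The policy recovered P112’s regression by blocking the queue entirely: all 5,394,095 aux-load candidates failed the next-PC fetch queue check, so none issued. That makes it too strict to be the final answer.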
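
A minimal behavioural sketch of such a gate, written in Python rather than RTL, may help when reading the counters. It is not the project's implementation: `PolicyCounters`, `gate_aux_load`, and the two boolean inputs are hypothetical names. It assumes an aux load may issue only when the next-PC fetch queue hits and no D-cache background activity is in flight, and that the two block reasons are counted independently (which would explain why the blocked counters sum to more than the candidate count).

```python
from dataclasses import dataclass


@dataclass
class PolicyCounters:
    candidates: int = 0
    issues: int = 0
    blocked_no_fetch_hit: int = 0   # "no next-PC fetch queue hit"
    blocked_dcache_busy: int = 0    # "D-cache background active"


def gate_aux_load(fetch_queue_hit: bool, dcache_bg_active: bool,
                  counters: PolicyCounters) -> bool:
    """Return True if the aux load may issue this cycle (hypothetical model)."""
    counters.candidates += 1
    # Block reasons are counted independently, so one candidate can hit both.
    if not fetch_queue_hit:
        counters.blocked_no_fetch_hit += 1
    if dcache_bg_active:
        counters.blocked_dcache_busy += 1
    allowed = fetch_queue_hit and not dcache_bg_active
    if allowed:
        counters.issues += 1
    return allowed


# If the fetch-queue check never passes, nothing issues,
# regardless of what the D-cache is doing.
demo = PolicyCounters()
for dcache_busy in (True, False, True, False):
    gate_aux_load(fetch_queue_hit=False, dcache_bg_active=dcache_busy,
                  counters=demo)
print(demo)
```

In this model, a fetch-queue condition that is never satisfied reproduces the shape of the measured counters: every candidate is blocked by the first check and the issue count stays at zero.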

Memory stalls

P113 load-policy workload: 58,328,963 stalls, 64,113,801 handshakes.

| # | source | stalls | share | requests |
|---|---|---|---|---|
| 1 | instruction fetch | 28,212,291 | 48.4% | 45,152,834 |
| 2 | data load | 10,350,842 | 17.7% | 560,950 |
| 3 | data store | 10,899,784 | 18.7% | 76,768 |
| 4 | atomic memory op | 174,156 | 0.3% | 165,725 |
| 5 | page walk for fetch | 677,509 | 1.2% | 671,355 |
| 6 | page walk for load/store | 678,011 | 1.2% | 671,827 |
| 7 | other | 7,336,370 | 12.6% | 16,814,342 |
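
The stall totals hide very different per-request costs. The sketch below divides each source's stalls by its request count, assuming the stall figures are cycle counts and that "req" counts completed handshakes for that source. The variable names are illustrative.

```python
# (source, stalls, requests) copied from the memory-stall breakdown.
rows = [
    ("instruction fetch", 28_212_291, 45_152_834),
    ("data load", 10_350_842, 560_950),
    ("data store", 10_899_784, 76_768),
    ("atomic memory op", 174_156, 165_725),
    ("page walk for fetch", 677_509, 671_355),
    ("page walk for load/store", 678_011, 671_827),
    ("other", 7_336_370, 16_814_342),
]

total_stalls = sum(stalls for _, stalls, _ in rows)
total_requests = sum(reqs for _, _, reqs in rows)

# Average stalls attributed to each request, per source.
stalls_per_req = {name: stalls / reqs for name, stalls, reqs in rows}

for name, stalls, reqs in rows:
    share = stalls / total_stalls
    print(f"{name:26s} {share:6.1%} {stalls_per_req[name]:8.2f} stalls/req")
```

Read this way, instruction fetch dominates by volume at well under one stall per request, while data stores are rare but average on the order of a hundred stalls each.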
Shell phases

P113 shell workload: 218,189,372 cycles, CPI 2.53.

| # | phase | cycles | share |
|---|---|---|---|
| 1 | kernel banner to /init | 116,719,451 | 53.7% |
| 2 | /init to shell banner | 1,068,355 | 0.5% |
| 3 | shell banner to first command | 35,465,702 | 16.3% |
| 4 | echo command | 1,649 | 0% |
| 5 | uname -a | 2,403,211 | 1.1% |
| 6 | ls /bin /usr/share | 31,786,095 | 14.6% |
| 7 | cat sample file | 2,860,644 | 1.3% |
| 8 | touch/write/cat/rm /tmp file | 11,258,176 | 5.2% |
| 9 | 8x ash loop with file I/O | 15,997,342 | 7.4% |
| 10 | final marker | 680 | 0% |
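
As a consistency check, the per-phase cycles can be summed and compared with the "shell window cycles" metric. The sketch assumes the shell window starts at the first command (phase 4) and runs through the final marker. That boundary is inferred from the numbers rather than stated on the page, and `first_command_index` is a hypothetical name.

```python
# Per-phase cycle counts copied from the shell-phase breakdown, in order.
phases = [
    ("kernel banner to /init", 116_719_451),
    ("/init to shell banner", 1_068_355),
    ("shell banner to first command", 35_465_702),
    ("echo command", 1_649),
    ("uname -a", 2_403_211),
    ("ls /bin /usr/share", 31_786_095),
    ("cat sample file", 2_860_644),
    ("touch/write/cat/rm /tmp file", 11_258_176),
    ("8x ash loop with file I/O", 15_997_342),
    ("final marker", 680),
]

reported_shell_window = 64_307_797  # "shell window cycles" for P113

# Assumption: the shell window begins with the echo command (index 3).
first_command_index = 3
shell_window = sum(cycles for _, cycles in phases[first_command_index:])
all_phases = sum(cycles for _, cycles in phases)

print(f"sum of command phases: {shell_window:,}")
print(f"sum of all phases:     {all_phases:,}")
print("matches reported window:", shell_window == reported_shell_window)
```

Under that assumption the command phases add up exactly to the reported shell window, which suggests the two views come from the same measurement.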
State breakdown

P113 load-policy workload: 218,189,372 cycles, CPI 2.53.

| # | state | share | cycles |
|---|---|---|---|
| 1 | fetch | 3.5% | 7,617,467 |
| 2 | execute | 39.5% | 86,183,647 |
| 3 | mem | 12.8% | 27,953,768 |
| 4 | walker | 1.2% | 2,698,702 |
| 5 | writeback | 39.5% | 86,158,953 |
| 6 | mul/div | 3.5% | 7,575,119 |
Hot functions

P113 BusyBox shell symbols: 62,800 samples, one every 1,024 cycles.

| # | symbol | image | share | samples |
|---|---|---|---|---|
| 1 | printf_core | busybox | 5.6% | 3,537 |
| 2 | memset | kernel | 5.1% | 3,212 |
| 3 | memcpy | busybox | 3.7% | 2,350 |
| 4 | vruntime_eligible | kernel | 3.3% | 2,084 |
| 5 | blake2s_compress_generic | kernel | 2.9% | 1,814 |
| 6 | __fwritex | busybox | 2.7% | 1,707 |
| 7 | memcpy | kernel | 2.7% | 1,698 |
| 8 | handle_exception | kernel | 1.7% | 1,090 |
| 9 | unmap_page_range | kernel | 1.6% | 1,018 |
| 10 | n_tty_write | kernel | 1.3% | 830 |
| 11 | memset | busybox | 1.3% | 808 |
| 12 | avg_vruntime | kernel | 1.3% | 786 |
| 13 | ret_from_exception | kernel | 1.3% | 786 |
| 14 | next_uptodate_folio | kernel | 1.1% | 668 |
| 15 | do_trap_ecall_u | kernel | 1% | 613 |
| 16 | (remaining) | | 55.3% | 34,719 |
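
Sample counts convert to approximate cycle costs through the sampling period. The sketch below assumes one sample stands for 1,024 cycles and that the profile covers the P113 shell window (64,307,797 cycles). The second assumption is inferred from the totals, not stated on the page, and `approx_cycles` is a hypothetical helper.

```python
samples_total = 62_800
period_cycles = 1_024
shell_window_cycles = 64_307_797  # P113 "shell window cycles"

# Cycles spanned by the profile if every sample stands for one full period.
covered = samples_total * period_cycles
gap = shell_window_cycles - covered


def approx_cycles(samples: int) -> int:
    """Estimate the cycles attributed to a symbol from its sample count."""
    return samples * period_cycles


print(f"profile covers {covered:,} cycles (gap to shell window: {gap})")
print(f"printf_core ~ {approx_cycles(3_537):,} cycles")
print(f"printf_core share {3_537 / samples_total:.1%}")
```

The gap is smaller than one sampling period, which is consistent with the profiler sampling only during the shell window.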