Project No. 106 of 147 on the ladder

Banked lower-memory contract

Introduces: top-level auxiliary lower-bank read lane; Verilator-serviced banked memory contract; checksum-backed auxiliary read accounting

harden state: last run 2026-05-06
signoff:
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P106 is the first real widened boundary in the lower-memory banking arc. P105 modeled safe different-bank extra grants. P106 emits those requests on a top-level auxiliary read lane and has the Verilator memory model service them against the real boot image.

The core does not consume the auxiliary response yet, so this is not a speed rung.

Result

| check | result |
| --- | --- |
| make check-tools | PASS |
| Verilator build | PASS |
| Linux reaches /init | PASS |
| BusyBox prompt | PASS |
| BusyBox shell workload reaches P106-FILE-OK | PASS |
| Auxiliary lower-bank read lane serviced | PASS |
| Auxiliary read errors | PASS |
| Hardened layout | NOT RUN |

Timing

| metric | P105 model | P106 contract |
| --- | --- | --- |
| post-load cycles | 218,480,625 | 219,613,584 |
| shell window cycles | 64,438,096 | 65,558,077 |
| retired instructions | 86,106,731 | 86,478,207 |
| CPI | 2.5373 | 2.5395 |
| BusyBox ready milestone | 118,422,909 | 118,413,096 |
| shell FILE-OK milestone | 218,480,768 | 219,613,727 |
| kernel panic milestone | 0 | 0 |

The timing table is included for honesty. Because the core still ignores the auxiliary response, P106 cannot reduce architectural cycles yet; its post-load cycle count is about 0.5% higher than P105's (219,613,584 versus 218,480,625).
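The differences between the two columns can be recomputed from the table's own figures. The following is a minimal Python sketch that uses only the numbers quoted above; the dictionary keys and helper names are illustrative and do not come from the project.

```python
# Figures copied from the timing table; key names are illustrative only.
p105 = {"post_load_cycles": 218_480_625,
        "shell_window_cycles": 64_438_096,
        "retired": 86_106_731}
p106 = {"post_load_cycles": 219_613_584,
        "shell_window_cycles": 65_558_077,
        "retired": 86_478_207}


def cpi(run):
    """Cycles per retired instruction over the post-load window."""
    return run["post_load_cycles"] / run["retired"]


def delta(key):
    """Absolute and percentage change from P105 to P106 for one metric."""
    diff = p106[key] - p105[key]
    return diff, 100.0 * diff / p105[key]


if __name__ == "__main__":
    for key in p105:
        diff, pct = delta(key)
        print(f"{key}: {diff:+,} ({pct:+.2f}%)")
    print(f"CPI: {cpi(p105):.4f} -> {cpi(p106):.4f}")
```

Running it reproduces the CPI row from the cycle and instruction counts and shows the post-load cycle count rising by 1,132,959 between the two runs.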

Auxiliary Contract

The new top-level lane carries:

banked_aux_valid, banked_aux_addr, banked_aux_size,
banked_aux_bank, banked_aux_side

It fires for the same conservative condition P105 modeled: split-bank I/D demand, one real grant, and a blocked read-like request.
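The firing condition can be sketched as a small predicate. This is a Python illustration under stated assumptions, not the project's RTL: the `Request` and `AuxLane` types, the `aux_lane` function, and the reading of "read-like" as fetches and loads are all hypothetical. Only the lane's field names mirror the `banked_aux_*` signals listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    side: str         # "I" for instruction fetch, "D" for data
    addr: int
    size: int
    bank: int         # lower-memory bank index (bank decode is not shown here)
    read_like: bool   # assumption: True for fetches and loads, False for stores


@dataclass(frozen=True)
class AuxLane:
    valid: bool = False
    addr: int = 0
    size: int = 0
    bank: int = 0
    side: str = ""


def aux_lane(i_req: Optional[Request], d_req: Optional[Request],
             granted_side: Optional[str]) -> AuxLane:
    """Drive the auxiliary lane for one cycle under the conservative rule."""
    # Split-bank I/D demand: both sides request, and they target different banks.
    if i_req is None or d_req is None or i_req.bank == d_req.bank:
        return AuxLane()
    # One real grant: the existing arbiter picked exactly one side.
    if granted_side not in ("I", "D"):
        return AuxLane()
    blocked = d_req if granted_side == "I" else i_req
    # Only a blocked read-like request may be emitted on the lane.
    if not blocked.read_like:
        return AuxLane()
    return AuxLane(True, blocked.addr, blocked.size, blocked.bank, blocked.side)
```

For example, an instruction fetch to bank 0 that wins the grant while a load targets bank 1 would put the load on the lane. The same pair aimed at one bank, or a blocked store, leaves the lane idle.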

Service Counters

| counter | value |
| --- | --- |
| auxiliary instruction reads serviced | 488,792 |
| auxiliary data reads serviced | 20,141,261 |
| auxiliary reads serviced total | 20,630,053 |
| shell-window auxiliary reads | 8,458,681 |
| auxiliary read errors | 0 |
| auxiliary read checksum | 947,922,106 |
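One way a memory model could keep counters like these is sketched below in Python. It is a hypothetical stand-in, not the Verilator model's code: the class name, the little-endian read, the out-of-range error rule, and the 32-bit wrapping-sum checksum are assumptions made for illustration. The page does not state how its checksum is computed.

```python
from collections import Counter


class AuxReadService:
    """Hypothetical stand-in for auxiliary read accounting in a memory model."""

    def __init__(self, image: bytes, base: int = 0):
        self.image = image          # backing store, e.g. a boot image
        self.base = base            # address of image[0]
        self.serviced = Counter()   # reads serviced, keyed by side ("I" or "D")
        self.per_bank = Counter()   # reads serviced, keyed by bank index
        self.errors = 0
        self.checksum = 0           # assumed: 32-bit wrapping sum of returned data

    def service(self, addr: int, size: int, bank: int, side: str):
        """Service one auxiliary read; return the data, or None on error."""
        offset = addr - self.base
        if offset < 0 or size <= 0 or offset + size > len(self.image):
            self.errors += 1
            return None
        data = int.from_bytes(self.image[offset:offset + size], "little")
        self.serviced[side] += 1
        self.per_bank[bank] += 1
        self.checksum = (self.checksum + data) & 0xFFFFFFFF
        return data

    @property
    def total(self) -> int:
        return sum(self.serviced.values())
```

Folding the returned data into a checksum makes the counters harder to satisfy by accident. Two runs that service the same number of reads but return different data would then disagree on the checksum.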

The serviced read count matches the model exactly:

| comparison | value |
| --- | --- |
| modeled extra grants | 20,630,053 |
| serviced auxiliary reads | 20,630,053 |
| match | 100.00% |
| modeled shell extra grants | 8,458,681 |
| serviced shell auxiliary reads | 8,458,681 |
| shell match | 100.00% |

Per-bank auxiliary service:

| bank | auxiliary reads serviced |
| --- | --- |
| 0 | 8,716,171 |
| 1 | 4,270,912 |
| 2 | 3,843,696 |
| 3 | 3,799,274 |
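The counters on this page can be cross-checked against one another. The short Python sketch below uses only the values reported above; the variable and function names are illustrative.

```python
# Values copied from the service, comparison, and per-bank tables.
instr_reads = 488_792
data_reads = 20_141_261
serviced_total = 20_630_053
modeled_total = 20_630_053
serviced_shell = 8_458_681
modeled_shell = 8_458_681
per_bank = {0: 8_716_171, 1: 4_270_912, 2: 3_843_696, 3: 3_799_274}


def check_accounting():
    """Return each consistency check by name with its pass/fail result."""
    return {
        "instruction + data == total": instr_reads + data_reads == serviced_total,
        "per-bank sum == total": sum(per_bank.values()) == serviced_total,
        "serviced == modeled": serviced_total == modeled_total,
        "shell serviced == shell modeled": serviced_shell == modeled_shell,
        "shell window within total": serviced_shell <= serviced_total,
    }


def bank_share(bank):
    """Percentage of all serviced auxiliary reads that landed on one bank."""
    return 100.0 * per_bank[bank] / serviced_total


if __name__ == "__main__":
    for name, ok in check_accounting().items():
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
    for bank in per_bank:
        print(f"bank {bank}: {bank_share(bank):.1f}%")
```

All five checks hold for the reported figures. Bank 0 takes roughly 42% of the auxiliary reads, about twice the share of any other bank.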

Memory Stalls

Memory stalls, P106 banked lower-contract workload: 58,912,691 stalls, 65,996,622 handshakes.

| # | source | stalls | share | requests |
| --- | --- | --- | --- | --- |
| 1 | instruction fetch | 27,458,054 | 46.6% | 46,933,482 |
| 2 | data load | 11,664,621 | 19.8% | 560,189 |
| 3 | data store | 10,933,918 | 18.6% | 77,398 |
| 4 | atomic memory op | 174,290 | 0.3% | 167,624 |
| 5 | page walk for fetch | 684,866 | 1.2% | 678,712 |
| 6 | page walk for load/store | 671,603 | 1.1% | 665,427 |
| 7 | other | 7,325,339 | 12.4% | 16,913,790 |

The main stall chart still reflects the single response consumed by the core. The new result lives in banked_lower_contract.

Shell Phases

Shell phases, P106 shell workload: 219,613,584 cycles, CPI 2.54.

| # | phase | cycles | share |
| --- | --- | --- | --- |
| 1 | kernel banner to /init | 116,715,634 | 53.3% |
| 2 | /init to shell banner | 1,069,252 | 0.5% |
| 3 | shell banner to first command | 35,642,554 | 16.3% |
| 4 | echo command | 1,649 | 0% |
| 5 | uname -a | 2,558,009 | 1.2% |
| 6 | ls /bin /usr/share | 32,115,978 | 14.7% |
| 7 | cat sample file | 2,713,112 | 1.2% |
| 8 | touch/write/cat/rm /tmp file | 10,261,206 | 4.7% |
| 9 | 8x ash loop with file I/O | 17,907,443 | 8.2% |
| 10 | final marker | 680 | 0% |

The shell script reaches P106-FILE-OK.

Cycle Shape

State breakdown, P106 banked lower-contract workload: 219,613,584 cycles, CPI 2.54.

| # | state | share | cycles |
| --- | --- | --- | --- |
| 1 | fetch | 3.7% | 8,131,172 |
| 2 | execute | 39.4% | 86,503,337 |
| 3 | mem | 12.8% | 28,069,359 |
| 4 | walker | 1.2% | 2,700,608 |
| 5 | writeback | 39.4% | 86,478,207 |
| 6 | mul/div | 3.5% | 7,729,185 |

P106 retires 86.48M instructions at CPI 2.5395.

Hot Functions

Hot functions, P106 BusyBox shell symbols: 64,022 samples, one every 1,024 cycles.

| # | symbol | image | share | samples |
| --- | --- | --- | --- | --- |
| 1 | printf_core | busybox | 5.7% | 3,676 |
| 2 | memset | kernel | 5.2% | 3,308 |
| 3 | memcpy | busybox | 3.6% | 2,288 |
| 4 | vruntime_eligible | kernel | 3.4% | 2,195 |
| 5 | blake2s_compress_generic | kernel | 2.8% | 1,796 |
| 6 | __fwritex | busybox | 2.7% | 1,723 |
| 7 | memcpy | kernel | 2.5% | 1,620 |
| 8 | handle_exception | kernel | 1.8% | 1,148 |
| 9 | unmap_page_range | kernel | 1.6% | 1,001 |
| 10 | avg_vruntime | kernel | 1.4% | 872 |
| 11 | n_tty_write | kernel | 1.4% | 867 |
| 12 | memset | busybox | 1.3% | 825 |
| 13 | ret_from_exception | kernel | 1.3% | 804 |
| 14 | n_tty_read | kernel | 1.1% | 681 |
| 15 | next_uptodate_folio | kernel | 1% | 669 |
| 16 | (remaining) | | 55.1% | 35,264 |
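The sample count and period can be related back to the cycle counts with a little arithmetic. The Python sketch below does this with figures from this page. It assumes the profile spans the shell window, which the page does not say outright; the assumption is an inference from how closely the numbers line up.

```python
samples = 64_022                   # total samples in the hot-function profile
period_cycles = 1_024              # one sample every 1,024 cycles
shell_window_cycles = 65_558_077   # P106 shell window from the timing table

# Cycles spanned by the profile if every sample stands for one full period.
covered_cycles = samples * period_cycles
gap = covered_cycles - shell_window_cycles


def share(sample_count):
    """Percentage of all samples attributed to one symbol."""
    return 100.0 * sample_count / samples


if __name__ == "__main__":
    print(f"covered cycles: {covered_cycles:,} (gap to shell window: {gap:+,})")
    print(f"printf_core: {share(3_676):.1f}%")
    print(f"(remaining): {share(35_264):.1f}%")
```

The covered span differs from the shell window by less than one sampling period, which fits the profile having been taken over that window.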

The software workload is unchanged. This page is about the hardware boundary and serviced auxiliary-memory work.

Next

P107 should consume the auxiliary response for one narrow, low-risk client first. Background D-cache fill or prefetch traffic is a better first target than demand fetch/load, because a wrong demand response path can corrupt architectural execution immediately.