journal 2026-05-05

P92: fetch queue

P92 added a one-entry fetch queue to the Linux-capable core. The queue is deliberately conservative: safe word-aligned next_pc fetches can launch from S_EXECUTE, then S_WB consumes the queued instruction if the PC still matches.
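
A minimal behavioral sketch of the queue described above, in Python rather than the core's own HDL. All names here (FetchQueue, try_fill, try_consume, read_word, the safe flag) are hypothetical and do not come from the core. Memory latency is not modeled, and dropping a mismatching entry is an assumption about the policy, not something the counters confirm.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FetchQueue:
    """One-entry fetch queue: holds at most one prefetched (pc, instr) pair."""
    valid: bool = False
    pc: int = 0
    instr: int = 0
    fills: int = 0
    consumes: int = 0

    def try_fill(self, next_pc: int, safe: bool,
                 read_word: Callable[[int], int]) -> bool:
        """Execute-state side: launch a prefetch only when it is safe,
        word-aligned, and the single entry is free."""
        if self.valid or not safe or next_pc % 4 != 0:
            return False
        self.pc, self.instr, self.valid = next_pc, read_word(next_pc), True
        self.fills += 1
        return True

    def try_consume(self, pc: int) -> Optional[int]:
        """Writeback-state side: use the queued instruction only if its PC
        still matches the PC the core actually wants next."""
        if not self.valid:
            return None
        self.valid = False  # assumption: a mismatching entry is discarded
        if self.pc != pc:
            return None
        self.consumes += 1
        return self.instr
```

A caller would invoke try_fill once per execute step with the predicted sequential next_pc, then try_consume at writeback with the resolved PC, falling back to a normal fetch when it returns None.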

The shell smoke passed:

P92 direct UART console + memory attribution smoke PASS

The result was mixed. Queue counters prove the mechanism is active:

counter                       value
queue valid cycles       53,982,463
queue fills              53,982,463
queue consumes           53,982,463
execute-prefetch cycles  53,982,463

But the shell window is slower than P91:

metric                       P91          P92
post-load cycles     221,327,811  222,624,131
shell window cycles   65,985,297   67,206,635
fetch stall cycles    55,533,555   23,555,005
I-cache hits           8,855,599   42,665,352

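So the queue is not the wrong idea, but this one-entry implementation is not enough. Fetch stall cycles drop from 55,533,555 to 23,555,005 (about 58% fewer), yet the shell window is 1,221,338 cycles (about 1.9%) slower. The queue cuts fetch-class stalls, but the machine is still tangled around a single memory path and has no branch prediction.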
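
The P91 to P92 deltas can be checked with a few lines of Python. This is only a sketch: the numbers are copied from the metric table above, and the helper name delta is hypothetical.

```python
# (P91, P92) values copied from the metric table in this entry.
metrics = {
    "post-load cycles": (221_327_811, 222_624_131),
    "shell window cycles": (65_985_297, 67_206_635),
    "fetch stall cycles": (55_533_555, 23_555_005),
    "I-cache hits": (8_855_599, 42_665_352),
}


def delta(p91: int, p92: int) -> tuple[int, float]:
    """Return (absolute change, percent change relative to P91)."""
    diff = p92 - p91
    return diff, 100.0 * diff / p91


for name, (p91, p92) in metrics.items():
    diff, pct = delta(p91, p92)
    print(f"{name:<20} {diff:>+14,} {pct:>+8.1f}%")
```

The fetch stall row comes out at roughly -57.6% while the shell window row is about +1.9%, which is the mixed result in numbers: far fewer fetch stalls, slightly more total cycles.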

Next: P93 branch predictor v0.