P92: fetch queue · journal

P92 added a one-entry fetch queue to the Linux-capable core. The queue is deliberately conservative: a fetch of next_pc can launch from S_EXECUTE only when it is safe and word-aligned, and S_WB consumes the queued instruction only if the PC still matches.
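
The fill/consume rule above can be sketched as a small behavioral model. This is not the core's RTL, only an illustration: the class name, the method names, the dict-based instruction memory, and the choice to drop the entry on a PC mismatch are all assumptions. Only S_EXECUTE, S_WB and next_pc come from the design.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FetchQueue:
    """Hypothetical model of a one-entry fetch queue."""
    valid: bool = False
    pc: int = 0
    instr: int = 0
    fills: int = 0
    consumes: int = 0

    def try_fill(self, next_pc: int, imem: Dict[int, int]) -> bool:
        """S_EXECUTE side: launch a fetch of next_pc only if it is safe."""
        # Assumption: "safe" means the slot is free, the address is
        # word-aligned, and the address is backed by instruction memory.
        if self.valid or next_pc % 4 != 0 or next_pc not in imem:
            return False
        self.valid, self.pc, self.instr = True, next_pc, imem[next_pc]
        self.fills += 1
        return True

    def try_consume(self, pc: int) -> Optional[int]:
        """S_WB side: use the queued instruction only if the PC still matches."""
        if not self.valid:
            return None
        self.valid = False  # assumption: the entry is dropped on hit or mismatch
        if self.pc != pc:
            return None  # e.g. a taken branch redirected the PC
        self.consumes += 1
        return self.instr


# Usage: one sequential hit, then one redirect that discards the entry.
imem = {0x100: 0x00000013, 0x104: 0x00100093}
q = FetchQueue()
q.try_fill(0x104, imem)
hit = q.try_consume(0x104)   # PC matches, queued instruction is used
q.try_fill(0x104, imem)
miss = q.try_consume(0x200)  # PC changed, queued instruction is discarded
```

With a single entry, the queue can hide at most one fetch per instruction, and a mismatch falls back to the normal fetch path.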

The shell smoke test passed:

P92 direct UART console + memory attribution smoke PASS

The result was mixed. Queue counters show the mechanism is active, and every fill was matched by a consume:

counter	value
queue valid cycles	53,982,463
queue fills	53,982,463
queue consumes	53,982,463
execute-prefetch cycles	53,982,463

But the shell window is slower than P91:

metric	P91	P92
post-load cycles	221,327,811	222,624,131
shell window cycles	65,985,297	67,206,635
fetch stall cycles	55,533,555	23,555,005
I-cache hits	8,855,599	42,665,352
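
To make the comparison easier to read, the deltas can be computed directly from the table above. This is a throwaway sketch: the dictionary layout and the helper name pct_change are illustrative, and the numbers are copied from the table.

```python
# Counter values copied from the P91/P92 table.
p91 = {
    "post-load cycles": 221_327_811,
    "shell window cycles": 65_985_297,
    "fetch stall cycles": 55_533_555,
    "I-cache hits": 8_855_599,
}
p92 = {
    "post-load cycles": 222_624_131,
    "shell window cycles": 67_206_635,
    "fetch stall cycles": 23_555_005,
    "I-cache hits": 42_665_352,
}


def pct_change(old: int, new: int) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for name in p91:
    delta = p92[name] - p91[name]
    print(f"{name:20s} {delta:+14,d} {pct_change(p91[name], p92[name]):+8.2f}%")
```

The shell window grows by 1,221,338 cycles (about 1.9%) even though fetch stall cycles fall by 31,978,550 (about 58%).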

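So the queue is not the wrong idea, but this one-entry implementation is not enough. It cuts fetch stall cycles by about 58% (55,533,555 to 23,555,005), yet the shell window still grows by about 1.9% (1,221,338 cycles). The machine remains too tangled around one memory path and has no branch prediction.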

Next: P93 branch predictor v0.