journal 2026-05-03

P44 - the chip renders an animation

p44freertosframebufferdemo

You asked: “if the SPI display demo runs at chip-pixel-rate, and the PC-window version also runs at chip-pixel-rate, isn’t the PC version exactly as fast?”

Right. The cheapest-and-coolest first version isn’t a real LCD at all. It’s a memory-mapped framebuffer, simulator-side dump, pygame window. P44 lands that.

The bridge

Three small things on top of P43:

  1. RTL: one new MMIO register (MMIO_FRAME_READY at 0x10001ffc), same shape as P43’s halt port. Software writes any value to mark a frame done; the chip pulses frame_strobe for one cycle.
  2. Testbench: on every frame_strobe, peek into the external memory model at byte address 0x00030000 and write 18432 bytes (96×96 RGB565) to frames/frame_NNNN.bin.
  3. Software: a FreeRTOS render task computes the plasma into the framebuffer region, writes the frame number to MMIO_FRAME_READY, and repeats. After 4 frames it writes HALT_PASS to MMIO_HALT.
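
A rough Python model of step 2, just to pin down the byte counts and file naming. The real testbench language isn't stated here, so treat this as a sketch: the flat bytearray standing in for the external memory model, and the names dump_frame, mem and out_dir, are all made up for illustration. Only the address, the frame size and the frame_NNNN.bin naming come from the list above.

```python
import os
import tempfile

FB_BASE = 0x00030000           # framebuffer byte address (from the text)
FB_BYTES = 96 * 96 * 2         # 96x96 RGB565, 2 bytes per pixel = 18432

def dump_frame(mem, seq, out_dir="frames"):
    """Copy one frame out of the memory model into out_dir/frame_NNNN.bin."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"frame_{seq:04d}.bin")
    with open(path, "wb") as f:
        f.write(mem[FB_BASE:FB_BASE + FB_BYTES])
    return path

# Hypothetical stand-in for the external memory model: a flat bytearray.
mem = bytearray(FB_BASE + FB_BYTES)
mem[FB_BASE] = 0xAB            # pretend the chip wrote one pixel byte

# Pretend frame_strobe fired four times, as in the P44 run.
out_dir = tempfile.mkdtemp()
paths = [dump_frame(mem, seq, out_dir) for seq in range(4)]
```

The point of keeping the dump on the simulator side is that the chip never has to know a file exists: the strobe is the only handshake.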

The chip itself doesn’t know there’s a “screen.” It just writes pixels to memory and ticks a counter.

The plasma

Cheap demoscene formula, no math library:

v = tri_wave(x*2 + t*3) + tri_wave(y*2 + t*4) + tri_wave((x+y) + t*2);
r = 128 + (v >> 1);
g = 128 - (v >> 2);
b = (x ^ y ^ t);
fb[y][x] = rgb565(r, g, b);

tri_wave is an 8-bit triangle approximation of |sin|. The whole thing is bit-shifts and adds - fits the multi-cycle FSM’s compute budget at ~13 FPS for 96×96.
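
For poking at the formula off-chip, here is a host-side Python model of it. This is a sketch under assumptions, not the firmware: the exact tri_wave shape, the 8-bit wrap on each channel, and the rgb565 packing below are my guesses at reasonable definitions (the journal only gives the five-line formula). The little-endian byte order matches what the viewer expects.

```python
W, H = 96, 96

def tri_wave(p):
    """Hypothetical 8-bit triangle: ramps 0..254 up, then back down, period 256."""
    p &= 0xFF
    return (p << 1) if p < 128 else ((255 - p) << 1)

def rgb565(r, g, b):
    """Pack 8-bit channels into RGB565; channels wrap to 8 bits (assumption)."""
    r, g, b = r & 0xFF, g & 0xFF, b & 0xFF
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def render_frame(t):
    """Return one 96x96 frame as little-endian RGB565 bytes."""
    fb = bytearray()
    for y in range(H):
        for x in range(W):
            v = tri_wave(x * 2 + t * 3) + tri_wave(y * 2 + t * 4) + tri_wave((x + y) + t * 2)
            r = 128 + (v >> 1)
            g = 128 - (v >> 2)
            b = x ^ y ^ t
            fb += rgb565(r, g, b).to_bytes(2, "little")
    return bytes(fb)
```

Note that v can reach roughly 3×255, so r and g leave the 0..255 range before packing; in this model they simply wrap, which is what a uint8_t would do in C. Whether the firmware wraps or clamps is not stated here.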

The viewer

app/viewer.py is a small pygame app that decodes RGB565 little-endian frames, scales 4× (so 384×384 window), plays back at 2 FPS in a loop. pip install pygame is the one external dependency.
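
The decode step is the only non-obvious part of the viewer, so here is a minimal stdlib-only sketch of it. This is not the code in app/viewer.py; decode_rgb565_le is a hypothetical helper, and the bit-replication used to widen 5- and 6-bit channels to 8 bits is one common choice, not necessarily the one the viewer makes.

```python
def decode_rgb565_le(data):
    """Convert little-endian RGB565 bytes to packed RGB888 bytes (3 per pixel)."""
    if len(data) % 2:
        raise ValueError("RGB565 data must be an even number of bytes")
    out = bytearray()
    for i in range(0, len(data), 2):
        px = data[i] | (data[i + 1] << 8)   # low byte first: little-endian
        r5 = (px >> 11) & 0x1F
        g6 = (px >> 5) & 0x3F
        b5 = px & 0x1F
        # Replicate the top bits into the low bits so full scale maps to 255.
        out.append((r5 << 3) | (r5 >> 2))
        out.append((g6 << 2) | (g6 >> 4))
        out.append((b5 << 3) | (b5 >> 2))
    return bytes(out)

# One frame file is 96*96*2 = 18432 bytes in, 96*96*3 = 27648 bytes out.
rgb = decode_rgb565_le(bytes(18432))
```

The RGB888 buffer can then be handed to pygame for the 4× scale and display.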

Result

FRAME: P44 dumped seq=0 to frames/frame_0000.bin (18432 bytes)
FRAME: P44 dumped seq=1 to frames/frame_0001.bin (18432 bytes)
FRAME: P44 dumped seq=2 to frames/frame_0002.bin (18432 bytes)
FRAME: P44 dumped seq=3 to frames/frame_0003.bin (18432 bytes)
HALT: P44 cleanly halted at 7526005 clocks (frames=4)
PASS: P44 framebuffer demo complete.

7.5M cycles total = ~1.88M cycles per frame. Sanity-check: 9216 pixels × ~10 instructions per pixel × ~10 CPI = 922K cycles minimum, plus the MMIO_FRAME_READY write and the FreeRTOS context-switch overhead. ~1.88M is about 2× that floor, so it's right in the ballpark.
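
The same arithmetic, spelled out so the numbers can be rechecked against the log. The ~10 instructions per pixel and ~10 CPI are the rough estimates from the paragraph above, not measurements. The implied clock at the end is my own inference: it assumes the ~13 FPS figure quoted earlier refers to this same per-frame cycle count, which the journal doesn't state.

```python
TOTAL_CLOCKS = 7_526_005        # from the HALT line in the log
FRAMES = 4
PIXELS = 96 * 96                # 9216

cycles_per_frame = TOTAL_CLOCKS / FRAMES          # about 1.88M
cycles_per_pixel = cycles_per_frame / PIXELS      # about 204

# Back-of-envelope floor: ~10 instructions/pixel at ~10 cycles each.
floor_cycles = PIXELS * 10 * 10                   # 921,600
ratio_to_floor = cycles_per_frame / floor_cycles  # about 2x

# Inference only: clock needed to hit ~13 FPS at this cost per frame.
implied_clock_hz = cycles_per_frame * 13          # about 24.5 MHz

print(f"{cycles_per_frame:,.0f} cycles/frame, {cycles_per_pixel:.0f} cycles/pixel")
print(f"{ratio_to_floor:.2f}x the floor, ~{implied_clock_hz / 1e6:.1f} MHz for 13 FPS")
```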

The frames look right. Frame 0 and frame 3 have visibly different plasma patterns - the time variable is doing what it’s supposed to do.

Why this matters past being pretty

P43 closed the FreeRTOS arc but the result was a printout - “5.1M clocks, halt_code=1.” That’s correct, that’s honest, but it’s not the thing you point at and go “look what the chip does.”

P44 produces a window. The chip is visibly running. Every pixel on the screen came out of an instruction the multi-cycle FSM core executed. Every frame is the result of a FreeRTOS task running on top of the trap frame and timer interrupt we built in P40-P43. The same chip we hardened to a real sky130A GDS in P43.

If we ever fab this thing on a shuttle, the same software runs and the same frames come out - just to a real LCD via a real SPI peripheral. P44 is the simulator-flavoured version of that.

What’s next

Real SPI peripheral as a future rung, then either drive an ST7789 directly (chip-on-screen) or tap the SPI to a USB bridge for chip-on-PC at full speed. Or just keep going on the Linux climb - the framebuffer demo is reusable when we eventually have a kernel that knows about a graphics driver.