No. 70 of 147 projects on the ladder

F/D FPU subset, libc revival, and AtomVM ping-pong

introduces: hard-float newlib compatibility; F/D load-store and fmv bit-motion paths; small D-FPU subset for printf/dtoa; libc init in reset code; AtomVM stdlib and process ping-pong on real newlib

harden state: last run 2026-05-05
signoff
  • DRC: NOT RUN
  • LVS: NOT RUN
  • antenna: NOT RUN

P68 proved AtomVM could run on the chip, but it also exposed a toolchain wall: the available riscv32-none-elf newlib is built as rv32imafdc/ilp32d. That means ordinary library code may contain compressed instructions and hard-float ABI register spills even when the application never uses floating point.

P68 solved that with workarounds: direct UART output instead of real stdio, and a local bump allocator instead of newlib malloc. P70 takes the more useful route: make the chip tolerate the library shape the toolchain actually emits.

Headline: real newlib puts, printf, snprintf, malloc, free, realloc, and %f formatting now run on the chip in Verilator. The same P70 runtime also runs AtomVM with bundled lib.avm, io:format/2, and a 32-round-trip Erlang process ping-pong demo.

P70 phase 3+4 libc/FPU revival
  printf-int: 1 22 333
  printf-hex: 0xdeadbeef
  printf-str: hello from real libc
  printf-double: 3.125
  snprintf into malloc'd buf at 0xe548
  malloc(16)=0xe548 malloc(128)=0xe560 malloc(1024)=0xe5e8
  realloc round-trip: abcdefgXYZ

P70 phase 3+4 PASS - newlib stdio + malloc + FPU work

What changed in RTL

P70 now has a pragmatic, small floating-point unit. The first half is the bit-motion subset needed by hard-float ABI libraries:

  • FMV.W.X and FMV.X.W
  • FLW / FSW
  • FLD / FSD
  • C.FLW, C.FSW, C.FLD, C.FSD
  • C.FLWSP, C.FSWSP, C.FLDSP, C.FSDSP
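
As a rough illustration of what the decoder has to recognize, here is a small Python sketch that classifies the FP load/store encodings from the list above. The opcode and funct3 values follow the RISC-V base and C-extension encodings for RV32; the function and table names are hypothetical and are not taken from the P70 RTL. FMV.W.X and FMV.X.W are left out because they are register moves, not memory operations.

```python
# Hypothetical decode helper: names are illustrative, not the P70 RTL.

# 32-bit encodings: (opcode, funct3) -> mnemonic
FP_MEM_32 = {
    (0x07, 0b010): "FLW",   # LOAD-FP, word
    (0x07, 0b011): "FLD",   # LOAD-FP, double
    (0x27, 0b010): "FSW",   # STORE-FP, word
    (0x27, 0b011): "FSD",   # STORE-FP, double
}

# 16-bit RV32 encodings: (quadrant, funct3) -> mnemonic
FP_MEM_16 = {
    (0b00, 0b001): "C.FLD",
    (0b00, 0b011): "C.FLW",
    (0b00, 0b101): "C.FSD",
    (0b00, 0b111): "C.FSW",
    (0b10, 0b001): "C.FLDSP",
    (0b10, 0b011): "C.FLWSP",
    (0b10, 0b101): "C.FSDSP",
    (0b10, 0b111): "C.FSWSP",
}

def classify_fp_mem(insn):
    """Return the FP load/store mnemonic for an RV32 instruction, or None."""
    quadrant = insn & 0b11
    if quadrant != 0b11:
        # Compressed instruction: funct3 lives in bits 15:13.
        funct3 = (insn >> 13) & 0b111
        return FP_MEM_16.get((quadrant, funct3))
    # Full-width instruction: opcode in bits 6:0, funct3 in bits 14:12.
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0b111
    return FP_MEM_32.get((opcode, funct3))
```

For example, `classify_fp_mem(0x00813507)` identifies `fld fa0, 8(sp)`, which is the kind of hard-float spill reload that can appear in library code even when the application never touches floating point.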

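FLD and FSD each take two 32-bit memory transactions, sequenced through S_FLD_HI and S_FSD_HI. FLW NaN-boxes its result: the loaded value fills the low 32 bits of the 64-bit FP register and the upper 32 bits are set to all ones.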
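
A minimal behavioral sketch of those two rules, in Python. It assumes a word-addressed dictionary as memory and little-endian word order (low word at the lower address); the order in which the RTL issues the two beats is an assumption here, and none of these names come from the P70 source.

```python
import struct

MASK32 = 0xFFFFFFFF

def nan_box(word32):
    """FLW result: 32-bit payload in the low half, upper 32 bits all ones."""
    return (MASK32 << 32) | (word32 & MASK32)

def fsd(mem, addr, value64):
    """Model FSD as two 32-bit writes (assumed order: low word, then high)."""
    mem[addr] = value64 & MASK32
    mem[addr + 4] = (value64 >> 32) & MASK32

def fld(mem, addr):
    """Model FLD as two 32-bit reads and reassemble the 64-bit value."""
    lo = mem[addr]
    hi = mem[addr + 4]
    return (hi << 32) | lo

# Usage: round-trip the bit pattern of 3.125 through the two-beat model.
bits = struct.unpack("<Q", struct.pack("<d", 3.125))[0]
mem = {}
fsd(mem, 0x100, bits)
assert fld(mem, 0x100) == bits
```

The point of the model is only that no arithmetic is involved: these paths move bits, which is why they could land before any real FPU datapath existed.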

The second half is the double-precision subset that full newlib’s printf/snprintf formatting path actually reaches:

  • FADD.D, FSUB.D, FMUL.D, FDIV.D
  • FMADD.D, FMSUB.D, FNMSUB.D, FNMADD.D
  • FCVT.D.W, FCVT.D.WU, FCVT.W.D, FCVT.WU.D
  • FEQ.D, FLT.D, FLE.D, FCLASS.D

It is not a full IEEE-754 implementation. Subnormals are flushed to zero, exception flags are not implemented, and the rest of F/D still needs a real compliance pass before we call it architectural.
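
To make the subnormal caveat concrete, here is a hedged Python sketch of flush-to-zero on a double's bit pattern. It assumes the sign is preserved when flushing; whether the P70 FPU flushes inputs, results, or both is not stated above, so treat this as an illustration of the idea and not as a description of the RTL.

```python
import struct

SIGN_MASK = 0x8000000000000000
EXP_MASK = 0x7FF0000000000000
FRAC_MASK = 0x000FFFFFFFFFFFFF

def to_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE-754 pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(bits64):
    """Reinterpret a 64-bit pattern as a Python float."""
    return struct.unpack("<d", struct.pack("<Q", bits64))[0]

def flush_subnormal(bits64):
    """Replace a subnormal double with a zero of the same sign (assumed)."""
    is_subnormal = (bits64 & EXP_MASK) == 0 and (bits64 & FRAC_MASK) != 0
    if is_subnormal:
        return bits64 & SIGN_MASK
    return bits64
```

Normal values such as 3.125 pass through unchanged, which is why ordinary %f formatting can work while results near the bottom of the exponent range would differ from a compliant FPU.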

What changed in runtime

The reset path now does the normal bare-metal libc ceremony:

  • set gp from __global_pointer$
  • clear .bss
  • install a diagnostic trap handler
  • call __libc_init_array
  • call main
  • call __libc_fini_array if main returns
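
The ordering in that list can be modeled in a few lines of Python. This is only a sketch of the sequence, not the reset code itself: the real thing is startup code using linker-provided symbols, and newlib's __libc_init_array may also call an _init hook in some configurations, which is omitted here. All names below are hypothetical.

```python
def run_reset(bss, preinit_array, init_array, fini_array, main):
    """Model the reset sequence; returns (main's result, trace of steps)."""
    trace = ["set gp"]                 # gp <- __global_pointer$
    bss[:] = bytes(len(bss))           # zero every byte of .bss
    trace.append("clear .bss")
    trace.append("install trap handler")
    # __libc_init_array: preinit entries first, then constructors, in order.
    for ctor in preinit_array:
        ctor(trace)
    for ctor in init_array:
        ctor(trace)
    trace.append("main")
    result = main()
    # __libc_fini_array: destructors run in reverse order.
    for dtor in reversed(fini_array):
        dtor(trace)
    return result, trace
```

The reason the order matters is that stdio and the allocator expect zeroed statics and completed constructors before main makes its first library call.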

The linker script keeps .preinit_array, .init_array, and .fini_array, and the syscall shim lets newlib’s allocator use _sbrk. The old P68 malloc/free/calloc/realloc overrides are gone for this smoke path.
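
For readers unfamiliar with the _sbrk contract, a toy Python model follows. It assumes the usual bare-metal convention: the heap grows upward between two bounds, each call returns the previous break, and a request that would pass the limit fails with -1. The bounds and class name are made up for the example; the actual shim and its linker symbols are not shown in this write-up.

```python
class Heap:
    """Toy model of an _sbrk-style program break (bounds are hypothetical)."""

    def __init__(self, heap_start, heap_limit):
        self.brk = heap_start
        self.limit = heap_limit

    def sbrk(self, incr):
        """Return the previous break and advance it, or -1 on exhaustion."""
        prev = self.brk
        if prev + incr > self.limit:
            return -1          # a real shim would also set errno to ENOMEM
        self.brk = prev + incr
        return prev
```

newlib's malloc carves its chunks out of whatever this call hands back, so once the shim exists the allocator itself needs no chip-specific code.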

AtomVM on the fixed base

The final P70 step is deliberately not a new VM feature. It is a regression check against the problems P68 had to dodge. P70 builds a separate atomvm_main.c entry point that uses real puts, printf, setvbuf, malloc, and free, then links libAtomVM with floating point enabled against the same hard-float newlib shape.

The boot blob still uses the P68 memory layout: stage-0 code at zero, lib.avm at 8 MiB, and the application main.avm at 12 MiB. The host packbeam tool and the host-built atomvmlib.avm come from the pinned P68 AtomVM checkout, while the chip image is P70-specific. That library pack includes init.beam and the Erlang stdlib modules, so the run gets past the old hello-only wall.
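
The layout is simple enough to sketch. The Python below places three byte strings at the offsets named above (0, 8 MiB, 12 MiB). Zero padding between regions and the overlap checks are assumptions of this sketch; it is not the project's packaging script, and `pack_boot_blob` is a hypothetical name.

```python
MIB = 1024 * 1024
LIB_AVM_OFFSET = 8 * MIB     # lib.avm
MAIN_AVM_OFFSET = 12 * MIB   # application main.avm

def pack_boot_blob(stage0, lib_avm, main_avm):
    """Place stage-0 code, lib.avm, and main.avm at their fixed offsets."""
    if len(stage0) > LIB_AVM_OFFSET:
        raise ValueError("stage-0 code overlaps lib.avm")
    if len(lib_avm) > MAIN_AVM_OFFSET - LIB_AVM_OFFSET:
        raise ValueError("lib.avm overlaps main.avm")
    # Gaps between regions are zero-filled (assumption).
    blob = bytearray(MAIN_AVM_OFFSET + len(main_avm))
    blob[0:len(stage0)] = stage0
    blob[LIB_AVM_OFFSET:LIB_AVM_OFFSET + len(lib_avm)] = lib_avm
    blob[MAIN_AVM_OFFSET:] = main_avm
    return bytes(blob)
```

With this layout the library pack has a 4 MiB window before it would collide with the application pack.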

Captured result:

AtomVM on P70 (real newlib + FPU subset)
Starting AtomVM revision 0.8.0-dev+git.5e47d64
Found startup beam: pingpong.beam
pingpong starting
pingpong done: 32 round trips
Return value: ok
AtomVM exited result=0

That is the stronger PASS: AtomVM loads the bundled stdlib pack, runs pingpong.beam, spawns a process, performs 32 message round trips, prints with io:format/2, returns ok, and halts with PASS.
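
For anyone who has not seen the pattern, here is the shape of a ping-pong exchange, sketched in Python with a thread and two queues standing in for Erlang processes and mailboxes. This is not the pingpong.erl source, which is not reproduced here; it only illustrates what one round trip means.

```python
import queue
import threading

def run_pingpong(rounds):
    """Send `rounds` pings to a worker and count the matching pongs."""
    to_ponger = queue.Queue()   # stand-in for the spawned process's mailbox
    to_pinger = queue.Queue()   # stand-in for the parent's mailbox

    def ponger():
        while True:
            msg = to_ponger.get()
            if msg == "stop":
                return
            to_pinger.put(("pong", msg[1]))

    worker = threading.Thread(target=ponger)
    worker.start()
    completed = 0
    for n in range(rounds):
        to_ponger.put(("ping", n))
        reply = to_pinger.get()          # block until the reply arrives
        if reply == ("pong", n):
            completed += 1
    to_ponger.put("stop")
    worker.join()
    return completed
```

Calling `run_pingpong(32)` mirrors the demo's count: each round is one message out and one reply back.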

Verification

cd projects/70_f_libc/test
make -f Makefile.smoke clean all
make -f Makefile.smoke run
make all
make verilator-run

Captured smoke harness result:

[harness] run ended after 60579 post-load cycles
[harness] halted=1, halt_code=0x00000001

AtomVM harness result:

[harness] === run ended after 9193286 post-load cycles ===
[harness] halted=1, halt_code=0x00000001
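
If you want to check these harness lines in a script, a small parser is enough. The Python sketch below handles both line shapes shown above. Treating halted=1 with halt_code=0x00000001 as the passing condition is an assumption inferred from the two captured runs, not a documented harness contract.

```python
import re

RUN_ENDED = re.compile(r"\[harness\].*run ended after (\d+) post-load cycles")
HALT = re.compile(r"\[harness\] halted=(\d+), halt_code=(0x[0-9a-fA-F]+)")

def parse_harness(log):
    """Extract cycle count and halt state from harness output."""
    result = {"cycles": None, "halted": None, "halt_code": None}
    for line in log.splitlines():
        m = RUN_ENDED.search(line)
        if m:
            result["cycles"] = int(m.group(1))
        m = HALT.search(line)
        if m:
            result["halted"] = int(m.group(1))
            result["halt_code"] = int(m.group(2), 16)
    return result

def looks_like_pass(result):
    """Assumed pass condition: the core halted with halt code 1."""
    return result["halted"] == 1 and result["halt_code"] == 1
```

Feeding it the two captures above gives 60579 and 9193286 cycles respectively, both with the same halt state.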

Build warning: the linker reports one RWX load segment. That is the same bare-metal linker-script warning seen on earlier rungs: text, data, and bss all live in one RAM segment and there is no MMU permission split in this smoke binary.

What is not proven

RISC-V F/D compliance tests were NOT RUN. This is not a claim of F or D architectural compliance. Unsupported: sqrt, sign injection, min/max, most single-precision arithmetic, exception flags, CSR floating-point state, subnormal precision, and the rest of the architectural edge cases.

The useful claim is narrower and much more practical: the hard-float newlib we already have can now run reachable stdio, malloc, %f formatting, and a bundled-stdlib AtomVM process demo on this chip without tripping over FP register spills, missing libc initialization, or the D arithmetic in _dtoa_r.

What just happened?

The chip stopped fighting the toolchain. Instead of replacing libc piece by piece, P70 added the bit-motion paths, the small D-FPU subset, and runtime initialization that let ordinary newlib code work. Then the P68 AtomVM smoke was rerun on top of that base, with the libc workarounds removed and lib.avm bundled, to prove the VM gets past the old blockers and runs real Erlang process traffic.