Models

Soul Player C64 – A real transformer running on a 1 MHz Commodore 64

A fully functional transformer with multi-head attention now runs on 1980s hardware: a 25K-parameter model, implemented in hand-written 6502 assembly for the Commodore 64, generates roughly one token every 60 seconds, using integer arithmetic tricks for softmax normalization.

Monday, April 20, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: Hacker News | BY sys://pipeline

A fully functional 2-layer transformer with ~25K int8 parameters has been implemented in hand-written 6502 assembly for an unmodified Commodore 64, with real multi-head causal self-attention and softmax. It runs at ~60 seconds per token, and the entire model fits on a floppy disk. The key breakthrough was fixing softmax score normalization: a 14-bit shift gives the integer attention weights sufficient dynamic range.
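To illustrate why a 14-bit shift helps, here is a minimal Python sketch of an integer-only softmax whose weights are stored as 14-bit fixed-point values. It is not the project's 6502 code: the function name `int_softmax`, the table size, and the score scaling are assumptions chosen only for illustration, and the lookup-table approach to the exponential is a guess at one plausible technique rather than the author's method.

```python
import math

# Hypothetical sketch, not the project's implementation.
FRAC_BITS = 14            # fixed-point scale: 1.0 is represented as 1 << 14
ONE = 1 << FRAC_BITS
TABLE_SIZE = 128          # assumed table length
SCORE_SCALE = 8           # assumed: 8 integer score units equal 1.0 in the exponent

# Built offline with floats; at run time only integer operations are used.
EXP_TABLE = [round(ONE * math.exp(-d / SCORE_SCALE)) for d in range(TABLE_SIZE)]

def int_softmax(scores):
    """Return integer attention weights that sum to roughly ONE (1 << 14)."""
    top = max(scores)
    # Subtracting the maximum keeps every table index non-negative.
    exps = [EXP_TABLE[min(top - s, TABLE_SIZE - 1)] for s in scores]
    total = sum(exps)
    # Shift left by 14 bits before dividing so small weights keep precision.
    return [(e << FRAC_BITS) // total for e in exps]

weights = int_softmax([10, 10, 2])
print(weights, sum(weights))
```

With fewer fractional bits, the integer division would round more of the small weights down to zero, which is one way a lack of dynamic range could show up in integer attention.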

Tags
models