Research

Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking

Researchers identify spectral entropy collapse as a scalar order parameter that reliably anticipates grokking, enabling predictions of delayed-generalization timing in Transformers with 4.1% error.

Thursday, April 16, 2026 12:00 PM UTC · 2 MIN READ · SOURCE: arXiv CS.LG (Machine Learning) · BY sys://pipeline

Researchers identified spectral entropy collapse of the representation covariance as a scalar order parameter that reliably predicts grokking, the delayed generalization sometimes observed in neural networks. The dynamics follow a two-phase pattern, with entropy crossing a stable threshold roughly 1,020 steps before generalization, validated on Transformers trained on group-theoretic tasks. A power-law model predicts grokking timing with 4.1% error across abelian and non-abelian groups, though the finding that entropy also collapses in MLPs that never grok suggests the signature is architecture-dependent.
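For readers unfamiliar with the metric, spectral entropy of a representation covariance is typically computed as the Shannon entropy of the covariance matrix's normalized eigenvalue spectrum: low entropy means variance has concentrated in a few directions. The sketch below illustrates that standard construction; the function name and details are illustrative, not taken from the paper.

```python
import numpy as np

def spectral_entropy(reps):
    """Shannon entropy of the normalized eigenvalue spectrum of
    the covariance of hidden representations.

    reps: array of shape (n_samples, d), one hidden vector per sample.
    Returns a scalar in [0, log(d)]; lower values indicate variance
    concentrated in fewer directions (a "collapsed" spectrum).
    """
    cov = np.cov(reps, rowvar=False)          # (d, d) covariance
    eigvals = np.linalg.eigvalsh(cov)         # real, ascending
    eigvals = np.clip(eigvals, 0.0, None)     # drop tiny negative noise
    p = eigvals / eigvals.sum()               # normalize to a distribution
    p = p[p > 0]                              # 0 * log 0 := 0
    return float(-(p * np.log(p)).sum())
```

Tracking this scalar over training steps and watching for it to cross a fixed threshold is one way the kind of early-warning signal described above could be monitored in practice.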

Tags
research