Energy-Based Transformers: The 1982 Architecture Finally Got Compatible Training Tricks
An EBM finally crossed 800M parameters without collapsing. Nobody has independently reproduced the 35% scaling claim. Both halves matter.
machine-learning · ai-architecture
May 07, 2026