Tag

generalization

Generalization is a model's ability to keep performing well on unseen data, shifted distributions, or longer reasoning paths. Articles under this tag connect training stability, Hessian-spectrum sharpness, and LLM failures on new maps or longer sequence lengths.

1 article

Research/Apr 22

Generalization at the Edge of Stability: 1 Paper on Why

A new paper links chaotic, high-learning-rate training to generalization via a “sharpness dimension” built from the Hessian spectrum.