Phase Transitions in Large Language Models and the O(N) Model
Published: 2025-04-14
Posted by: Ruixin Li
Speaker: Youran Sun (Tsinghua University)
Time: 2025-04-15, 14:00 to 15:00
Venue: Beijing International Center for Mathematical Research, Quan Zhai, Room 29
Abstract: Large language models (LLMs) exhibit unprecedentedly rich scaling behaviors. In physics, scaling behavior is closely related to phase transitions, critical phenomena, and field theory. To investigate phase transitions in LLMs, we reformulate the Transformer architecture as an O(N) model. Our study reveals two distinct phase transitions, corresponding to the temperature used in text generation and to the model's parameter size, respectively. The first phase transition enables us to estimate the internal dimension of the model, while the second phase transition is of higher-depth and signals the emergence of new capabilities. As an application, the energy of the O(N) model can be used to evaluate whether an LLM's parameters are sufficient to learn the training data.