In this paper, we show that the angle concentration of hidden-state vectors is an intrinsic indicator of how much an LLM can learn from a sample, and that it correlates tightly with gradient strength. Leveraging ...