Tag
attention
Attention is the core mechanism that lets LLMs route information across tokens, shaping long-context recall, state tracking, and compute cost. This topic covers classic Transformers, KV cache tradeoffs, and newer hybrids that blend attention with state-space or memory modules.
2 articles
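
To make the topic description concrete, here is a minimal sketch of single-head scaled dot-product attention with a KV-cache-style decode loop. The NumPy implementation, shapes, and variable names are illustrative assumptions for this page, not code from either article below.

```python
# Minimal single-head scaled dot-product attention sketch (assumed setup, NumPy only).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=True):
    """softmax(Q K^T / sqrt(d)) V — routes information across tokens."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)               # (T_q, T_k) token-to-token affinities
    if causal:
        T_q, T_k = scores.shape
        mask = np.triu(np.ones((T_q, T_k), dtype=bool), k=T_k - T_q + 1)
        scores = np.where(mask, -np.inf, scores)  # block attention to future tokens
    return softmax(scores) @ V                   # weighted mix of value vectors

# KV-cache-style decoding: keep past keys/values so each new token costs O(T)
# attention rather than recomputing the full (T, T) score matrix — the memory
# grows linearly with context length, which is the cache tradeoff mentioned above.
rng = np.random.default_rng(0)
d = 16
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))
for t in range(4):
    q = rng.normal(size=(1, d))                  # query for the newly generated token
    k = rng.normal(size=(1, d))
    v = rng.normal(size=(1, d))
    K_cache = np.vstack([K_cache, k])            # cache grows with sequence length
    V_cache = np.vstack([V_cache, v])
    out = attention(q, K_cache, V_cache)         # (1, d) output for the current step
    print(t, out.shape)
```
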

Research/May 4
Persistent Visual Memory fixes LVLM visual drift
PVM is a lightweight LVLM module that keeps visual information available during long generations, reducing visual signal decay.

Research/Apr 21
Sessa: Attention and State-Space Memory for Long Context
Sessa combines attention with recurrent state-space feedback to improve long-context recall, showing power-law memory tails and strong benchmark results.