Tag

attention

Attention is the core mechanism that lets LLMs route information across tokens, shaping long-context recall, state tracking, and compute cost. This topic covers classic Transformers, KV cache tradeoffs, and newer hybrids that blend attention with state-space or memory modules.
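The "routing" described above is just scaled dot-product attention: each query token scores every key, the scores are softmaxed into weights, and those weights mix the value vectors. A minimal pure-Python sketch (toy dimensions, illustrative names only):

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors.

    Each query mixes the value vectors, weighted by its similarity
    to the keys -- this is how tokens 'route' information.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)  # weights over tokens, sum to 1
        # Convex combination of the value vectors.
        mixed = [sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))]
        out.append(mixed)
    return out

# Toy example: 3 tokens, 2-dim queries/keys/values.
Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = attention(Q, K, V)
print(len(out), len(out[0]))  # 3 2
```

Because every query attends to every key, cost grows quadratically with sequence length; caching K and V for past tokens (the KV cache) trades memory for recomputation, which is the tradeoff the articles under this tag explore.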

2 articles