Tag
multimodal reasoning
2 articles

Research/May 4
Persistent Visual Memory fixes LVLM visual drift
Persistent Visual Memory (PVM) is a lightweight module for large vision-language models (LVLMs) that keeps visual information available during long generations, reducing visual signal decay.

Research/Apr 2
HippoCamp tests agents on your personal files
HippoCamp benchmarks multimodal agents on dense personal file systems, exposing weak retrieval, grounding, and cross-modal reasoning.