Tag
1 article
HippoCamp benchmarks multimodal agents on dense personal file systems, exposing weak retrieval, grounding, and cross-modal reasoning.