Tag
expert routing
2 articles

Research/May 8
UniPool shares MoE experts across layers
UniPool replaces per-layer MoE experts with one shared pool, cutting redundancy and lowering validation loss in five LLaMA-scale models.

Research/Apr 10
Why multimodal MoE models get distracted
A study of multimodal MoE models finds visual inputs can derail routing to reasoning experts, and a routing-guided fix improves results.