Tag

MoE

MoE, or Mixture of Experts, is an architecture that activates only a subset of expert subnetworks per token or task, decoupling total parameter count from per-token inference cost while preserving quality. It shows up in open coding models, long-context agents, and other systems built for efficient scaling.
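The core idea can be sketched in a few lines: a small router scores every expert for a given token, and only the top-k experts actually run. The sizes, random weights, and names below are illustrative, not any particular model's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration.
d_model, n_experts, top_k = 8, 4, 2

# Router: a linear layer producing one logit per expert.
W_router = rng.normal(size=(d_model, n_experts))
# Each "expert" here is just an independent linear layer.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """Route a single token vector to its top-k experts."""
    logits = x @ W_router                      # (n_experts,)
    top = np.argsort(logits)[-top_k:]          # indices of the k highest-scoring experts
    # Softmax over the selected logits only, so the k gate weights sum to 1.
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()
    # Only the chosen experts run; the remaining n_experts - top_k stay inactive.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

token = rng.normal(size=d_model)
out = moe_forward(token)
print(out.shape)  # (8,)
```

Because only `top_k` of the `n_experts` weight matrices touch each token, the model can grow its total parameter count by adding experts without a matching increase in per-token compute.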

1 article