Tag
1 article
Gemma 4 E2B and E4B assistant models use centroid masking to cut lm_head work by about 45x with little quality loss.