Tag
1 article
TurboQuant is the right direction for llama.cpp because asymmetric KV cache compression cuts memory use without breaking compatibility.