Tag
QJL
2 articles

Research/May 6
Why TurboQuant changes the KV cache debate
TurboQuant makes KV cache compression a theoretical win, not just an engineering trick.

Research/Apr 3
Google's TurboQuant Cuts LLM Memory Costs
Google says TurboQuant combines QJL and PolarQuant to cut the memory overhead of vector quantization and speed up LLM inference by up to 8x.