This article compares raw GGUF Q4_K kernels with prepacked weight caches for decode inference on the V100.