Tag
GPU inference
2 articles

Industry News/Apr 3
NVIDIA B300 vs. H200: Specs and DeepSeek Performance
The B300 packs 288 GB of HBM3e and up to 8 TB/s of bandwidth. Here’s how it compares with the H200 for DeepSeek inference and cloud costs.

Tools & Apps/Apr 3
TurboQuant, Fast Cold Starts, and Rust on GPUs
TurboQuant cuts KV cache usage by 4.6×, GPU state restoration slashes cold starts, and Rust is moving deeper into CUDA work.