LMSYS Org on X: "🚨Big News! We collaborated with @nvidia to release a DeepSeek R1 inference container optimized for large scale deployment on GB200 NVL72, the world’s most advanced data center–scale accelerated computing platform. This docker container runs a single copy of the model across 56 https://t.co/KnsoocDfox" / X

🚨Big News! We collaborated with @nvidia to release a DeepSeek R1 inference container optimized for large scale deployment on GB200 NVL72, the world’s most advanced data center–scale accelerated computing platform. This docker container runs a single copy of the model across 56 Blackwell GPUs, achieving over 13,149 tokens/sec for prefill and 9,290 tokens/sec for decode. These results represent a 2x and 3x per GPU increase, respectively, compared to running the model on an H100 cluster nearly twice as large. Read our blog to learn more and download the container to reproduce the results!👇

1:03 AM · Jul 30, 202528.7KViews

Post

Post