LMSYS Org
1,164 posts
LMSYS Org
@lmsysorg
Large Model Systems Organization. Join our Slack: slack.sglang.io. We developed SGLang (sglang.io), Chatbot Arena (now @arena), and Vicuna!
US
lmsys.org
Joined August 2024
199 Following
15.8K Followers

  • LMSYS Org
    @lmsysorg
    Sep 22, 2025
    SGLang now supports deterministic LLM inference! Building on @thinkymachines batch-invariant kernels, we integrated deterministic attention & sampling ops into a high-throughput engine - fully compatible with chunked prefill, CUDA graphs, radix cache, and non-greedy sampling. ✅
    112K
  • LMSYS Org
    @lmsysorg
    May 5, 2025
    🚀 Breaking: SGLang provides the first open-source implementation to serve @deepseek_ai V3/R1 models with large-scale expert parallelism and prefill-decode disaggregation on 96 GPUs. It nearly matches the throughput reported by the official DeepSeek blog, achieving 52.3K input
    162K
  • LMSYS Org
    @lmsysorg
    Oct 14, 2025
    🚀 SGLang In-Depth Review of the NVIDIA DGX Spark is LIVE! Thanks to @nvidia’s early access program, SGLang makes its first ever appearance in a consumer product, the brand-new DGX Spark. The DGX Spark’s 128GB Unified Memory and Blackwell architecture set a new standard for
    411K
  • LMSYS Org
    @lmsysorg
    Sep 29, 2025
    🎉 Congrats to the DeepSeek team on the amazing release of Sparse Attention (DSA) in V3.2! This fine-grained design sets a new bar for long-context efficiency 🚀 We’re proud that SGLang is an official inference framework for DeepSeek-V3.2 — with optimized sparse attention
    DeepSeek
    @deepseek_ai
    Sep 29, 2025
    🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n
    56K
  • LMSYS Org
    @lmsysorg
    Nov 7, 2025
    🚀 Introducing SGLang Diffusion — bringing SGLang’s high-performance serving to diffusion models. ⚡️ Up to 5.9× faster inference 🧩 Supports major open-source models: Wan, Hunyuan, Qwen-Image, Qwen-Image-Edit, Flux 🧰 Easy to use via OpenAI-compatible API, CLI & Python API
    109K
  • LMSYS Org
    @lmsysorg
    May 31, 2025
    Hello everyone, the SGLang community, in collaboration with the Search R1 team, has quickly reproduced Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning based on the previously open-sourced multi-turn RL. We welcome you to get hands-on
    18K
  • LMSYS Org
    @lmsysorg
    Jul 30, 2025
    🚨Big News! We collaborated with @nvidia to release a DeepSeek R1 inference container optimized for large scale deployment on GB200 NVL72, the world’s most advanced data center–scale accelerated computing platform. This docker container runs a single copy of the model across 56
    29K
  • LMSYS Org
    @lmsysorg
    Sep 25, 2025
    🚀 Follow-up to our last breakthrough on DeepSeek V3/R1 inference! On NVIDIA GB200 NVL72, SGLang now achieves 26k input tokens/s and 13k output tokens/s per GPU with FP8 attention + NVFP4 MoE - that’s a 3.8× / 4.8× speedup vs H100 settings. See the details in the 🧵 (1/4)
    69K
  • LMSYS Org
    @lmsysorg
    Dec 26, 2024
    The best open-source LLM, DeepSeek V3, has just been released! SGLang v0.4.1 is the officially recommended inference solution for it. The SGLang and DeepSeek teams worked together to support DeepSeek V3 FP8 on NVIDIA and AMD GPUs from day one. SGLang has supported MLA and DP
    DeepSeek
    @deepseek_ai
    Dec 26, 2024
    🚀 Introducing DeepSeek-V3! Biggest leap forward yet: ⚡ 60 tokens/second (3x faster than V2!) 💪 Enhanced capabilities 🛠 API compatibility intact 🌍 Fully open-source models & papers 🐋 1/n
    33K
  • LMSYS Org
    @lmsysorg
    May 14, 2025
    SGLang, verl, OpenBMB and Tsinghua University: Pioneering End-to-End Multi-Turn RLHF. We are thrilled to announce the release of the first fully functional, convergence-verified, end-to-end open-source multi-turn Reinforcement Learning from Human Feedback (RLHF) framework,
    18K
  • LMSYS Org
    @lmsysorg
    Aug 11, 2025
    Honored to see SGLang adopted in RL training for GLM-4.5 at @Zai_org — large-scale validation from a frontier AI lab pushing the boundaries of LLMs!
    86K
  • LMSYS Org
    @lmsysorg
    Jun 12, 2025
    Huge thanks to @AMD for donating an MI350 to SGLang! This advanced AI accelerator is making a meaningful difference, enabling us to move faster in developing scalable LLM systems and pushing the limits of inference optimization. Special thanks to our awesome infra partner
    44K
  • LMSYS Org
    @lmsysorg
    Apr 28, 2025
    Qwen 3 @Alibaba_Qwen has been released! SGLang is proud to be a close partner supporting it from day 0!
    28K
  • LMSYS Org
    @lmsysorg
    Sep 11, 2025
    We are excited to announce SGLang HiCache, our community solution for hierarchical KV caching to power high-performance LLM serving. ⚡ Performance: up to 6× throughput and 80% TTFT reduction demonstrated in benchmarks and real-world deployments. 🗂️ Flexibility: seamless
    60K
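
A rough arithmetic check on the GB200 NVL72 post in the feed above, which quotes 26k input and 13k output tokens/s per GPU and describes this as a 3.8× / 4.8× speedup vs H100 settings. The sketch below only divides the quoted throughput by the quoted speedup to get the H100 baseline those numbers imply. It assumes the two speedup factors map to input and output in that order, which the post does not state explicitly. The function and variable names are illustrative and do not come from SGLang.

```python
# Figures quoted in the post (per GPU, on GB200 NVL72).
gb200_input_tps = 26_000
gb200_output_tps = 13_000

# Assumption: "3.8x / 4.8x" refers to input and output throughput, in that order.
input_speedup = 3.8
output_speedup = 4.8


def implied_baseline(throughput: float, speedup: float) -> float:
    """Return the baseline throughput implied by a quoted speedup factor."""
    return throughput / speedup


h100_input_tps = implied_baseline(gb200_input_tps, input_speedup)
h100_output_tps = implied_baseline(gb200_output_tps, output_speedup)

print(f"implied H100 input:  {h100_input_tps:,.0f} tokens/s per GPU")   # 6,842
print(f"implied H100 output: {h100_output_tps:,.0f} tokens/s per GPU")  # 2,708
```

These baselines are derived from rounded figures in a single post, so treat them as order-of-magnitude estimates and not as measured results.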