sgl-project / sglang Public

Fork 6.9k
Star 29.9k

Pull requests: sgl-project/sglang

3,194 Open 20,016 Closed

Pull requests list

Fix kv_b_proj channel scale broadcast when reshape hasn't run yet

#30029 opened Jul 3, 2026 by kewang-amd

Warn when Dumper may capture CUDA graph outputs

Labels: documentation

#30028 opened Jul 3, 2026 by feichai0017

Avoid blocking EAGLE grammar mask uploads

#30027 opened Jul 3, 2026 by RunFMe • Draft

Add deterministic inference for eagle parity test

#30026 opened Jul 3, 2026 by ANSHUMAN87 Contributor

perf: reorder DSA indexer dual-stream ops to avoid CUDA graph stream explosion

#30025 opened Jul 3, 2026 by kpham-sgl Collaborator

3 tasks

perf(sgl-kernel): default block_quota=16 for MLA page_first KV gather…

Labels: sgl-kernel

#30024 opened Jul 3, 2026 by TianDi101

5 tasks

[tracing] sglang tracing v2: support exporting tracing data asynchronously

Labels: documentation

#30023 opened Jul 3, 2026 by sufeng-buaa Collaborator

4 of 5 tasks

fix: serialize FanOutCommunicator queueing calls with a lock

#30022 opened Jul 3, 2026 by lyang24

5 tasks

[CI] Add GLM52 NVFP4 MTP B200 tests

Labels: blackwell (SM100/SM120)

#30021 opened Jul 3, 2026 by Fridge003 Collaborator • Draft

[codex] Support CUDA 12.2 source builds

Labels: blackwell (SM100/SM120), jit-kernel, npu, quant (LLM Quantization), sgl-kernel

#30020 opened Jul 3, 2026 by BBuf Collaborator • Draft

[MPS] Fix diffusion output stability

Labels: diffusion (SGLang Diffusion)

#30017 opened Jul 3, 2026 by mickqian Collaborator • Draft

[diffusion] feat: performance_mode=speed enables torch.compile by default

Labels: diffusion (SGLang Diffusion), run-ci

#30016 opened Jul 3, 2026 by mickqian Collaborator

For hybrid sliding-window (SWA) models the SWA KV pool is small and quickly

#30013 opened Jul 3, 2026 by TensorGlue-IEIT

[DSv4] Use BF16 instead of FP32 for indexer score computation

#30012 opened Jul 3, 2026 by TTThanos Contributor

5 tasks

[AMD] WIP - Set REQUEST_TIMEOUT=30 for AMD to deflake multimodal tests

Labels: amd, bypass-fastfail, run-ci

#30008 opened Jul 3, 2026 by yctseng0211 Collaborator

5 tasks

[CI] increase XPU container shm-size from default 64MB to 8GB

Labels: run-ci, run-ci-extra

#30007 opened Jul 3, 2026 by vshekhawat-hlab Contributor

5 tasks

Fix prefill CUDA graph disabled for deeply-nested multimodal models

#30006 opened Jul 3, 2026 by rahulvijayaraghavan Contributor

refactor: make time_stats msgpack-native

#30005 opened Jul 3, 2026 by oleksii-tumanov Contributor

5 tasks done

[diffusion] feat: per-layer TP shard planner for DiT linears (--dit-tp-plan)

Labels: diffusion (SGLang Diffusion)

#30004 opened Jul 3, 2026 by mickqian Collaborator

Experiment: AMD DSV4 CPU affinity and NUMA diagnostics

Labels: amd

#30003 opened Jul 3, 2026 by bingxche Collaborator • Draft

[MoE] Retire the AOT moe_fused_gate / kimi_k2_moe_fused_gate gate kernels (#26771)

Labels: jit-kernel, mthreads, run-ci, sgl-kernel

#29997 opened Jul 3, 2026 by BBuf Collaborator

3 tasks done

Fix device mismatch when mixing JPEG (GPU-decoded) and other type (CP…

#29996 opened Jul 3, 2026 by yuanshaochen

1 of 5 tasks

fix(mimo-vl): pass padded_context_dim to Qwen2_5_VisionPatchMerger

#29994 opened Jul 3, 2026 by alisonshao Collaborator

2 of 3 tasks

FlashInfer Backend for MXFP8 Grouped Quantization

Labels: documentation, quant (LLM Quantization), sgl-kernel

#29992 opened Jul 3, 2026 by philipphack

5 tasks done

[docs] Multi-node deployment: add PD disaggregation and Apptainer examples for SLURM

Labels: documentation

#29991 opened Jul 3, 2026 by davislx

3 of 5 tasks
