This repository was archived by the owner on May 13, 2026. It is now read-only.
vectorch-ai/ScaleLLM (Public archive)
Forks: 41 · Stars: 499 · Issues: 48 · Pull requests: 8
ScaleLLM Roadmap · #84 · guocuimi opened on Mar 16, 2024 · 3 comments
ScaleAttention: a custom CUDA kernel, optimized for inference. · #356 · guocuimi opened on Jan 1, 2025
Issues (search: is:issue state:open)
kv cache: integrate nixl for kv cache transfer · #457 · Open · guocuimi opened on May 1, 2025
kv cache: cache aware router · #456 · Open · guocuimi opened on May 1, 2025
kv cache: fp8/int8 kv cache support · #455 · Open · guocuimi opened on May 1, 2025
model: add qwen new models · #454 · Open · guocuimi opened on May 1, 2025
model: add deepseek new models · #453 · Open · guocuimi opened on May 1, 2025
kernel: tile scheduling · #452 · Open · guocuimi opened on May 1, 2025
kernel: attention kernel for sm_75/70 · #451 · Open · guocuimi opened on May 1, 2025
feat: disaggregate Prefill and Decoding · #450 · Open · guocuimi opened on May 1, 2025
kernel: deepep integration · #449 · Open · guocuimi opened on May 1, 2025
kernel: grouped gemm for MOE quantization · #448 · Open · guocuimi opened on May 1, 2025
kernel: attention kernel for sm_120 · #447 · Open · guocuimi opened on May 1, 2025
kernel: attention kernel for sm_100/101 · #446 · Open · guocuimi opened on May 1, 2025