Anyscale
@anyscalecompute
We’ve recently contributed FP8 support to the @vllm_project in collaboration with @neuralmagic. With this feature, you can see up to a 1.8x reduction in inter-token latency, with >99% accuracy preservation! 1/n
3:25 PM · Jul 10, 2024 · 34.3K Views
