Robert Nishihara
1,782 posts
Robert Nishihara
@robertnishihara
Co-founder @anyscalecompute. Co-creator of @raydistributed. Previously PhD ML at Berkeley.
robertnishihara.com
Joined March 2009
851 Following
16.9K Followers

  • Pinned
    Robert Nishihara
    @robertnishihara
    Jun 15, 2025
    Beyond pre-training, here's how I imagine most learning will work. 1. AI models / systems will maintain large collections of retrievable knowledge. This will include facts like "the capital of California is Sacramento" and tactics like "when playing Monopoly, buy a bunch of
    163K
  • Robert Nishihara
    @robertnishihara
    Aug 11, 2025
    I don't think it's very widely known how big of a role @istoica05 and the research community at UC Berkeley have played in shaping technology through research and open source. For just a few recent examples of foundational open source technologies, Ion and his students created
    Forbes
    @Forbes
    Aug 10, 2025
    This computer science professor became a billionaire launching four startups out of his privately-funded research lab, including unicorns Databricks and Anyscale. But it’s never been just about business. (Photo: Timothy Archibald for Forbes) trib.al/xyBDRVN
    114K
  • Robert Nishihara
    @robertnishihara
    Jan 26, 2025
    Just sat down to read the DeepSeek-R1 paper. We're entering an era where compute isn't primarily for training. It's for creating better data. I expect to see the money & compute spent on data processing (generation / annotation / curation) grow to match and exceed the money &
    174K
  • Robert Nishihara
    @robertnishihara
    Oct 17, 2025
    The thing that is exciting *to me* about @AnthropicAI's "agent skills" announcement is that it provides a step toward continual learning. - Rather than continuously updating model weights, agents interacting with the world can continuously add new skills. - Compute spent on
    Robert Nishihara
    @robertnishihara
    Jun 15, 2025
    Beyond pre-training, here's how I imagine most learning will work. 1. AI models / systems will maintain large collections of retrievable knowledge. This will include facts like "the capital of California is Sacramento" and tactics like "when playing Monopoly, buy a bunch of
    219K
  • Robert Nishihara
    @robertnishihara
    Jan 25, 2023
    I remember in 2016 when @ApacheSpark set the record for sorting 100TB in the most cost-efficient way ($144 in 2016, $115 in today's prices). Today, @raydistributed broke the $1 / TB barrier and set the world record at $97! 🔥📈🥳🎂🎗️🥂
    Ray Is the World’s Most Cost-Efficient Sorting System at $1/TB
    From anyscale.com
    76K
  • Robert Nishihara
    @robertnishihara
    Oct 23, 2025
    I'm hiring for a new engineering role working directly with me to support our most sophisticated customers. Looking for someone who wants to work across the AI / AI infra stack, write / debug a ton of code, work directly with customers, move / learn super fast. DM me.
    39K
  • Robert Nishihara
    @robertnishihara
    Dec 12, 2023
    Function calls have been a massive gap in the open source ecosystem (and the most common feature request). We benchmarked function calling on a variety of open and proprietary models. Impressively, Mistral-7B performs on par with GPT-3.5. Here's how they stack up 🤯🤯 ⚫️
    Anyscale
    @anyscalecompute
    Dec 12, 2023
    We're announcing new features and models today. 🔵 JSON mode ⚫️ function calling Try them out with our API. anyscale.com/blog/anyscale-…
    216K
  • Robert Nishihara
    @robertnishihara
    Oct 15, 2025
    If you're curious why LLM inference is different from regular model inference (and why we've seen so much investment in specialized LLM inference engines like vLLM, SGLang, and TensorRT-LLM), I had a lot of fun talking through some of the main ideas here.
    Linda Vivah (Haviv)
    @lindavivah
    Oct 15, 2025
    Walk with @robertnishihara & I in NYC with 10% charge 🪫 as we talk through 5 key differences between 𝗟𝗟𝗠 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗩𝗦 𝗥𝗲𝗴𝘂𝗹𝗮𝗿 𝗶𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 Let’s see how much we can get through before our mic dies! 🤣
    56K
  • Robert Nishihara
    @robertnishihara
    Aug 11, 2023
    This in-depth case study sheds light on when you can achieve GPT-4 level performance with a fine-tuned 7B parameter model. Take SQL generation as an example. Accuracy 🧿 Llama-2-7B: 3% 🧿 GPT-4: 79% 🧿 Llama-2-7B (fine-tuned): 86% Out of the box, GPT-4 crushes Llama-2
    kourosh hakhamaneshi
    @CyrusHakha
    Aug 11, 2023
    🚀 Exploring Llama-2’s Quality: Can we replace generalist GPT-4 endpoints with specialized OSS models? Dive deep with our technical blogpost to understand the nuances and insights of fine-tuning OSS models. 🔗anyscale.com/blog/fine-tuni… 🧵 Thread 1/N👇
    178K
  • Robert Nishihara
    @robertnishihara
    Dec 18, 2022
    While everyone's talking about training giant models, companies like @Instacart are quietly achieving 10x performance improvements by training and deploying many smaller models. Here's how they're doing it. 🔥🔥
    Training 1 Million ML Models in Record Time | Anyscale
    From anyscale.com
    65K
  • Robert Nishihara
    @robertnishihara
    May 17, 2023
    We've built a ton of #LLM applications recently. Reasoning about performance & feasibility is painful without reference points. Here are the reference points we use to anchor our intuition (inspired by @JeffDean's "Numbers every engineer should know"). github.com/ray-project/ll…
    81K
  • Robert Nishihara
    @robertnishihara
    Oct 12, 2023
    An important systems bottleneck when working with LLMs is model loading times, but if you get the details right, you can speed up standard implementations by around 20x (over 10 minutes down to around 35 seconds for Llama-2-70B). There are a few bottlenecks to numbers to think
    Cade Daniel 🇺🇸
    Engram
    @cdnamz
    Oct 11, 2023
    How long does it take to download Llama2 70B? On the 4x 25 Gbps NICs that aws.p4de's have, it should take ~10s. Yet in production we've observed much higher times, which makes autoscaling less responsive + more expensive. This blog post shows how we've reduced download & init
    172K
  • Robert Nishihara
    @robertnishihara
    Dec 15, 2023
    Faster Mixtral? Much more to come here. We make deep investments in open source AI. If you'd like to help build open source AI or optimize LLM performance, join us at @anyscalecompute. DM me 🚢
    Woosuk Kwon
    @woosuk_k
    Dec 14, 2023
    We've just released v0.2.5 which includes this performance improvement (contributed by Antoni at @anyscalecompute). Please try it out!
    70K
  • Robert Nishihara
    @robertnishihara
    Jan 20, 2023
    One of our goals with @raydistributed has been to provide a great off-the-shelf experience for beginners as well as the performance and flexibility required by power users. @OpenAI is on the "power users" end of the spectrum.
    How Ray, a Distributed AI Framework, Helps Power ChatGPT
    From thenewstack.io
    51K