Robert Nishihara (@robertnishihara) / X

1,782 posts

Robert Nishihara

@robertnishihara

Co-founder @anyscalecompute. Co-creator of @raydistributed. Previously PhD ML at Berkeley.

Joined March 2009

Pinned
Robert Nishihara
@robertnishihara
Jun 15, 2025
Beyond pre-training, here's how I imagine most learning will work. 1. AI models / systems will maintain large collections of retrievable knowledge. This will include facts like "the capital of California is Sacramento" and tactics like "when playing Monopoly, buy a bunch of
163K
Robert Nishihara
@robertnishihara
Aug 11, 2025
I don't think it's very widely known how big of a role @istoica05 and the research community at UC Berkeley have played in shaping technology through research and open source. For just a few recent examples of foundational open source technologies, Ion and his students created
Forbes
@Forbes
Aug 10, 2025
This computer science professor became a billionaire launching four startups out of his privately-funded research lab, including unicorns Databricks and Anyscale. But it’s never been just about business. (Photo: Timothy Archibald for Forbes) trib.al/xyBDRVN
114K
Robert Nishihara
@robertnishihara
Jan 26, 2025
Just sat down to read the DeepSeek-R1 paper. We're entering an era where compute isn't primarily for training. It's for creating better data. I expect to see the money & compute spent on data processing (generation / annotation / curation) grow to match and exceed the money &
174K
Robert Nishihara
@robertnishihara
Oct 17, 2025
The thing that is exciting *to me* about @AnthropicAI's "agent skills" announcement is that it provides a step toward continual learning. - Rather than continuously updating model weights, agents interacting with the world can continuously add new skills. - Compute spent on
Robert Nishihara
@robertnishihara
Jun 15, 2025
Beyond pre-training, here's how I imagine most learning will work. 1. AI models / systems will maintain large collections of retrievable knowledge. This will include facts like "the capital of California is Sacramento" and tactics like "when playing Monopoly, buy a bunch of
219K
Robert Nishihara
@robertnishihara
Jan 25, 2023
I remember in 2016 when @ApacheSpark set the record for sorting 100TB in the most cost-efficient way ($144 in 2016, $115 in today's prices). Today, @raydistributed broke the $1 / TB barrier and set the world record at $97! 🔥📈🥳🎂🎗️🥂
Ray Is the World’s Most Cost-Efficient Sorting System at $1/TB
From anyscale.com
76K
Robert Nishihara
@robertnishihara
Oct 23, 2025
I'm hiring for a new engineering role working directly with me to support our most sophisticated customers. Looking for someone who wants to work across the AI / AI infra stack, write / debug a ton of code, work directly with customers, move / learn super fast. DM me.
39K
Robert Nishihara
@robertnishihara
Dec 12, 2023
Function calls have been a massive gap in the open source ecosystem (and the most common feature request). We benchmarked function calling on a variety of open and proprietary models. Impressively, Mistral-7B performs on par with GPT-3.5. Here's how they stack up 🤯🤯 ⚫️
Anyscale
@anyscalecompute
Dec 12, 2023
We're announcing new features and models today. 🔵 JSON mode ⚫️ function calling Try them out with our API. anyscale.com/blog/anyscale-…
216K
Robert Nishihara
@robertnishihara
Oct 15, 2025
If you're curious why LLM inference is different from regular model inference (and why we've seen so much investment in specialized LLM inference engines like vLLM, SGLang, and TensorRT-LLM), I had a lot of fun talking through some of the main ideas here.
Linda Vivah (Haviv)
@lindavivah
Oct 15, 2025
Walk with @robertnishihara & me in NYC with 10% charge 🪫 as we talk through 5 key differences between LLM inference vs. regular inference. Let’s see how much we can get through before our mic dies! 🤣
56K
Robert Nishihara
@robertnishihara
Aug 11, 2023
This in-depth case study sheds light on when you can achieve GPT-4 level performance with a fine-tuned 7B parameter model. Take SQL generation as an example. Accuracy 🧿 Llama-2-7B: 3% 🧿 GPT-4: 79% 🧿 Llama-2-7B (fine-tuned): 86% Out of the box, GPT-4 crushes Llama-2
kourosh hakhamaneshi
@CyrusHakha
Aug 11, 2023
🚀 Exploring Llama-2’s Quality: Can we replace generalist GPT-4 endpoints with specialized OSS models? Dive deep with our technical blogpost to understand the nuances and insights of fine-tuning OSS models. 🔗anyscale.com/blog/fine-tuni… 🧵 Thread 1/N👇
178K
Robert Nishihara
@robertnishihara
Dec 18, 2022
While everyone's talking about training giant models, companies like @Instacart are quietly achieving 10x performance improvements by training and deploying many smaller models. Here's how they're doing it. 🔥🔥
Training 1 Million ML Models in Record Time | Anyscale
From anyscale.com
65K
Robert Nishihara
@robertnishihara
May 17, 2023
We've built a ton of #LLM applications recently. Reasoning about performance & feasibility is painful without reference points. Here are the reference points we use to anchor our intuition (inspired by @JeffDean's "Numbers every engineer should know"). github.com/ray-project/ll…
81K
Robert Nishihara
@robertnishihara
Oct 12, 2023
An important systems bottleneck when working with LLMs is model loading times, but if you get the details right, you can speed up standard implementations by around 20x (over 10 minutes down to around 35 seconds for Llama-2-70B). There are a few bottlenecks to numbers to think
Cade Daniel 🇺🇸
@cdnamz
Oct 11, 2023
How long does it take to download Llama2 70B? On the 4x 25 Gbps NICs that aws.p4de's have, it should take ~10s. Yet in production we've observed much higher times, which makes autoscaling less responsive + more expensive. This blog post shows how we've reduced download & init
172K
Robert Nishihara
@robertnishihara
Dec 15, 2023
Faster Mixtral? Much more to come here. We make deep investments in open source AI. If you'd like to help build open source AI or optimize LLM performance, join us at @anyscalecompute. DM me 🚢
Woosuk Kwon
@woosuk_k
Dec 14, 2023
We've just released v0.2.5 which includes this performance improvement (contributed by Antoni at @anyscalecompute). Please try it out!
70K
Robert Nishihara
@robertnishihara
Jan 20, 2023
One of our goals with @raydistributed has been to provide a great off-the-shelf experience for beginners as well as the performance and flexibility required by power users. @OpenAI is on the "power users" end of the spectrum.
How Ray, a Distributed AI Framework, Helps Power ChatGPT
From thenewstack.io
51K