Thoughts on Toby Ord's AI Scaling Series

By Srdjan Miletic @ 2026-02-04T00:46 (+51)

This is a linkpost to https://www.dissent.blog/notes-on-toby-ords-ai-scaling-series/

I've been reading through Toby Ord's recent sequence on AI scaling. General notes come first, then my own thoughts.

Notes

Takeaways

I got two main things from this series: a better model of the three phases modern LLM scaling has gone through (and of how LLM training works generally), and an argument for longer timelines.

The model of scaling is basically three successive phases: pre-training scaling, then RL post-training, then inference-time compute scaling.

This is also a case for, well, AI risk not being that much lower. I think there are actually two key takeaways. One is that the rate of progress we've seen recently on major benchmarks doesn't really reflect underlying progress on a metric we actually care about, like "answer quality per $". The other is that we've hit, or are very close to hitting, a wall: the "scaling laws" everyone treats as a guarantee of future progress are actually pretty much a guarantee of a drastic slowdown and stagnation if they hold.
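The logarithmic-returns point can be made concrete with a toy model (my own illustration, not Ord's exact formulation): if capability grows roughly with the log of training compute, each 10x of compute buys the same fixed gain, so sustaining constant progress requires exponentially growing spend.

```python
import math

# Toy model (my assumption, not from the series):
# capability = k * log10(compute)
k = 10.0  # arbitrary scale factor

for compute in [1e22, 1e23, 1e24, 1e25]:
    capability = k * math.log10(compute)
    print(f"compute = {compute:.0e} FLOP -> capability score {capability:.0f}")

# Each 10x jump in compute adds the same fixed increment (here k = 10 points),
# so once further 10x jumps stop being affordable, progress stalls.
```

The point of the toy model is just that, under a logarithmic law, flat spending means flat capability, which is the "slowdown if the laws hold" takeaway.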

I buy the first argument. Current benchmark performance is probably somewhat inflated and doesn't represent "general intelligence" as well as we might assume, because of a mix of RL and inference-time scaling (with the RL possibly chosen specifically to juice benchmark performance).

I'm not sure how I feel about the second argument. On the one hand, the core claims seem to be true: AI scaling laws do seem to be logarithmic, and we have burned through most of the economically feasible orders of magnitude of training compute. On the other hand, someone could have made the same argument in 2023, when pre-training was losing steam.

If I've learned one thing from my favourite progress studies sources, it's that every large trend line is composed of multiple smaller overlapping S-curves. I'm worried that just looking at current approaches hitting economic scaling ceilings loses sight of the forest for the trees. Yes, the default result if we do the exact same thing is that we hit the scaling wall. But we've come up with a new thing twice now, and we may well continue to do so. Maybe it's distillation/synthetic data. Maybe it's something else.

Another thing to bear in mind is that, even assuming no new scaling approaches arise, we're still getting a roughly 3x per year effective compute increase from algorithmic progress and a 1.4x increase from hardware improvements, for a total increase of roughly an order of magnitude every 1.6 years. Even with logarithmic scaling, and even assuming AI investment as a % of GDP stabilizes, we should see continued immense growth in capabilities over the next few years.
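As a quick sanity check of the compute arithmetic above, using the rough 3x (algorithms) and 1.4x (hardware) annual multipliers from the text:

```python
import math

# Assumed annual multipliers from the post (rough figures, not precise estimates):
ALGO_GAIN = 3.0      # effective-compute gain per year from algorithmic progress
HARDWARE_GAIN = 1.4  # effective-compute gain per year from hardware improvements

combined = ALGO_GAIN * HARDWARE_GAIN                # ~4.2x effective compute per year
years_per_oom = math.log(10) / math.log(combined)   # years per order of magnitude (10x)

print(f"combined annual gain: {combined:.1f}x")
print(f"years per order of magnitude: {years_per_oom:.2f}")
```

This gives a combined gain of 4.2x per year and about 1.6 years per order of magnitude, matching the figure in the paragraph above.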


Toby_Ord @ 2026-02-04T16:38 (+21)

That's a good summary, and pretty in line with my own thoughts on the overall upshots. I'd say that absent new scaling approaches, the strong tailwind to AI progress from compute increases will soon weaken substantially. But it wouldn't completely disappear: there may be new scaling approaches, and there remains progress via AI research. Overall, I'd say it lengthens timelines somewhat, makes raw compute/finances less of an overwhelming advantage, and may require different approaches to compute governance.

Davidmanheim @ 2026-02-11T20:37 (+2)

Strong agree that absent new approaches the tailwind isn't enough, but it seems unclear that pre-training scaling doesn't have farther to go, and current approaches using synthetic data and RL training to enhance one-shot performance seem to have room left for significant improvement.

I also don't know how much room is left before we hit genius-level AGI or beyond, and at that point, even if we hit a wall, more scaling isn't required, as the timeline basically ends.

Vasco Grilo🔸 @ 2026-02-04T20:17 (+2)

Thanks for the great post, Srdjan. I strongly upvoted it.