13 Comments
Andrew VanLoo

“Building complex production codebases, ~2027 (estimate, which will vary by the codebase)”

We are doing this right now with Claude Code.

Cody Hergenroeder

I find that a lot of bold AI performance claims can be clarified by distinguishing between levels of consistency. Claude Code can pretty much always one-shot a small project now. ✅ Piece of cake. But a production-ready codebase? Speaking from experience, also using Claude Code every day, it's hard-fought. Every day I'm building real stuff, extremely rapidly. But I'm also wrestling Claude all day, trying to unhobble it. I'm looking forward to the innate smarts of future models taking this pressure off my hands.

Andrew VanLoo

No, we are definitely using it on multiple multi-thousand-line projects. Following Uncle Bob's Clean Code and Clean Architecture principles, agile practice, and domain-driven design are key. You get it to follow those, and it can do magic.

I am using Claude Code and Codex to build a massively multiplayer online real-time strategy game right now too, at roughly the same scale as my "normal" job. I love it because it can work on the frontend, backend, database, CI/CD pipeline, and cloud infrastructure all in parallel, as long as you keep the immediate scope of each change small enough that it does not cover too much of each of these repositories.

The wisdom of Uncle Bob will rule the fate of many future production projects...

I wrote this to compile some quick thoughts on this:

https://mmm2099.substack.com/p/using-agentic-ai-at-scale

Kevin Thuot

Which application do you use for GPT-5-Codex? I tried it in Cursor with the OpenAI Codex extension. It worked, but felt clunky. I'm wondering if you have a smoother approach. Thanks!

Nathan Lambert

Directly in the CLI.

Andrew VanLoo

That’s been my fear in using Codex CLI too. I’ve been tempted to try just rigging Claude Code with GPT-5 using the router.

Renaud Gaudron

Great article, but I don’t agree with your opening statement that very few people benefit from better theoretical maths. I believe that virtually everyone benefits from it. It's just very indirect, so most people don’t realise it. My bold take is that the fact frontier models are so good at maths is actually one of the biggest and most impactful developments in AI.

Suhrab Khan

Coding remains the clearest lens into AI progress; every improvement in agents like GPT‑5‑Codex or Claude Code translates directly into real-world impact. The shift from chat to coding agents shows that product design and workflow integration are just as important as raw model capabilities.

Gregory Forché

Great piece - thoughtful.

Rob Spence

I need to try Codex again. I still find that Claude Code has an issue: when I go to put all the pieces together, they rarely fit. Each might work individually, but it’s like puzzle pieces that don’t come together. Then I spend a lot of time getting them to work together. Maybe there is something I can do to improve my use of CC, or maybe it’s just the natural shift in effort when using AI coding agents.

Kai Williams

Thanks for the piece! Your link to mathematics at the top seems broken?

Nathan Lambert

Fixed, it's just the IMO results.

Kai Williams

Awesome, thanks!
