My bets on open models, mid-2026
What I expect to come next and why, focused on the open-closed gap.
We’re living through the period when we’ll learn whether open models can keep up with closed labs. The obvious answer is no, in the sense that they won’t keep up in every area. This framing rules out a popular prediction where open models completely catch up, i.e. all models saturate and open and closed models become increasingly similar. Living through it, it’s still very unclear when a longer-term, stable balance of capabilities will solidify.
This is a very complex dynamic. The core signal we monitor is the capability gap between models, but that gap is intertwined with evolving questions around who funds open models, who builds them, how techniques like distillation that enable fast-following translate to new application domains, potential regulation hampering the open-source AI ecosystem, and, of course, who actually uses open models.
The capabilities gap is one signal in a complex sea of forces pushing supply and demand into different shapes. Demand is abundant — tons of individuals, organizations, and sovereigns want, or need, open models — but it is largely separated from supply. Supply is dictated almost entirely by economics, and the question of which business strategies support releasing open models remains unsettled.
With this complexity, I wanted to distill my key beliefs down into a clear list. These are downstream of 10+ pieces I’ve written or recorded on open models this spring (which are linked throughout).
It’s surprising that the top closed models have not shown a growing capability margin over open models, given their advantages in compute for training and research, especially in the second half of 2025 and through today.
Open model labs are technically very strong at keeping pace on well-established benchmarks. This will continue and reflects a balance of abundant talent and sufficient computing power.
Chinese open-weight labs focus slightly more on benchmark scores than comparable closed labs in the U.S. Distillation helps Chinese LLM companies do so, but it’s not a panacea, and changes in the distillation dynamic (e.g. regulation) will not be a determining factor in the balance of capabilities. This benchmark focus is a natural evolution of their incentives: keeping alive the narrative that they are matching the frontier is crucial to fundraising and adoption.
To date, closed models tend to be more robust and generally useful than similarly scoring open models. Closed models have certain hard-to-measure qualities that are not well captured in current or past benchmarks. This will be key to enabling closed models to dominate in markets where an individual user constantly presents new challenges, i.e. supporting knowledge workers as a direct assistant.
The open vs. closed model race, as monitored through benchmarks, will largely be a game of economic staying power and fast-following, until the market structure constricts. I expect Chinese open-weight labs to face funding difficulties first, as soon as later this year. Those funding difficulties will show up as diverging capability trajectories 3-9 months later.
The RL-dominated training era has made distribution into real-world use cases a key factor in continued capability improvements. These are tasks where users directly use tools like Claude Code or Codex to solve problems in their jobs with agents. This is the first clear technical area where closed labs can dominate open-weight models on capabilities, potentially leveraging online RL directly on user feedback.
Open models will be increasingly adopted for repetitive automation tasks across the ecosystem, as measured by relative share of the API market. This takes the form of many new AI-native applications, business back-end automation, etc. This success will drive more investment in domain-specific, efficient open models.
This is a complex picture, where the long-term trajectory is more an economics question than an ability one. Other outlets can paint a far more simplistic narrative that “China will assuredly catch us in AI” and get more distribution because it is a simple story. The reality is complex. Only real AI revenue begets more investment, and eventually that investment will be linked to the ability to keep improving models at a rapid rate. Economic realities have not yet impacted scaling open models as a general category.
This economic-focused angle relates to my positions on the open model ecosystem more broadly.
Recurring calls to ban certain types of open models will continue but are in practice impossible to implement. Training strong AI models (i.e. near, but not at, the frontier) is a relatively small cost compared to large-scale deployment. E.g., if the U.S. bans open models over a certain compute threshold, another sovereign entity will eventually train and release them publicly, and the models will enter the U.S. market with less oversight.
The second derivative of influence on open models has shifted, and the U.S. will slowly regain ground in adoption metrics of open models starting in early 2027 (it takes a long time for China’s velocity to slow, then flip). Examples include Google’s Gemma 4 (a wild success), Nvidia’s Nemotron, and Arcee AI.
As ever-stronger closed models are built, previewed, and released, there will be more safety shocks, with claims that open-weight versions of the strongest AI models can never be allowed to exist, similar to reactions to Claude Mythos. These could spur burdensome regulation on open models.
With the above, there will also be increased long-term interest in open models, as sovereign entities and existing power structures realize the coming, super powerful AI tools cannot land in the hands of only one or a few companies. These entities will see open models as a different governance paradigm.
New funding structures for open models will emerge, as many stakeholders realize dependencies on single, for-profit companies for access to intelligence are unreliable.
Local agents, OpenClaw, and other personal agents represent a large and, to date, mostly ignored market for open model usage. It is a sort of dark matter: pervasive, with massive potential to influence the balance of open and closed models.
A single word governs this post and is intentionally repeated — complex.
This complex reality has been driving me to think more deeply about how to clearly describe the open model gap, and how I can simultaneously expect American closed labs to clearly pull ahead while acknowledging the fairly unequivocal evidence of the capabilities of recent open-weight models. More on the nuance in the open-closed gap in another piece coming soon, so please subscribe!
Let me know any positions that I missed.

