9 Comments
Tommaso Maria Ricci

The bet I'd add to the list: open weight communities will develop their own post-training ecosystems faster than anyone expects. Once fine-tuning becomes a commodity, the differentiation isn't the base model -- it's who has the best RLHF pipeline and labeled preference data. Your RLHF book is going to land at exactly the right moment for that transition. What's your read on whether the preference data problem gets solved through synthetic data or community-sourced labels?

Nathan Lambert

The likes of Tinker, Prime Intellect, and others in this area are growing very fast, so this is true already in a way.

The preference data and human feedback part of this is a very low % of the importance. It's mostly finetuning for specific capabilities.

Eduardo Farina

What are the top open weight labs in the US? And with the focus shifting to post-training, don’t you believe the gap between closed and open source solutions/harnesses can become smaller?

Nathan Lambert

Harnesses definitely. Copying my list of open model builders in the US here: https://www.interconnects.ai/i/179633798/whos-building-serious-open-models-in-the-us

Will

A few thoughts:

8 (and 10). U.S. regulators banning certain open models would be monumentally short-sighted and an unforced error that I pray they don't make. I don't want to discount the potential security issues that highly capable models (open or not) may invite, but bits will not be locked down by national borders as you point out. A ban would be somewhat akin to security through obscurity--it doesn't work.

12. Is there future potential for a Folding@Home equivalent with AI? The people buying GPUs for local inference may overlap heavily with those most interested in open models as a safeguard.

13. I speculated on LinkedIn that there may be only a short window in which local agents are used by anybody but the most dedicated hobbyists. I used running your own email server as an example of a case where we have collectively decided that we are fine mostly using a few centralized email providers in order to combat spam. I imagine this may end up being the case with AI agents as well.

Edward Grundy

I wonder if architecture innovations will slow and the difference will come from training data. If the issue is data specialisation, open source models must surely dominate.

Steve Newman

How do "Chinese open-weight labs focus slightly more on benchmark scores than comparable closed labs in the U.S." and "To date, closed models tend to be more robust and generally useful than similarly scoring open models" relate? The latter seems to imply a larger gap than the former, but perhaps I'm misreading.

Is your sense basically that if x=slightly ahead, then the leading (closed) US labs are 1x ahead on real-world utility relative to benchmarks (#3 on your list), and also 1x ahead on benchmarks, adding up to 2x ahead on usefulness (#4 on your list)?

aydi

As open models approach the Claude Mythos level of capabilities, we both expect there will be external pressure (presumably mainly from governments) to restrict model access, weights, and so on. I could see this coming more from the U.S./non-Western countries rather than China, which would have downstream effects of slowing down the adoption of U.S. open models. Is this something that has been factored into your prediction of “Early 2027” being the tipping point of shifting open model adoption patterns?

Nathan Lambert

I don't think Mythos-style pressure will block adoption; rather it will try to paint red lines around releasing *certain types of models*. It's largely not about building or using them, as those are both independent.

Personally, I think the Mythos model will most likely turn out to be a MINOR bump in risks, primarily because similar closed models will be proliferating widely in the next 12 months. But I have priced in some % chance that it is indeed a red line in performance, which would bring substantially more scrutiny to the next generation of open model capabilities, largely released in 2027.