Discussion about this post

Tommaso Maria Ricci

The bet I'd add to the list: open-weight communities will develop their own post-training ecosystems faster than anyone expects. Once fine-tuning becomes a commodity, the differentiation isn't the base model -- it's who has the best RLHF pipeline and labeled preference data. Your RLHF book is going to land at exactly the right moment for that transition. What's your read on whether the preference data problem gets solved through synthetic data or community-sourced labels?

Eduardo Farina

What are the top open-weight labs in the US? And with the focus shifting to post-training, don't you think the gap between closed and open-source solutions / harnesses could narrow?
