Discussion about this post

Rainbow Roxy

Thanks for writing this, it clarifies a lot. I totally agree that challenges in scaling RL, especially for academics, really mirror engineering headaches seen with MoE models. Your insight about predicting the learning curve is spot on; it’s such an important 'hill' to understand.

JP

Did the scaling law hold up for your historical training runs?
