Hi Nathan, great write-up! What do you think of the implications for data, given that we didn’t naturally have action data in pre-training? The Tongyi-Research series papers start to add multi-step tool-use trajectories into pre-training. Would this save us from hitting the scaling-law wall on the data front?
If “thinking, searching, acting” are the new primitives, maybe weights aren’t the real differentiator anymore. Is the real race less about bigger models and more about the scaffolding we build around them?