Making sense of research casting doubt on the potential of RLVR and where I'm optimistic for the next phase of scaling.
Reinforcement learning with random rewards…