Making sense of research casting doubt on the potential of RLVR and where I'm optimistic for the next phase of scaling.
Reinforcement learning with random rewards…