Interconnects
(Voiceover) OpenAI's Reinforcement Finetuning and RL for the masses

The cherry on Yann LeCun’s cake has finally been realized.

Original post:

https://www.interconnects.ai/p/openais-reinforcement-finetuning

Chapters

00:00 Introduction

04:19 The impact of reinforcement finetuning’s existence

07:29 Hypotheses on reinforcement finetuning’s implementation

Figures

Fig. 1, Yann’s Cake

Fig. 2, Grader config

Fig. 3, RLVR learning curves

