Interconnects
Why reward models are still key to understanding alignment


In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
This is AI-generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
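For context on the episode's central question: a reward model in this setting is typically a language model backbone with a single-logit head, so scoring a completion is one forward pass that returns a scalar. A minimal sketch using Hugging Face transformers (the checkpoint, prompt, and response below are illustrative examples, not ones taken from the post):

# Score a (prompt, response) pair with a scalar reward model.
# The checkpoint here is one public example; any sequence-classification
# model with a single output label follows the same pattern.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "OpenAssistant/reward-model-deberta-v3-large-v2"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

prompt = "Explain why the sky is blue."
response = "Shorter wavelengths scatter more in the atmosphere (Rayleigh scattering)."

inputs = tokenizer(prompt, response, return_tensors="pt")
with torch.no_grad():
    # num_labels=1, so the model emits one logit per sequence: the scalar reward.
    reward = model(**inputs).logits[0].item()
print(f"reward: {reward:.3f}")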

Podcast figures:
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_004.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_009.png

0:00 Why reward models are still key to understanding alignment


Interconnects
Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories.