Reasoning & Inference Compute
The rise of reasoning machines
And a debate that doesn't warrant repeating.
Jun 12 • Nathan Lambert
Interconnects
What comes next with reinforcement learning
Scaling RL, sparse rewards, continual learning, and the progress wall when pretraining really stops.
Jun 9 • Nathan Lambert
A taxonomy for next-generation reasoning models
Where we've been and where we're going with RLVR.
Jun 4 • Nathan Lambert
Reinforcement learning with random rewards actually works with Qwen 2.5
Making sense of research casting doubt on the potential of RLVR and where I'm optimistic for the next phase of scaling.
May 27 • Nathan Lambert
OpenAI's o3: Over-optimization is back and weirder than ever
Tools, true rewards, and a new direction for language models.
Apr 19 • Nathan Lambert
RL backlog: OpenAI's many RLs, clarifying distillation, and latent reasoning
Notes I forgot to publish. Closing some loose ends in the reasoning model discussions.
Apr 5 • Nathan Lambert
Recent reasoning research: GRPO tweaks, base model RL, and data curation
The papers I endorse as worth reading among a cresting wave of reasoning research.
Mar 31 • Nathan Lambert
Gemini 2.5 Pro and Google's second chance with AI
The end of a busy spring of model improvements and what's next for the presumed leader in AI abilities.
Mar 26 • Nathan Lambert
Where inference-time scaling pushes the market for AI companies
Fundamentals emerging downstream from the RL reasoning models.
Mar 5 • Nathan Lambert
Claude 3.7 thonks and what's next for inference-time scaling
The latest reasoning model and what it says about the direction of inference time compute and RL training.
Feb 24 • Nathan Lambert
An unexpected RL Renaissance
New talk! Forecasting the Alpaca moment for reasoning models and why the new style of RL training is a far bigger deal than the emergence of RLHF.
Feb 13 • Nathan Lambert
Why reasoning models will generalize
People underestimate the long-term potential of “reasoning.”
Jan 28 • Nathan Lambert