Sitemap - 2024 - Interconnects

2024 Interconnects year in review

(Voiceover) 2024 Interconnects year in review

OpenAI's o3: The grand finale of AI in 2024

(Voiceover) OpenAI's o3: The grand finale of AI in 2024

The AI agent spectrum

(Voiceover) The AI agent spectrum

OpenAI's Reinforcement Finetuning and RL for the masses

(Voiceover) OpenAI's Reinforcement Finetuning and RL for the masses

Interviewing Finbarr Timbers on the "We are So Back" Era of Reinforcement Learning

OpenAI's o1 using "search" was a PSYOP

(Voiceover) OpenAI's o1 using "search" was a PSYOP

OLMo 2 and building effective teams for training language models

(Voiceover) OLMo 2 and building effective teams for training language models

Tülu 3: The next era in open post-training

(Voiceover) Tülu 3: The next era in open post-training

Scaling realities

(Voiceover) Scaling realities

Saving the National AI Research Resource & my AI policy outlook

(Voiceover) Saving the National AI Research Resource & my AI policy outlook

Interviewing Tim Dettmers on open-source AI: Agents, scaling, quantization and what's next

Interviewing Andrew Carr of Cartwheel on the State of Generative AI

Why I build open language models

(Voiceover) Why I build open language models

Artifacts Log 5: Deepseek's Janus, I'm writing a Mini RLHF book, Qwen 2.5, video datasets, audio models, and more

Claude's agentic future and the current state of the frontier models

(Voiceover) Claude's agentic future and the current state of the frontier models

Interviewing Arvind Narayanan on making sense of AI hype

Building on evaluation quicksand

(Voiceover) Building on evaluation quicksand

Interviewing Andrew Trask on how language models should store (and access) information

How scaling changes model behavior

AI Safety Culture Confronts Capitalism

[Article Voiceover] AI Safety's Crux: Culture vs. Capitalism

Interviewing Riley Goodside on the science of prompting

[Article Voiceover] Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem

Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem

Artifacts Log 4: Reflection 70B, o1 on LMSYS, fine-tuning fine-tunes, and speech models

[Article Voiceover] Reverse engineering OpenAI's o1

Reverse engineering OpenAI’s o1

Futures of the data foundry business model

A post-training approach to AI regulation with Model Specs

OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference

OLMoE and the hidden simplicity in training better foundation models

On the current definitions of open-source AI and the state of the data commons

On Nous Hermes 3 and classifying a "frontier model"

Nous Hermes 3 and exploiting underspecified evaluations

Artifacts Log 3: Synthetic math and Magpie datasets, another 1T param model, and many Mistral models

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

A recipe for frontier model post-training

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

GPT-4o-mini changed ChatBotArena

Navigating Interconnects

Llama 3.1 405B, Meta’s AI strategy, and the new, open frontier model ecosystem

SB 1047, AI regulation, and unlikely allies for open models

Artifacts Log 2: Gemma 2, more Chinese LLMs, high quality datasets, and domain-specific training

Switched to Claude 3.5

Interviewing Dean Ball on AI policy: CA SB 1047, upcoming AI disaster response, Llama 3 405B, Chinese open-source AI, and scaling laws

RLHF roundup: Getting good at PPO, sketching RLHF’s impact, RewardBench retrospective, and a reward model competition

RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition

Frontiers in synthetic data

Text-to-video AI models are already abundant, but the products?

Text-to-video AI is already abundant

AI for the rest of us

A case study in reproducibility of evaluation with RewardBench

A realistic path to robotic foundation models

We aren’t running out of training data, we are running out of open training data

Name, image, and AI’s likeness

Artifacts Log 1: Announcement, Llama 3 fine-tunes, SOTA reward model, human prompt datasets...

OpenAI chases Her

OpenAI’s Model (behavior) Spec, RLHF transparency, and personalization questions

ChatBotArena: The peoples’ LLM evaluation, the future of evaluation, the incentives of evaluation, and gpt2chatbot

RLHF: A thin line between useful and lobotomized

How RLHF works, part 2: A thin line between useful and lobotomized

Phi 3 and Arctic: Outlier LMs are hints

AGI is what you want it to be

Llama 3: Scaling open LLMs to AGI

Stop "reinventing" everything to "solve" alignment

The end of the “best open LLM”

We disagree on what open-source AI should mean

Why we disagree on what open-source AI should be

DBRX: The new best open model and Databricks’ ML strategy

Evaluations: Trust, performance, and price (bonus, announcing RewardBench)

Model commoditization and product moats

The koan of an open-source LLM

Interviewing Louis Castricato of Synth Labs and Eleuther AI on RLHF, Gemini Drama, DPO, founding Carper AI, preference data, reward models, and everything in between

How to cultivate a high-signal AI feed

Google ships it: Gemma open LLMs and Gemini backlash

10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more

Releases! OpenAI’s Sora for video, Gemini 1.5's infinite context, and a secret Mistral model

($) Discord Access

Why reward models are key for alignment

Why reward models are still key to understanding alignment

Alignment-as-a-Service: Scale AI vs. the new guys

Open Language Models (OLMos) and the LLM landscape

Model merging lessons in The Waifu Research Department

Local LLMs, some facts some fiction

Multimodal blogging: My AI tools to expand your audience

RLHF learning resources in 2024

Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions

Where 2024’s “open GPT4” can’t match OpenAI’s

It's 2024 and they just want to learn