A proposal for a new definition of an "open source" LLM and why no definition will ever just work.
This is AI-generated audio made with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/an-open-source-llm
00:00 The koan of an open-source LLM
03:22 A new naming scheme for open LLMs
07:09 Pivot points and politics
08:16 Claude 3, arms race, commoditization, and national security
10:01 Doomers debunking bio risks of LLMs themselves
11:21 Mistral's perceived reversal and the EU
13:22 Messy points: Transparency, safety, and copyright
13:32 The muddling of transparency
15:22 The muddling of "safety"
16:30 The muddling of licenses and copyright
20:12 Vibes points and next steps
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_046.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_064.png
The koan of an open-source LLM