Discussion about this post

Yaroslav Bulatov:

Hi Nathan, I enjoyed this post :) Also, I'm very impressed with the work AI2 is doing. After watching Hanna's talk at COLM, we got excited about OLMo and started the #ext-ai2-together collaboration channel between AI2 and Together (join!)

I have a more benign view of Meta's role in this. First of all, I'm passionate about open-source AI; I wrote about it five years ago, for instance (https://medium.com/@yaroslavvb/large-scale-ai-and-sharing-of-models-4622ba59ec18). I also worked at Google and Meta before Together.AI, which gives me an internally fine-tuned model of their operations.

Meta has a different source of funding from non-profits; they don't compete for it against AI2. Llama's success shouldn't hamper the development of truly open models. If anything, it should help: having a "partially open" foundation model at your fingertips should make it easier to build a great "truly open" foundation model. Imagine building a car from scratch versus having a prototype you can fully disassemble for inspiration.

There's an LLM built into every one of Meta's products nowadays, so they have to train a high-quality LLM anyway. Once the model is trained, it doesn't cost them anything to open-source it, so why not? This is similar to how Meta motivates open-sourcing their hardware designs through the Open Compute Project.

Average tenure at Meta is under two years, short relative to the rest of Big Tech, so open-sourcing an internal project makes employees more productive: people work harder knowing they can benefit from their work after they leave the company. Engineers can also influence these outcomes by obtaining executive buy-in for decisions that ultimately benefit them personally. It would be a strange coincidence that Meta relinquished control of PyTorch to the Linux Foundation shortly before key members of the PyTorch team left to start Fireworks. In addition to what Zuck says, there are many people inside Meta who are personally invested in having the weights of Llama publicly available, and they know how to make that case internally.

Google is secretive about their hardware designs because they are in the business of selling compute; Meta is not, so they share. OpenAI is secretive about their model weights because they are in the business of selling access to those weights; Meta is not, so they share. On the other hand, Meta would not voluntarily share the social network graphs they have discovered or the media users have uploaded to them, because that is their core business.

Ani N:

I broadly agree with this post that the benefits of open-source AI currently clearly outweigh the risks, and I have used the OLMo checkpoints in my own research. However, I do think that the statement:

"Many of the risks that “experts” expected open-source models to proliferate, from disinformation, biosecurity risks, phishing scams, and more have been disproven"

does not reflect the paper you cite, which instead shows that existing work is insufficient to prove that the risks exist. Furthermore, it's been a year since high-quality open-source LLMs were released, and it will be years until we know all the ways these models can be used. It could be that Llama 3 405B with the right finetuning and a voice model is so good at running phone-based spearphishing scams that Meta comes under fire for releasing it. Maybe it requires learning how to browse the web on a dataset open-sourced tomorrow. We likely won't know until it starts happening, or until a red team attempts it and fails (which has happened for some use cases, like biosecurity risks).

Despite this, it is critical that FULLY open models like OLMo keep up, because the difference for bad actors between just Llama being released and both OLMo and Llama being released is minimal, but the difference for researchers is enormous.

Thanks for reading!

