14 Comments
Nimish Sanghi:

Like the balanced analysis of the issue.

Nathan Lambert:

Is why I do what I do!

Jordan Schneider:

Or DeepSeek was just sneakier about it, and 150K is only what they could attribute with confidence

“In the scale of training a language model, 150K samples is only scratching the surface as a substantive experiment. It looks like they were experimenting with some rubrics, which could’ve been for an online RL run, but that’s extremely unlikely with how distributed the access was, and then some minor stuff on completions for sensitive queries. This usage of Anthropic’s API will have a negligible impact on DeepSeek’s long-rumored V4 model (or whichever model the data here contributed to). This was also very likely a small team at DeepSeek and unknown to much of the broader training organization.”

Nathan Lambert:

What Anthropic found is just the tip of the iceberg (most likely).

Jordan Schneider:

Poor DeepSeek intern…

Jordan Schneider:

No reason for ByteDance and Alibaba not to be doing the same thing; my guess is they were just sneakier

Nathan Lambert:

They have way more GPUs? At least ByteDance does.

Polymathematics:

what makes one learning technique more or less ethical than others?

isn't the entire business of this current generation of AI all about coming up with novel ways to engineer a learning process?

outside of specific points (e.g. ToS violations, which would need to be debated in court for a firm answer), the entire surface area of innovation should be as open as possible

and if it's not, the principle should apply equally - not to specific jurisdictions or to arbitrarily chosen techniques

Chinese Cooking Demystified:

AI companies: scrape the output of the entire internet without attribution

Also AI companies: “how dare you scrape the output of our tool!”

I like their framing of it as an “attack”. If that’s the case, then the entire industry’s been pillaging content for years

Jonathon P Sine:

Have you seen this video and his thread? Curious about your opinion. He also raises a serious contention re: the claims on MiniMax and Moonshot that I'm not sure you note (about halfway through the vid). Cheers https://x.com/theo/status/2026199981179449409?s=46

Nathan Lambert:

I don't think Anthropic would lie about this. I also don't think doing what the Chinese labs are doing is particularly bad faith, as the LLM terms of service have been routinely violated for years.

Kevin Xu:

"It’s clear from their open research that Chinese labs have excellent RL infrastructure, despite the compute shortages."

Is this at least in part due to the resources needed for strong RL environments being skewed more towards CPUs, and access to CPUs being less constrained and falling more or less outside the current export control regime? (Of course, not discounting that Chinese AI labs' talent is strong, constraints breed innovation, etc. etc.)

Nathan Lambert:

I think getting RL right is mostly hard infra problems and needing good GPUs. The CPUs matter very little.

messyfork:

Are tasks like coding still GPU-limited in post-training/RL? Maybe it's just from reading the SemiAnalysis post about CPU demand going up, but it made sense to me that just providing enough coding playgrounds for agents to RL in might have become the bottleneck.