I've been having a discussion with some STEM buddies who aren't necessarily deep into ML/AI theory, but your paper and the questions of scaling seem to formalize some of the discussions we are having about the state and future of ML/AI - especially ChatGPT. Cheers.

I don’t claim correctness of understanding, just a lingering uncertainty over how big a model will need to get before it shows emergent behavior. And can we get to a lower-power calculation technique, or segment the calculation to constrain the compute cost?

## Scaling laws for robotics & RL: Not quite yet

