Discussion about this post

David Berreby

Great post, Nathan. Curious about a couple of things.

1. What's the equivalent of a word (or part of a word) in the realm of actions? IOW, what does a token represent in the world of actions? "Text in, audio out" works because there is a huge database of sound on which the model was trained, right? But how can there ever be a huge database of actions, given that actions depend on the specific embodiment and specific context of each robot? (I hope it's obvious that this isn't a rhetorical question. I'd really like to better understand how robot-makers are arriving at these models.)

2. Don't you think there is going to be a lot of hesitation about letting human tele-operators peer into our diaper changes and snack-sneaking and other home situations? I bet people will be creeped out at the thought. Or do you think convenience is going to trump such concerns, as it has in the past? In the current anti-tech climate, I just wonder if people are going to let home-helper robots collect this kind of data (especially when a human backup driver is involved).

Rs

Can you elaborate on your view of Covariant? Are you saying they are using an older approach and not adapting to a tokenized world?
