Decoupling AI from the latent variable of spoken languages
How English being the language of progress in AI could bias our machines’ minds and what we think they are capable of. Uncoupling AI from our notion of language is frightening.
I’m not one for hype on artificial general intelligence (AGI): I do think we should study methods to mitigate its risk, but I don’t think “it” is happening in my lifetime. I am interested in how human biases will impact the path we take to get there. The path towards AGI is bound to create many impactful inventions. We are in a time when many of our brightest minds are attempting to re-create the conditions their own brains use to succeed [1].
It is clear that the potential for disruption and surprise is high with respect to AI. Regulation of digital technologies will become tighter, but given the corporate usefulness of (and therefore investment in) AI, innovation is a matter of when rather than if. The most compelling people around me are unique, driven, and somewhat unpredictable. These types of people are creating AIs as a reflection of their image and their worldview. These biases are a feature, not a bug, and a feature worth studying (yes, there can be harmful features).
There are a few specific environmental conditions that bias each of us, and the AI industry at large. The two variables that have interested me of late:
Educational and experiential background (such as in my post comparing how different engineering degrees train people to think), and
Language, society, and other large scale structures that constrain thought.
Heuristically, this list splits into micro effects and macro effects on neural development. The micro effects fine-tune what people work on and their perspective, while the macro effects exert large, frequently uninvestigated pressures on the vector of technological progress.
This post focuses primarily on how our language could affect AI development through the lens of how our languages impact our cognition and perception.
How Current AIs Think
It is important to consider how the AIs we rely on today think. Many people would say they are not very intelligent in the clever sense of the word, but they are powerful via computation. Most of our AIs generalize via data and pass information forward nearly exclusively (RL is an interesting divergence from this theme), so they still represent relatively simple cognitive structures: advanced lookup tables of sorts (a simplification). This need not be the case going forward.
Deep learning will slow, and new techniques will emerge. I expect some of these techniques to depend more directly on notions of time (feedforward, feedback, mixed-up connections). Newer sorts of intelligence are likely to have more direct notions of motion (NNs seem like static compute graphs), and these motions are likely to involve a wider breadth of descriptions. These descriptions will start in our own language, but as computers gain their own agency, this may no longer be the case.
(Some points about how neural nets are thinking are debatable. I am curious what you think, so feel free to comment below.)
Through the Language Glass: Why the World Looks Different in Other Languages
This book was the starting point for my question at the intersection of linguistics and technological anthropology. I finished reading this book by Guy Deutscher, and it really made me question how my mind works. Great content, okay writing. I still would recommend it.
Anyway, this is not a book review. This is a brief summary of why the world looks different in other languages so that we can discuss how AI being built in other languages, such as Chinese (the likely next candidate), would (maybe) change things. The crucial theory of the book is that different cultures develop different languages to reflect habitual, environmental, and biological differences between groups. Linguistically, English is a language with some weird properties, but it is generally capable of expressing what the speaker wants in the level of detail they want.
Let me explain. Some languages require gendering nouns (some languages even have one word for he and she! See Wikipedia) or explicitly including when an event happened (recent, somewhat recent, or long ago) whenever recalling the past. In English, it is at the speaker’s discretion to include these details when they deem them necessary.
Additionally, the language in which the brain thinks can change how it sees the world. Experiments have studied the perceived distance between colors, and when English and Russian speakers compare color swatches, their judgments can diverge. Russian has two blues, light blue (голубой, goluboy) and dark blue (синий, siniy), so two colors near a central (maybe navy) blue can seem far apart because they are classified under two different words (paper).
An important item to note in the context of AI is that languages do not constrain what we are capable of thinking, just what can be readily expressed (people have run experiments with aboriginal peoples, who learn to express modern concepts that do not have a word in their native tongue). There is a lot more to be studied in terms of linguistics intersecting with neuroscience, but this does mean a lot of things we may assume to be true are a reflection of our language rather than our being.
This has got me thinking about how our language impacts what we think AIs are capable of, and how AIs may be capable of thinking in ways that we cannot describe with words.
Latent structures
Do human languages have common latent variables? Latent structures are the processes that determine what we think and do, mostly behind the scenes. Computers will likely have their own set of latent structures we cannot understand.
Latent variables
Important to understanding this post and its implications is the concept of latent variables. Latent variables are underlying properties of a system that cannot be measured directly. In modern machine learning, these show up everywhere from generative models broadly to deep generative models to autoencoder structures. Generally, generalization comes from moving the data into a latent space that is more conducive to solving the optimization. (Visualization of the latent space of visual models has always been intriguing. If you have an example for audio or another type of deep learning model, I would love to explore it.)
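To make the idea a bit more concrete, here is a minimal sketch of a latent variable in the simplest setting I can think of. Everything in it is an assumption for illustration: the toy data, the function names (generate, encode, correlation), and the choice of a linear projection instead of a deep autoencoder. Two observed features are driven by one hidden number z, and projecting the data onto its direction of largest variance recovers something that tracks z closely, even though z itself is never shown to the encoder.

```python
import math
import random


def generate(n, seed=0):
    """Sample 2-D observations driven by one hidden (latent) variable z."""
    rng = random.Random(seed)
    zs, points = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)            # latent variable, never observed directly
        x = 2.0 * z + rng.gauss(0.0, 0.1)  # observed feature 1 (toy assumption)
        y = 1.0 * z + rng.gauss(0.0, 0.1)  # observed feature 2 (toy assumption)
        zs.append(z)
        points.append((x, y))
    return zs, points


def principal_direction(points):
    """Return the data mean and a unit vector along the largest-variance axis."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def encode(points):
    """Map each 2-D observation to a 1-D latent code (a linear 'encoder')."""
    (mx, my), (ux, uy) = principal_direction(points)
    return [(p[0] - mx) * ux + (p[1] - my) * uy for p in points]


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


if __name__ == "__main__":
    zs, points = generate(500)
    codes = encode(points)
    print("correlation between hidden z and recovered code:",
          correlation(zs, codes))
```

In a deep model the encoder is nonlinear and the latent space has many dimensions, so the recovered structure is far harder to name than a single axis, which is really the point of this section.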
Latent variables are structures generated internally to a model. The human brain’s latent variables are almost entirely not understood, and that is beautiful. Not understanding the latent variables of AIs we create is interesting, but potentially off-putting when these get more power.
Latent bias
Languages are not created equal (did languages start in one region, or did multiple human tribes develop them concurrently?). As a case in point very relevant to ideas of robotics and control, there are languages that do not have ways of expressing ideas in local coordinates. Speakers of Guugu Yimithirr can only express locations and events in the context of global coordinates: north, south, east, and west. Due to this, the speakers become experts at describing events this way. Some cultures with such languages also have traditions that help develop the skill, such as only sleeping in certain cardinal directions. It is far-fetched given English’s more modern upbringing, but imagine if we were working on AI in a language with such a limitation. As a roboticist, trying to visualize a world without body and inertial frames seems daunting.
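For the non-roboticists, here is a rough sketch of what moving between a body frame and a global frame looks like when reduced to four directions. The names (EGOCENTRIC_OFFSETS, to_cardinal, to_egocentric) and the compass convention (0 degrees is north, 90 is east) are my own assumptions for illustration, not anything from the book. An English speaker says "left" and leaves the heading implicit; a speaker restricted to global coordinates has to track the heading at all times to say the same thing.

```python
CARDINALS = ["north", "east", "south", "west"]

# Clockwise offset, in degrees, of each body-relative word from "ahead".
EGOCENTRIC_OFFSETS = {"ahead": 0, "right": 90, "behind": 180, "left": 270}


def to_cardinal(relative, heading_deg):
    """Convert a body-frame word into a global cardinal direction.

    heading_deg is the compass heading the speaker faces
    (assumed convention: 0 = north, 90 = east, 180 = south, 270 = west).
    """
    bearing = (heading_deg + EGOCENTRIC_OFFSETS[relative]) % 360
    return CARDINALS[round(bearing / 90) % 4]


def to_egocentric(cardinal, heading_deg):
    """Convert a global cardinal direction back into a body-frame word."""
    bearing = CARDINALS.index(cardinal) * 90
    offset = (bearing - heading_deg) % 360
    for word, degrees in EGOCENTRIC_OFFSETS.items():
        if degrees == round(offset / 90) % 4 * 90:
            return word
    raise ValueError("no matching body-frame word")


if __name__ == "__main__":
    # The same "left" names a different cardinal direction for each heading.
    for heading in (0, 90, 180, 270):
        print(heading, to_cardinal("left", heading))
```

The body-frame word is compact but meaningless without the heading, while the global word is unambiguous but requires constant bookkeeping. Each language bakes in one of those trade-offs.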
This type of idea, or what our language could be missing, really rings clear in the context of machine intelligence. English may lack a term that is central to the way computers process information in the future. There are likely some shortcomings already, but sadly I don’t have the time to pursue a Ph.D. in linguistics on top of the one I am making my way through now.
There could be a developing mismatch between what our programs are doing and what we are capable of explaining. In the context of creating new forms of intelligence, this would be one of the less surprising places to find a concept we struggle to create a new word for (as opposed to something like a new fruit, such as a pluot).
Untapped potential
If we are in a simulation, humans developing AI is a similar study to humans looking at the languages of aboriginal tribes not influenced by external society. Inevitably, our languages are optimized for creating functioning cultural societies in small tribes and are evolving to serve large-scale societies (the evolutionary time constant on languages seems to be much faster than that of genetics, but it is not well studied). Our languages are not designed to be optimal for computers.
There is a whole other discussion here around programming languages, but I think programming languages are a bit more constrained than the potential of AI. Maybe this is me expecting languages to be static, but a computer could re-create a vernacular to fit its needs and purpose. The latent variable of an emergent language inside a general machine intelligence would be fascinating (especially one that could communicate with us). A singleton does not need to share its language, but it could benefit from it.
Hopefully, this post fed into your imagination and reminded you that so much more than we can visualize is possible (beyond even our technological infancy, I am talking about constraints on our mind!).
Snacks
I’ve paused the Tangentially Related newsletters, so I will include the best here.
Thoughts
I put together a resource on visualization and the art of paper writing. I am thinking a lot about luck, growth, design, and the complex job of figuring out what people want to see (blog coming soon). There is untapped potential in this blog and in the thoughts of so many of you, but getting these thoughts down and in front of people who would enjoy it is the problem.
Reads
I am learning about tech policy and how things could set back technological development if not done carefully.
I am reading Untamed, and it is a great reminder that it is worth being yourself and you are the only one who can.
I made a new website where you can learn about machine learning, robotics, and me.
If you want to support this: like, comment, or reach out!
[1] Whole brain emulation is an aside; that science is further from being developed (https://en.wikipedia.org/wiki/Mind_uploading). See Superintelligence if interested.