On Learning Chinese, Brains, and AI

Ron Lunde
3 min read · Feb 20, 2023

Last year I decided to learn a tiny bit of Chinese (中文). I don’t have a good excuse; I’m not planning a trip or anything. It just sounded really difficult (hence fun), and very different from anything I’ve ever tried. I’ve never been exposed to a tonal language, or to a character-based writing system (the characters are called logograms) like the “simplified Chinese” that online sources primarily teach.

I haven’t actually learned much yet, but I’ve noticed some things I’m finding even more interesting than the language itself — namely, how my brain is (or mainly, isn’t) working.

For example, I’ve discovered that I can read what I’ve learned so far relatively easily, and I can pick out characters on the screen and assemble them into a sentence, but I can’t even begin to recall and draw most characters with a pen and paper.

I can usually hear the difference between tones, but when I try to pronounce them I sound the way I do when the dentist has numbed my mouth, stuck various sharp things in there, and asks “so, you have any fun plans for the weekend?” (Come to think of it, if anyone could understand my Chinese, I’ll bet a dentist could!)

I’ll get to the AI part shortly.

That odd difference between input and output reminded me of a book I read ages ago by Carl Sagan, Broca’s Brain. The title essay, according to Wikipedia, “is named in honor of the French physician, anatomist and anthropologist Paul Broca (1824–1880). He is best known for his discovery that different functions are assigned to different parts of the brain.” More important for me right now: Broca’s area is a part of the brain’s frontal lobe used for language production.

There’s a whole different part of the brain, Wernicke’s area, that has more to do with language reception.

If someone suffers brain damage they might have aphasia (a “lack of” language), but it can take different forms. They might be unable to understand speech but still able to produce it, or unable to speak but still able to understand. (Or a lot of other, even weirder types.) Cool, right? (And tragic.)

Here’s the AI part: it may seem like AI research and development is divided and specialized, with each skill progressing independently, but some of those skills are starting to come together. I’ll just refer to OpenAI models here for simplicity.

ChatGPT gives the appearance of “understanding” input text as well as generating it (it appears to carry on a conversation because the most recent exchanges are quietly fed back in as part of the next input, letting the model “continue” the conversation), but it’s mostly just generating text. So it’s analogous to Broca’s area.
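To make that concrete, here’s a minimal sketch of how conversation “memory” can be faked on top of a stateless text generator by replaying the transcript every turn. The generate() function below is just a stand-in for a call to some completion model, not OpenAI’s actual API, and the prompt format is made up:

```python
# Sketch: faking chat "memory" by re-feeding the transcript each turn.
# generate() is a placeholder for a real text-completion model call.

def generate(prompt: str) -> str:
    """Placeholder for a call to a text-completion model."""
    return "(model reply to: ..." + prompt[-40:] + ")"

history = []  # list of (speaker, text) pairs

def chat(user_text: str) -> str:
    history.append(("User", user_text))
    # Replay the whole conversation so the model can "continue" it.
    prompt = "\n".join(f"{who}: {text}" for who, text in history) + "\nAssistant:"
    reply = generate(prompt)
    history.append(("Assistant", reply))
    return reply

print(chat("What does 中文 mean?"))
print(chat("How do I pronounce it?"))  # the first exchange rides along in the prompt
```

The model itself stays stateless; all the “conversation” lives in the text that gets stuffed back into the next prompt.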

Whisper gives the appearance of “understanding” speech (by turning it into text). It uses a Transformer architecture to achieve the same kind of thing we do when we use context to understand speech. Consider “stake” versus “steak”. Even though they sound exactly the same, we would know which was meant without thinking if someone said “she drove a stake through the vampire’s heart” versus “the doctor said he had to cut back on steak for his heart”. Whisper seems to me to be analogous to Wernicke’s area.
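For what it’s worth, the open-source whisper package makes it easy to try this yourself. A minimal sketch (the model size and the audio file name here are just examples):

```python
# pip install openai-whisper
import whisper

# Load a pretrained checkpoint; "base" is small and fast,
# larger ones ("small", "medium", "large") are slower but more accurate.
model = whisper.load_model("base")

# Transcribe an audio file; the file name is made up for illustration.
result = model.transcribe("vampire_story.mp3")

print(result["text"])  # e.g. "She drove a stake through the vampire's heart."
```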

The cool thing is that AI seems to be getting really good at individual pieces of cognition, much as our brains handle them in separate areas: language generation, speech recognition, image generation, image recognition, planning (e.g. AlphaGo), and so on.

So, “all we need to do” is put the pieces together. I believe there is some disagreement among researchers about whether that’s best done with a single gigantic multimodal model or via separate subsystems. It’ll be fun to see what develops!

It’ll be interesting to see if I ever put the pieces together in Chinese as well. It’s OK if it takes a while. As it says in the Tao Te Ching:

真正的旅行者没有确定的计划，也不在意到达目的地

“A good traveler has no fixed plans and is not intent upon arriving”

Coincidentally, that’s my current approach to learning about modern AI too. (I went to grad school to study it long ago, but that was when we were mostly scratching algorithms on cave walls.)

