Monday, 10 November 2014

Computer speech production vs recognition

I wondered here (or maybe I never posted it) why the early computer scientists and science fiction authors believed that speech recognition would be easy for computers, but speech production would be difficult. A lot of sci-fi has robots and computers taking in verbal commands from humans and producing text in response, but in practice the difficulty ran the other way. If you give a computer some text, you can get a passable vocal representation of it easily enough, but recognising speech took a lot longer to get to that same point.

I think now I have some insight into why that misconception may have gained traction. I am currently (well, not this very second, but during this time in my life) watching my nephew develop the twin powers of speech and comprehension of the English language. It is clear that he understands a lot more words than he is capable of producing. He can point to his nose, eyes, head, knees, belly and so on in response to questions, but he can't say all of those words yet, or at least not clearly. He takes in whole sentences and produces one- or two-word responses that are a little slurred or clipped off. Speech is really hard, but recognition is coming along.

And that, I thought to myself, must be what they believed in those days. A computer is like a child, they thought. We teach it things and it responds as we have taught it. Children learn to recognise speech first, before they can produce it well, so that's probably what will happen with computers, too.

The thing is, a computer is really nothing like a brain. We've lived with that metaphor for so long that we often get mad at our technology as if it were trying to thwart us. In many ways, though, a computer is the opposite of a brain. That's why we built them that way. It's why they became such useful tools: because they do things in ways our brains cannot. At their best, computers complement our abilities, and we both become better for it. So if you find yourself thinking about your computer or phone in brain terms, turn it around and think the opposite, as best you can.

Mokalus of Borg

PS - Assuming "the opposite" is clear.
PPS - Which it might not be.
