Have you ever wanted to speak and hear what you said played back instantly in another language in your own voice? Now you can, according to Microsoft, which has just introduced technology purporting to do exactly that with English and Chinese. Microsoft says that the technology gets a higher proportion of words right than previous attempts at automated translation. Plus, it talks in your own voice, or at least a synthesized simulation of it.
The tech is very impressive, and it was quite accurate in its translation of simpler sentences, but I’m afraid that the translation aspect is still not quite ready for primetime. Microsoft is well aware of this, but just for fun, I thought I’d point out a couple of the more blatant examples where the translation robot went awry:
Original English sentence: “So now we’re taking the things that I’m saying and we’re converting them into Chinese text.”
Meaning of translated Chinese sentence: “So now the things we want I shouldn’t be saying, we put them in China.”
Original English sentence: “I’m speaking in English, and hopefully you’ll hear me speak in Chinese in my own voice.”
Meaning of translated Chinese sentence: “I am am [sic] speaking in English, hope you can hear me speaking in a Chinese person’s voice.”
So, simultaneous interpreters aren’t out of a job just yet. With that said, the technology is still very impressive. I wasn’t totally sold on the hearing-your-own-voice thing (it still just sounded like robot speech to me), but I have to admit that the software got some sentences right that I wasn’t sure it would. Moreover, some of the errors in translation were really the result of errors in transcription, which means that the translation should get even better once Microsoft can teach the software to better understand the original English input. With a few more years of development, the technology could be ready for practical uses, although I don’t think I’d want to use it for anything too important until it has been quite thoroughly tested.
If you’re interested, we’ve embedded the video of Microsoft’s presentation below. The translation discussion begins around 6 minutes in, and the speech-to-speech translation occurs in the last minute and a half, for those who’d like to skip to the highlights: