How Google Plans to Clone You
Mon, December 06, 2010
Computerworld -- Google (GOOG) last week acquired Phonetic Arts. The U.K.-based company specializes in technology that transforms a recorded voice into a computer-generated voice that sounds like the recording. In other words, it "captures" the tonal qualities, cadence and rhythm of how a real person talks, and applies them to a machine voice. The result is that a computer will be able to read any text and sound convincingly like the original speaker.
The voice Google wants to capture is yours.
Better than Star Trek
In Gene Roddenberry's original Star Trek TV series, the characters interacted with computers by talking, and the computers talked back. Although this was a breathtakingly advanced concept in the late '60s, it turns out that the real future is far more interesting.
In '60s sci-fi, voice interaction with a computer was generic, used for input and output, for commands and responses. It wasn't customized, and it certainly wasn't personalized.
A much more accurate fictional account of where voice interaction is going comes from William Gibson's 1984 novel "Neuromancer." In that book, people have virtual versions of themselves, each represented by a 3D computer scan of the person's face and a computer-generated version of the person's voice, backed by artificial intelligence and data about the real person.
Gibson expanded on the concept in another novel, called "Mona Lisa Overdrive." In that work, people could record their personalities on storage media. "They respond, when questioned, in a manner approximating the response of the subject."
Google's vision is more Gibson than Roddenberry. Consider the following intersecting trends:
Synthetic voice is ready for prime time
Years ago, Microsoft demonstrated speech technology roughly similar to what Phonetic Arts has developed. In the demo, you could type any word or phrase, which would be read back to you in the voice of either Elvis Presley or Marilyn Monroe. Of course, Microsoft (MSFT) these days is the Xerox (XRX) PARC of the technology industry from a research and development standpoint -- it invents revolutionary technology, but can't seem to ship it.
Movie critic Roger Ebert was in the news earlier this year because he began using a synthetic voice that sounds just like his real voice. Ebert lost the capacity to speak in 2006 because of thyroid cancer. A Scotland-based company called CereProc captured recordings of Ebert's voice from TV and from DVD commentaries, then used them to generate a custom computer voice.
The technology is ready for prime time, and it will only improve. Phonetic Arts is at the forefront, and now Google owns it.


