Sunday, October 16, 2011

Computer speech recognition and voice-user interface

A voice-user interface (VUI) is an interface that lets a person communicate with a computer by voice (speech). Today this probably means giving simple instructions and/or going through an automated process over the phone. But why only that? The potential for communicating with your voice is huge! It is the way humans communicate with each other, so why couldn't it be the way we communicate with computers and electronic devices as well?



Well, for a start, the way a computer understands speech is by breaking the digitized waveform down into its frequency components. It then compares the resulting spectrogram pattern to known patterns for particular sounds. This is a very complex thing to do, and a lot can go wrong along the way; heavy accents, for example, are easily misunderstood.
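To make that a bit more concrete, here is a toy sketch in Python of the two steps just described: turning a waveform into a spectrogram, and comparing it to known patterns. It is only an illustration, not how a real recognizer such as Siri works. The "sounds" are plain sine tones, and the function names, the frame size of 64 samples, the 8000 Hz sample rate and the cosine-similarity comparison are all assumptions picked to keep the example small.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude of each frequency bin (0 .. N/2) of one frame, via a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_size=64, hop=32):
    """Split the waveform into overlapping frames and analyze each one."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window softens the frame edges before the frequency analysis.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1)))
            for i, s in enumerate(frame)
        ]
        frames.append(dft_magnitudes(windowed))
    return frames


def average_spectrum(spec):
    """Collapse a spectrogram into one average spectrum (a crude 'pattern')."""
    bins = len(spec[0])
    return [sum(frame[b] for frame in spec) / len(spec) for b in range(bins)]


def cosine_similarity(a, b):
    """Similarity of two spectra; higher means more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tone(freq_hz, sample_rate=8000, num_samples=400):
    """Hypothetical stand-in for a recorded sound: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(num_samples)]


def classify(samples, templates):
    """Return the name of the known pattern that best matches the input."""
    features = average_spectrum(spectrogram(samples))
    return max(templates, key=lambda name: cosine_similarity(features, templates[name]))


# "Known patterns for particular sounds": here just two reference tones.
templates = {
    "low": average_spectrum(spectrogram(tone(500))),
    "high": average_spectrum(spectrogram(tone(2000))),
}

# Slightly off-pitch inputs play the role of a speaker with an "accent".
print(classify(tone(520), templates))
print(classify(tone(1900), templates))
```

Even in this tiny example you can see where the trouble starts: the further the input drifts from the stored pattern, the lower the similarity gets, which is roughly what happens with a heavy accent.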

If computer voice recognition became good enough (meaning even better than a keyboard), we would surely see a huge change in how different devices look. Can you even imagine removing the keyboard and computer mouse? You probably can, since Apple already took a step in this direction with the iPad, thanks to its multi-touch screen. Perfect voice recognition would take this process even further.

One of the latest advances in this field also comes from Apple: Siri, the new voice assistant in the iPhone 4S. It is far from perfect, but still very useful in everyday life. Check out this video to see how it (she) works:



Even if computers and electronic devices eventually understand perfectly what we say, an often-quoted figure holds that only 7% of our communication comes from the words we actually say; the rest is body language and tone of voice. But wouldn't it be a bit sad if our computers understood our ironic shouts and angry gestures towards them?

References:

http://www.voiceworks.co.za/
http://en.wikipedia.org/wiki/Voice_user_interface
