RealTime IT News

Berners-Lee Calls For More Voice Apps

NEW YORK -- World Wide Web creator Sir Tim Berners-Lee is challenging developers to do more with voice recognition systems and spur a sector that is ripe for innovation.

In town to deliver a keynote address at the SpeechTek conference, Berners-Lee said end users demand a seamless experience when dealing with voice-activated telephone systems. He also warned that user frustration with the limits of voice recognition could hurt the industry at a time when existing standards can help developers get past those limits.

Berners-Lee, who serves as director of the World Wide Web Consortium (W3C), said voice technology firms must find ways to interpret mumbles and mangled phrases reliably, and even to supply context for voice transactions.

"I'm a user. I call 1-800 numbers to get my washing machine fixed. It's important to me that it works properly," Berners-Lee said, recounting his own frustrations with a voice-activated system that did not recognize the word "yes."

"Generally, I'm impressed with what voice technology could do but when it can't understand that I'm shouting 'yes!' into the telephone, there are limitations. I eventually learned to say 'yup' and got my appointment."

He also suggested that development work center on understanding the context of voice commands, especially when voice technology is used to handle customer service queries.

Berners-Lee described voice recognition technology as a tough sector because of the inherent differences between natural languages and computer languages. "The natural language is soft, fuzzy and evolving but computer languages are hard and clearly defined. Speech technologies are trying to bridge the gap to help computers to figure out what people are saying and that's not an easy thing," he said.

"Computer recognition has to be just as good as a human brain," he said, arguing that the sophisticated use of voice technology will be driven by standards coming from the W3C.

He called on developers in the audience to get involved in the W3C's Voice Browser and Multimodal Interaction activities, which are creating specifications in those areas.

Berners-Lee also highlighted the work in the W3C Speech Interface Framework that recently published the Speech Synthesis Markup Language (SSML) 1.0 as a W3C Recommendation.

Practical use of SSML allows VoiceXML-based services to be accessed via text phones for people with speaking or hearing impairments. It is also aimed at helping software developers build applications for such gadgets as mobile phones and personal digital assistants (PDAs). It joins existing standards such as the W3C Recommendations VoiceXML 2.0 and Speech Recognition Grammar Specification (SRGS).
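For readers unfamiliar with SSML, the following is a minimal sketch of what a synthesized prompt might look like when generated from an application. It uses Python's standard XML library and the SSML 1.0 namespace; the function name, the prompt wording and the 300ms pause are illustrative assumptions and do not come from the keynote or from any particular vendor's system.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"   # SSML 1.0 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"   # namespace for xml:lang


def build_confirmation_prompt(question: str, options: list[str]) -> str:
    """Return an SSML document that speaks a question, then each option.

    Hypothetical helper for illustration only.
    """
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)

    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
    )

    # The question is spoken as one sentence.
    sentence = ET.SubElement(speak, f"{{{SSML_NS}}}s")
    sentence.text = question

    # Each option is preceded by a short pause and spoken with emphasis.
    for option in options:
        ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": "300ms"})
        emphasis = ET.SubElement(
            speak, f"{{{SSML_NS}}}emphasis", {"level": "strong"}
        )
        emphasis.text = option

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_confirmation_prompt(
        "Would you like to book a repair appointment?", ["yes", "no"]
    ))
```

A speech synthesizer that accepts SSML would read the question, pause, and then stress each option. Recognizing the caller's reply is a separate task, covered by grammars such as SRGS.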

Berners-Lee said developers could expect W3C Recommendations for InkML and EMMA (Extensible MultiModal Annotation), both of which deal with speech and ink recognition technologies.

He also said voice technologies could be used to drive enterprise adoption of the Semantic Web, which treats the World Wide Web as one giant database that links human-readable documents and machine-readable data in a way useful to both mankind and machine.

Berners-Lee, one of the driving forces behind the idea of giving data more meaning through the use of metadata, said problems with integrating voice commands with existing back-end databases could stunt growth in the voice technology space.

This is where the Semantic Web comes in, he argued, pointing out that voice recognition technology will benefit when applications start communicating with each other in a straightforward way.

"Talk to any CIO and they'll tell you what the problem is. It's the stovepipe where one application handles one area of business and another application does something else. And these applications aren't talking to each other. The problem of getting through the stovepipe is huge."