RealTime IT News

W3C OKs Speech Standard For Mobile Devices

Hoping to improve the way mobile devices handle the nuances of human communications, a leading standards body has approved a language for synthesized speech in Web interactions as a standard.

The World Wide Web Consortium (W3C) has published the Speech Synthesis Markup Language (SSML) 1.0, an important stepping stone along the path to improving communications over handheld devices.

In one practical use, SSML allows VoiceXML-based services to be accessed via textphones for people with speaking or hearing impairments. The document will also help software developers build applications for such gadgets as mobile phones and personal digital assistants (PDAs).

SSML 1.0 is a key component of the W3C Speech Interface Framework, a collection of rules for building voice applications for the Web. It joins existing standards such as the W3C Recommendations VoiceXML 2.0 and Speech Recognition Grammar Specification (SRGS), created by the W3C's Voice Browser Working Group.

The group, which includes Sun Microsystems, IBM, Intel and Microsoft, is working on two other specifications, Semantic Interpretation and Call Control XML (CCXML), to round out the Speech Interface Framework.

The specs will be used in concert to improve the development of applications based on languages such as Java, the leading platform for vendors who make Web-enabled handheld computers and mobile phones.

Such devices have several limitations, from the difficulty of operating them manually because of their small form factors to the lack of quality rich media applications. Members of the W3C, many of whom work for companies building the software or hardware products they aim to improve, announced the news ahead of next week's SpeechTEK Conference in New York.

"I am excited about the progress the Voice Browser Working Group has made in providing improved access to services over the telephone through the use of Web technologies," said W3C Director Tim Berners-Lee, who will be delivering a keynote address at the SpeechTEK Conference next week. He added, "Companies can now offer Web access to their customers via the telephone as well as from a personal computer."

Max Froumentin, spokesperson for the Voice Browser Working Group, said SSML describes how a speech synthesizer should pronounce a given piece of text.
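
As a rough illustration of what such markup can look like, the Python sketch below builds a small SSML 1.0 document. The element names (speak, voice, phoneme, sub) and the namespace come from the SSML specification; the sentences, the function name build_ssml_example and the IPA string are assumptions made up for this example, and a given synthesizer may honor the hints differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the SSML 1.0 Recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml_example() -> str:
    """Return a small SSML document with two voices and pronunciation hints.

    The sentences are illustrative only; they do not come from the W3C.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )

    # First sentence, requested in a female voice.
    female = ET.SubElement(speak, "voice", {"gender": "female"})
    female.text = "I have "
    # "read" is ambiguous in print; the IPA string asks for the past-tense form.
    read = ET.SubElement(female, "phoneme", {"alphabet": "ipa", "ph": "rɛd"})
    read.text = "read"
    read.tail = " the report."

    # Second sentence, requested in a male voice.
    male = ET.SubElement(speak, "voice", {"gender": "male"})
    male.text = "The "
    # <sub> supplies the text to speak in place of an abbreviation.
    w3c = ET.SubElement(male, "sub", {"alias": "World Wide Web Consortium"})
    w3c.text = "W3C"
    w3c.tail = " published it."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml_example())
```

Building the document with an XML library rather than string concatenation keeps the output well-formed even if the text contains characters such as & or <.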

"Most speech synthesizers just take text and send them and pronounce those texts," Froumentin told internetnews.com. "But with languages, and English in particular, you can't really know the pronunciation from the text. What SSML does is add a marker for the text that says 'this is pronounced this way, this is pronounced that way, this is a male voice, this is a female voice' and all sorts of controls."

SSML will have broad applicability in the real world; Froumentin said speech synthesis is currently employed in call centers of banks.
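
A bank call center of that kind has to speak values that differ for every caller, such as an account balance, which is where generated SSML could fit. The Python sketch below shows one way such a prompt might be assembled. The function name balance_prompt, the wording of the prompt and the customer name are hypothetical, and the "currency" value for interpret-as is an assumption: SSML 1.0 defines the say-as element but leaves the accepted values to the synthesizer.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Namespace defined by the SSML 1.0 Recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def balance_prompt(customer: str, balance: Decimal) -> str:
    """Build an SSML prompt that reads out a per-caller account balance.

    Hypothetical helper for illustration; not part of any W3C or vendor API.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = f"Hello {customer}. "

    # Short pause between the greeting and the balance.
    pause = ET.SubElement(speak, "break", {"time": "300ms"})
    pause.tail = "Your current balance is "

    # interpret-as="currency" is an assumed, engine-dependent value.
    amount = ET.SubElement(speak, "say-as", {"interpret-as": "currency"})
    amount.text = f"${balance:,.2f}"
    amount.tail = "."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(balance_prompt("Ms. Lee", Decimal("1234.5")))
```

Only the amount changes from call to call, so the same template can serve every customer without a pre-recorded prompt for each possible value.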

"If you call and ask for your credit card statement they can use speech synthesis to tell you how much money there is," Froumentin said. "At the moment, most voice services use pre-recorded prompts, but that wouldn't work if it's trying to tell you how much money is in your account. That's where you can use SSML for speech synthesis."

Vendors and working group members that have already implemented SSML include HP, IBM, Microsoft, SAP and Sun.