RealTime IT News

Group Shakes SALT on Scalable Vector Graphics

A group looking to bridge the chasm between speech technologies and computing devices said Tuesday it has combined its own speech markup specification with the Scalable Vector Graphics format to add spoken access to applications on gadgets such as personal digital assistants (PDAs).

The SALT Forum presides over the Speech Application Language Tags (SALT) standard, a markup language that can be used to add speech recognition, speech synthesis and telephony capabilities to HTML- or XHTML-based applications, making them accessible from devices such as PCs, telephones, tablet PCs and PDAs.

To enhance the capability of SALT, the Boston-based group said it has published a SALT profile for the World Wide Web Consortium's (W3C) Scalable Vector Graphics (SVG) markup language.

A standard gaining in popularity as vendors look to bring more content and ways of accessing it to gadgets, SVG renders graphics on displays of varying size and resolution. Because its lightweight design reduces computational requirements, it has been especially attractive to mobile device manufacturers since the W3C recommended it last January.

By embedding SALT in SVG, developers can tie interactive spoken interfaces directly to the visual interface, improving the user experience. For example, SALT characteristics would be embedded along with SVG in a voice-activated Web browser. The new SVG profile supplements the SALT 1.0 specification, which was contributed to the W3C by the SALT Forum and already included profiles for use with the XHTML and SMIL specifications.

SVG with SALT, then, makes it possible for programmers to build sophisticated applications for mobile devices with easy-to-use speech interfaces that are accessible without looking at or touching the equipment. SVG with SALT can be used to provide speech "hot spots" within a graphic, or to provide spoken commands for scrolling and zooming the display.

Said RedMonk Senior Analyst Stephen O'Grady: "While the announcement today is not likely to mean voice everywhere quite yet - as there's quite a distance to go in making voice services more pervasive - it's another step forward. By allowing developers to more easily extend voice out towards the user interface, the new profile opens up the possibility of more reactive and responsive applications which can interact just on the basis of voice."

The SALT Forum, which consists of such vendors as Microsoft, Intel and SpeechWorks, has published its profile of the SALT-SVG marriage.

Back story

Ideologically, SALT actually has its roots in another W3C specification, VoiceXML 2.0. That specification deals with building applications that bridge telephony and devices, but it doesn't address the "multimodal" aspects that the SALT Forum would later come to address when it formed in 2001.

Rob Kassel, a representative of the SALT Forum and SpeechWorks, said SALT has its foundation in the W3C's attempt to provide voice access to Web services.

"When VoiceXML 2.0 came out, it became a widely adopted voice access specification using Web-based infrastructure," Kassel told internetnews.com. "But that was not the original goal, which was to voice-enable content" such as applications from mobile devices. Instead, without form interpretation algorithms, VoiceXML 2.0 became the standard way of building voice applications.

Kassel said members of the W3C realized VoiceXML 2.0 reused Web architectures, which was fine for scripting applications but wouldn't satisfy the goal of accessing all manner of content from new-fangled devices in a variety of ways. This multimodal access includes input with speech, a keyboard, keypad, mouse and/or stylus; and output as synthesized speech, audio, plain text, motion video and/or graphics. For example, the SALT Forum site explained that a user might click on a flight info icon on a device and say "Show me the flights from San Francisco to Boston after 7 p.m. on Saturday" and have the browser display a Web page with the corresponding flights.

By contrast, VoiceXML was solely a telephony-based technology.

"Folks started saying how we can do a job to get closer to Web content to accommodate new devices, such as wireless PDAs," Kassel said. "VoiceXML was not practical for this. The problem with small devices is that when you build applications for them, they're not the same size as they are for full-size computers. So, if you have this small device and you want to access content using Web infrastructure, you have these small buttons. So you want a voice interface so you don't have to type. Ideally, you'd speak into it, but not even into it, you just have to have your device in a briefcase, and speak into it using Bluetooth."

Under those principles, Kassel said the SALT Forum was born. "The idea was to get companies really interested, hash out a standard over weeks or months and let W3C take it," he said.

The SALT Forum is waiting for word from the W3C on SALT 1.0 and its new SVG profile proposal.