A group looking to bridge the chasm between speech technologies and
computing devices Tuesday said it has fused characteristics from the
Scalable Vector Graphics (SVG) specification with its own specification,
adding spoken access to applications from gadgets such as personal digital
assistants (PDAs).
The SALT Forum presides over the Speech Application Language Tags (SALT)
specification, which adds speech recognition and synthesis and telephony
capabilities to HTML- or XHTML-based applications, making them accessible
from devices such as PCs, telephones, tablet PCs and PDAs.
A standard gaining in popularity as vendors look to bring more content and
ways of accessing it to gadgets, SVG renders graphics on displays of varying
size and resolution. Because it features a lightweight design that reduces
computational requirements, it has been especially attractive to mobile
device manufacturers since it was recommended last January.
By embedding SALT in SVG, developers can bundle interactive spoken
interfaces directly with the visual interface, improving the user experience.
For example, SALT characteristics would be embedded along with SVG in a
voice-activated Web browser. The new SVG profile supplements the SALT 1.0
specification, which was contributed to the W3C by the SALT Forum and
already included profiles for use with the XHTML and SMIL specifications.
SVG with SALT, then, makes it possible for programmers to build
sophisticated mobile applications for these devices with easy-to-use speech
interfaces that are accessible without looking at or touching the equipment.
SVG with SALT can be used to provide speech “hot spots” within a graphic, or
provide spoken commands for scrolling and zooming the display.
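As a rough illustration of the scrolling and zooming case, the sketch below maps recognized phrases onto changes to a visible region expressed like an SVG viewBox. It is written in Python for brevity and is not SALT or SVG markup: the ViewBox class, the apply_command function, the command vocabulary and the step sizes are all hypothetical, and the speech recognizer is assumed to have already turned audio into text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewBox:
    """Visible region of a graphic, in the same terms as an SVG viewBox."""
    x: float
    y: float
    width: float
    height: float


# Hypothetical vocabulary: fraction of the current view to move per command.
SCROLL_STEPS = {
    "scroll left": (-0.25, 0.0),
    "scroll right": (0.25, 0.0),
    "scroll up": (0.0, -0.25),
    "scroll down": (0.0, 0.25),
}

# Hypothetical vocabulary: factor applied to the view size per command.
ZOOM_FACTORS = {"zoom in": 0.5, "zoom out": 2.0}


def apply_command(view: ViewBox, phrase: str) -> ViewBox:
    """Return the view that results from one recognized spoken phrase."""
    # Normalize case and whitespace so "Zoom  In" matches "zoom in".
    command = " ".join(phrase.lower().split())

    if command in SCROLL_STEPS:
        dx, dy = SCROLL_STEPS[command]
        return ViewBox(view.x + dx * view.width,
                       view.y + dy * view.height,
                       view.width, view.height)

    if command in ZOOM_FACTORS:
        factor = ZOOM_FACTORS[command]
        new_width = view.width * factor
        new_height = view.height * factor
        # Keep the center of the view fixed while its size changes.
        center_x = view.x + view.width / 2
        center_y = view.y + view.height / 2
        return ViewBox(center_x - new_width / 2,
                       center_y - new_height / 2,
                       new_width, new_height)

    # Unrecognized phrases leave the display unchanged.
    return view
```

In this sketch, an application would call apply_command each time the recognizer reports a phrase and then redraw the graphic with the returned region.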
Said RedMonk Senior Analyst Stephen O’Grady: “While the announcement today is not likely to mean voice everywhere quite yet – as there’s quite a distance to go in making voice services more pervasive – it’s another step forward. By allowing developers to
more easily extend voice out towards the user interface, the new profile
opens up the possibility of more reactive and responsive applications
which can interact just on the basis of voice.”
The SALT Forum, which consists of such vendors as Microsoft, Intel and SpeechWorks, has published its profile
of the SALT-SVG marriage.
Ideologically, SALT actually has its roots in another W3C specification,
VoiceXML 2.0. That specification deals with building applications that
bridge telephony and devices, but it doesn’t address the “multimodal”
aspects that the SALT Forum would later come to address when it formed in
Rob Kassel, representative of the SALT Forum and SpeechWorks, said SALT has
its foundation in the W3C’s attempt to provide voice access to Web services.
“When VoiceXML 2.0 came out, it became a widely adopted voice access
specification using Web-based infrastructure,” Kassel told
internetnews.com. “But that was not the original goal, which was to
voice-enable content” such as applications from mobile devices. Instead,
without form interpretation algorithms, VoiceXML 2.0 became the standard way
of building voice applications.
Kassel said members of the W3C realized VoiceXML 2.0 reused Web
architectures, which was fine for scripting applications, but wouldn’t
satisfy the goals set to access all manner of content from new-fangled
devices in a variety of ways. This multimodal access includes input with
speech, a keyboard, keypad, mouse and/or stylus; and output as synthesized
speech, audio, plain text, motion video and/or graphics. For example, the
SALT Forum site explained a user might click on a flight info icon on a
device and say “Show me the flights from San Francisco to Boston after 7
p.m. on Saturday” and have the browser display a Web page with the
requested flights. Conversely, VoiceXML was solely a telephony-based technology.
“Folks started saying how we can do a job to get closer to Web content to
accommodate new devices, such as wireless PDAs,” Kassel said. “VoiceXML was
not practical for this. The problem with small devices is that when you
build applications for them, they’re not the same size as they are for
full-size computers. So, if you have this small device and you want to
access content using Web infrastructure, you have these small buttons. So
you want a voice interface so you don’t have to type. Ideally, you’d speak
into it, but not even into it, you just have to have your device in a
briefcase, and speak into it using Bluetooth.”
Under those principles, Kassel said the SALT Forum was born. “The idea was
to get companies really interested, hash out a standard over weeks or months
and let W3C take it,” he said.
The SALT Forum is waiting for word from the W3C on SALT 1.0 and its new SVG
profile.