XML Finds Its Voice

The voice browser working group of the World Wide Web Consortium (W3C) gave
its seal of approval Wednesday for further consideration and testing of a
voice XML standard.

While the standard is still considered a “work in progress,” voice XML 2.0
is now deemed stable and has been named a candidate recommendation, a step
toward approval as an accepted standard in the XML community.

Worldwide adoption of the standard would be good news for companies that use
the telephone to conduct surveys, take customer feedback and handle a host
of other telephone operations.

Using voice XML 2.0, a company is able to take the “yes,” “no,” “maybe,”
and even the “yeah” and “fine” comments and put them into a database,
dramatically streamlining the telephone support process. The new standard,
using the speech recognition specification, also handles dates like “June
29” and “yesterday.”
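As a rough illustration of what "putting answers into a database" involves, the sketch below maps loose spoken replies onto canonical values and resolves simple date phrases. It is not part of voice XML or the speech recognition specification; the synonym table (including the choice to treat "fine" as "yes"), the function names and the supported date phrases are all assumptions made for the example.

```python
from datetime import date, timedelta

# Hypothetical synonym table: which spoken replies count as which answer.
# Treating "fine" as "yes" is an assumption made for this example.
ANSWER_SYNONYMS = {
    "yes": "yes",
    "yeah": "yes",
    "fine": "yes",
    "no": "no",
    "maybe": "maybe",
}

MONTHS = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]


def normalize_answer(utterance):
    """Return the canonical answer for a recognized reply, or None."""
    return ANSWER_SYNONYMS.get(utterance.strip().lower())


def resolve_date(utterance, today):
    """Resolve 'yesterday', 'today' or '<month> <day>' relative to `today`."""
    text = utterance.strip().lower()
    if text == "yesterday":
        return today - timedelta(days=1)
    if text == "today":
        return today
    parts = text.split()
    if len(parts) == 2 and parts[0] in MONTHS and parts[1].isdigit():
        try:
            # Assume the caller means the current year.
            return date(today.year, MONTHS.index(parts[0]) + 1, int(parts[1]))
        except ValueError:
            return None  # e.g. "June 31"
    return None
```

A real voice application would get these values from the recognizer's grammar rather than from string matching; the point here is only that many spoken forms collapse to a few stored values.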

While that’s bad news for the humans who are currently employed in those
jobs, it’s great news for the businesses looking to further streamline
their data flow; it’s the first step in the complete automation of the
telephone process. Officials and experts expect the standard to improve
with age as more and more developers make use of the standard for their own

XML, like the HTML standard that goes into the design of Web pages on the
hyper-text transfer protocol (HTTP), is only as effective as the support it
gets from the development community.

Like any other computer language, voice XML breaks down what the human
wants into something the computer can use.

The voice browser working group gives the following example. Say someone
calls the local pizza parlor and makes a delivery order: “I would like a
medium coca cola and a large pizza with pepperoni and mushrooms.”

A possible semantic result produced under the voice XML standard would look
something like this:

    {
      drink: {
        beverage: "coke"
        drinksize: "medium"
      }
      pizza: {
        number: "3"
        pizzasize: "large"
        topping: [ "pepperoni", "mushrooms" ]
      }
    }
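To make the link between a semantic result and a database row concrete, here is a minimal sketch that holds the result above as a nested Python dictionary and writes it to an in-memory SQLite table. The table layout, column names and the `store_order` helper are hypothetical and chosen only for illustration; the values are copied from the article's example as printed.

```python
import json
import sqlite3

# The semantic result from the example, held as a nested dictionary.
semantic_result = {
    "drink": {"beverage": "coke", "drinksize": "medium"},
    "pizza": {
        "number": "3",
        "pizzasize": "large",
        "topping": ["pepperoni", "mushrooms"],
    },
}


def store_order(conn, result):
    """Flatten one semantic result into a row of a hypothetical orders table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "beverage TEXT, drinksize TEXT, pizza_count INTEGER, "
        "pizzasize TEXT, toppings TEXT)"
    )
    drink, pizza = result["drink"], result["pizza"]
    cursor = conn.execute(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        (
            drink["beverage"],
            drink["drinksize"],
            int(pizza["number"]),          # stored as an integer
            pizza["pizzasize"],
            json.dumps(pizza["topping"]),  # list stored as JSON text
        ),
    )
    conn.commit()
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
row_id = store_order(conn, semantic_result)
```

Storing the toppings as JSON text keeps the sketch short; a production schema would more likely use a separate toppings table.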

In order for the standard to pass muster with the W3C, voice XML 2.0 must
meet a series of conditions:

  • Show that it works in at least two independent implementations of a
    required feature (six grammar processors and text parsers) and one
    implementation of an optional feature (three grammar processors).
  • Information must convert between XML and the augmented Backus-Naur
    form (ABNF).
  • Support the English language, as well as one European and one Asian
    language.
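The XML and ABNF requirement refers to the two equivalent notations for speech grammars. As a simplified sketch, the code below renders one flat list of alternatives in both notations, following the general shape of the W3C speech recognition grammar format. It covers only a single rule of alternatives, does not parse or convert existing grammars, and the function names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

SRGS_NAMESPACE = "http://www.w3.org/2001/06/grammar"


def alternatives_to_abnf(rule_name, words, lang="en-US"):
    """Render a flat list of alternatives as an ABNF-form grammar."""
    header = f"#ABNF 1.0;\nlanguage {lang};\nroot ${rule_name};\n"
    rule = f"${rule_name} = " + " | ".join(words) + ";"
    return header + rule


def alternatives_to_xml(rule_name, words, lang="en-US"):
    """Render the same alternatives as an XML-form grammar."""
    grammar = ET.Element(
        "grammar",
        {
            "xmlns": SRGS_NAMESPACE,
            "version": "1.0",
            "xml:lang": lang,
            "root": rule_name,
        },
    )
    rule = ET.SubElement(grammar, "rule", {"id": rule_name})
    one_of = ET.SubElement(rule, "one-of")
    for word in words:
        ET.SubElement(one_of, "item").text = word
    return ET.tostring(grammar, encoding="unicode")


abnf_text = alternatives_to_abnf("answer", ["yes", "no", "maybe"])
xml_text = alternatives_to_xml("answer", ["yes", "no", "maybe"])
```

Both strings describe the same rule, which is the property the condition is meant to guarantee: a grammar written in one form should be expressible in the other without loss.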
According to Dave Raggett, W3C voice browser activity lead, the standard
has more than just a business aspect to it; voice XML opens up a whole new
world to people who might not otherwise be able to use the Internet.

“People will be able to interact via spoken commands and listening to
recorded speech, synthetic speech and music,” he said. “This will also
benefit people with visual impairments or needing Web access while keeping
their hands and eyes free for other things.”
