
Microsoft Beta to Make ’em Talk

Written By
Thor Olavsrud
Jul 9, 2003

Continuing its push into speech technology, Microsoft
Wednesday unleashed the first public beta of its Microsoft Speech Server,
and refreshed its Speech Application Software Development Kit (SASDK) with
a beta 3 release.

“Speech technology is on the cusp of reaching its full potential, and we
are committed to bringing it to the mainstream,” said Kai-Fu Lee, corporate
vice president of the Speech Technologies Group at Microsoft. “With the
beta release of Microsoft Speech Server and the beta 3 release of the
SASDK, we are making it easier for enterprise companies and their customers
to access information.”


Microsoft believes the value proposition of speech technology is clear: it
stands to reduce costs associated with call center agents. A typical
customer service call costs $5 to $10 to support, while an automated speech
technology system can lower that to 10 cents to 30 cents per call.
Additionally, speech technology can be used to give employees access to
critical information while on the move.
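
As a rough illustration of the per-call figures above, the sketch below works out the range of savings for a given call volume. The cost ranges come from the figures quoted here; the function name and the volume of 100,000 calls are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Per-call cost ranges quoted in the article, in US dollars.
AGENT_COST = (5.00, 10.00)      # live agent: $5 to $10 per call
AUTOMATED_COST = (0.10, 0.30)   # automated speech system: 10 to 30 cents


def savings_range(calls: int) -> Tuple[float, float]:
    """Return (minimum, maximum) savings for a given number of calls."""
    # Smallest saving: cheapest agent call vs. most expensive automated call.
    low = (AGENT_COST[0] - AUTOMATED_COST[1]) * calls
    # Largest saving: most expensive agent call vs. cheapest automated call.
    high = (AGENT_COST[1] - AUTOMATED_COST[0]) * calls
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # 100,000 calls is a hypothetical volume, not a figure from Microsoft.
    low, high = savings_range(100_000)
    print(f"Estimated savings: ${low:,.2f} to ${high:,.2f}")
```

Even in the least favorable pairing, the automated call costs a small fraction of the agent-handled one, which is the core of the argument.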

The new Speech Server, designed to run on the Windows Server 2003 operating
system, is a platform for speech application deployments. It is built on
the Speech Application Language Tags (SALT) standard, which defines a set
of lightweight tags as extensions to common Web-based programming
languages, allowing developers to add speech functionality to existing Web
applications, as well as to add prompt functionality to telephony and
multimodal applications.

Microsoft has brought partners Intel and Intervoice on board to provide
the server with a Telephony Interface Manager (TIM). The TIM integrates
Speech Server with Intel NetStructure communications boards, the hardware
used to deploy speech processing applications. Microsoft noted that
multimodal applications don’t need TIM.


The key components of the new server are Speech Engine Services (SES) and
Telephony Application Services (TAS).

The SES includes:

  • Speech Recognition Engine, for handling users’ speech inputs
  • Prompt Engine, which takes prerecorded prompts from a database and
    plays them back to allow users to hear a human voice
  • Text-to-Speech Engine, which uses SpeechWorks’ Speechify engine to
    synthesize audio output from a text string when prerecorded prompts are
    unavailable.
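
The division of labor between the Prompt Engine and the Text-to-Speech Engine can be sketched as a simple lookup with a fallback. This is a minimal illustration under stated assumptions, not Speech Server's API: `render_prompt`, `PromptDatabase` and `fake_tts` are hypothetical names, and the dictionary stands in for the prompt database.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the prompt database: text -> recorded audio bytes.
PromptDatabase = Dict[str, bytes]


def render_prompt(
    text: str,
    recorded: PromptDatabase,
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Return recorded audio for `text`, or synthesized audio if none exists."""
    audio = recorded.get(text)
    if audio is not None:
        return audio          # Prompt Engine path: play the human recording
    return synthesize(text)   # Text-to-Speech path: fall back to synthesis


def fake_tts(text: str) -> bytes:
    """Placeholder synthesizer; a real engine would return audio samples."""
    return b"TTS:" + text.encode("utf-8")


if __name__ == "__main__":
    db = {"Welcome.": b"REC:welcome"}
    print(render_prompt("Welcome.", db, fake_tts))
    print(render_prompt("Your balance is $42.", db, fake_tts))
```

Fixed phrases are served from recordings, while dynamic text that nobody could have recorded in advance goes to the synthesizer.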

The TAS includes:

  • SALT Interpreter, which deals with all the speech interface and
    presentation logic, and also handles interactions between the speech
    application and the telephony components of the architecture
  • Media and Speech Manager, which handles requests made by SALT
    Interpreters to SES for speech recognition and prompt playback, and
    manages interfaces with the third-party TIM to deliver audio to and from
    the telephone user
  • SALT Interpreter Controller, which manages creation, deletion and
    resetting of the multiple instances of the SALT Interpreter that are
    managing dialogs with individual callers.
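
The controller's role, one interpreter instance per caller with create, reset and delete operations, can be sketched as follows. This is an assumption-laden illustration of the pattern only: the class names, the `call_id` key and the `history` field are hypothetical and are not taken from Speech Server.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Interpreter:
    """Hypothetical per-caller dialog state."""
    call_id: str
    history: List[str] = field(default_factory=list)


class InterpreterController:
    """Creates, resets and deletes one interpreter per active call."""

    def __init__(self) -> None:
        self._active: Dict[str, Interpreter] = {}

    def create(self, call_id: str) -> Interpreter:
        # Refuse to create a second interpreter for the same call.
        if call_id in self._active:
            raise ValueError(f"call {call_id} already has an interpreter")
        interp = Interpreter(call_id)
        self._active[call_id] = interp
        return interp

    def reset(self, call_id: str) -> None:
        # Keep the instance but discard its dialog state.
        self._active[call_id].history.clear()

    def delete(self, call_id: str) -> None:
        # Drop the instance once the caller hangs up.
        del self._active[call_id]

    def active_calls(self) -> List[str]:
        return sorted(self._active)
```

Keeping dialog state in separate instances means one caller's conversation cannot leak into another's.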

“Microsoft Speech Server is unique to the marketplace in that it is the
only speech server that supports both unified telephony and multimodal
applications,” said Xuedong Huang, general manager of the Speech
Technologies Group at Microsoft. “By building our speech technology
offerings upon the open, industry-standard SALT specification, customers
can use speech to access information from standard telephones and cell
phones as well as GUI-based devices like PDAs, Tablet PCs and smart
phones.”


SASDK beta 3

The software giant also refreshed its SASDK with a third beta Wednesday,
updating the SASDK beta 2 released in October 2002. The SASDK is a
developer tool based on SALT and designed to integrate with the Visual
Studio .NET 2003 development environment. It allows developers to write
combined speech and visual Web applications in a single code base.

The new beta adds a host of features, including:

  • Pocket Internet Explorer Bits, allowing Pocket PC access to Speech
    Server applications
  • Speech Application Wizard, which allows developers to create a new
    project in Visual Studio .NET 2003 that contains all necessary objects
  • Telephony Application Simulator, which simulates Speech Server to allow
    developers to deploy telephony applications on the desktop and interact
    with the application
  • Enhanced support for dual-tone multifrequency, or DTMF (the type of
    audio signals that are generated when you press the buttons on a
    touch-tone telephone)
  • Speech Application Controls, preset controls which manage responses
    containing digits and letters, like credit card numbers, expiration
    dates, currency amounts, ZIP codes and Social Security numbers
  • Enhancements to Grammar Authoring, providing a flowchart view of
    grammars, the ability to type text for grammar phrases into grammar
    files, a Pronunciation Editor for unusual words, and integration into
    the Visual Studio .NET 2003 environment
  • Speech Controls Outline Panel, which consists of a dockable Visual
    Studio menu that shows users the sequence of controls in the speech
    application.
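
For readers unfamiliar with DTMF, each key on a touch-tone keypad is signaled by two simultaneous tones, one for its row and one for its column. The sketch below uses the standard DTMF frequency grid; the function names, the 0.1-second duration and the 8,000 Hz sample rate are illustrative assumptions and have nothing to do with the SASDK itself.

```python
import math
from typing import List, Tuple

# Standard DTMF keypad frequencies in hertz.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> Tuple[int, int]:
    """Return the (row, column) frequency pair for one keypad key."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(key)
            if col != -1:
                return ROW_FREQS[row], COL_FREQS[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration: float = 0.1, rate: int = 8000) -> List[float]:
    """Return audio samples in [-1.0, 1.0] for the two summed tones."""
    low, high = dtmf_frequencies(key)
    count = round(duration * rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / rate)
               + math.sin(2 * math.pi * high * n / rate))
        for n in range(count)
    ]


if __name__ == "__main__":
    print(dtmf_frequencies("5"))
    print(len(dtmf_samples("5")))
```

A speech application that accepts keypad input has to detect these tone pairs in the incoming audio alongside spoken responses.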

Speech Partner Program

Finally, Microsoft also raised the curtain on its Speech Partner Program
(SPP), which is intended to provide additional revenue and profit
opportunities to partners interested in developing, deploying or reselling
enterprise-grade speech technology solutions based on Microsoft’s
technologies.


The software giant is targeting telephony value-added resellers and
distributors, systems integrators, Web developers, independent software
vendors, and Microsoft-certified partners with the program. Members get
access to industry and Microsoft-specific events, access to special partner
collateral (like advertising templates, sales tools and targeted
demand-generation materials), discounted rates for Microsoft Speech
Technologies training courses, placement in the SPP Resource Directory on
the Microsoft.com Web site, and promotion of their products and services
through Microsoft’s marketing efforts.

To qualify, Microsoft said, partners need to complete three training
courses: Speech Applications: Planning, VUI Design and Maintenance;
Developing Speech-Enabled Web Applications Using the Microsoft Speech
Application Software Development Kit; and Deploying and Administering
Microsoft Speech Server.
