As part of a continuing effort to break down language barriers on the
Internet, the World Wide Web Consortium (W3C) has published a standard for
how applications can exchange and process characters from languages all over
the world.
The standards body wrote “Character Model for the World Wide Web 1.0:
Fundamentals” to provide specification authors,
software programmers and content developers with a common guide for publishing
interoperable text on the Web. IBM, Microsoft, Sun Microsystems, webMethods,
Siemens, BBC and Boeing are working to develop the character model.
The W3C said the goal of the character model
is to “facilitate use of the Web by all people, regardless of their
language, script, writing system, and cultural conventions, in accordance
with the W3C goal of universal access.”
Such a model is important because the number of applications and users that
leverage the Web to conduct transactions or perform tasks is increasing.
The Computer Industry Almanac estimates there are roughly 934 million users online globally. The group
expects the number to top 1 billion people by the end of 2005.
The Universal Character Set (UCS) forms the core of the model, allowing Web
technologies to support text in the world’s scripts so that it can be exchanged
and searched by Web users around the world and on disparate platforms.
The character model builds on the UCS furnished by the
Unicode Standard, which provides a widely used way of referencing characters
independent of how the text is encoded. W3C adopted Unicode as the document
character set for HTML.
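To illustrate what referencing a character independent of its encoding means, here is a minimal Python sketch. It is not taken from the W3C document; the sample character and the three encodings are assumptions chosen only for illustration. The character keeps one code point in the UCS while its byte representation changes with the encoding, and an HTML numeric character reference names the code point rather than the bytes.

```python
import html

# Illustrative sample character (an assumption, not from the W3C text):
# LATIN SMALL LETTER E WITH ACUTE.
ch = "\u00e9"

# The code point is the character's position in the Universal Character Set.
code_point = ord(ch)

# The same character serialized under three different encodings.
encodings = {
    name: ch.encode(name)
    for name in ("utf-8", "utf-16-be", "latin-1")
}

print(f"code point: U+{code_point:04X}")
for name, raw in encodings.items():
    # Each byte sequence differs, but decodes back to the same character.
    print(name, raw.hex(), raw.decode(name) == ch)

# An HTML numeric character reference refers to the code point,
# whatever encoding the page itself is served in.
ncr = f"&#x{code_point:X};"
print(ncr, html.unescape(ncr) == ch)
```

The byte sequences differ in length and value, yet each one round-trips to the same code point, which is the property that lets applications on different platforms exchange the same text.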
The Fundamentals release is the first of three documents W3C is planning
for the character model. “Character Model for the World Wide
Web 1.0: Normalization” and “Character Model for the World Wide Web 1.0:
Resource Identifiers” are in the works.