RealTime IT News

W3C to Break Web Language Barriers?

As part of a continuing effort to break down language barriers on the Internet, the World Wide Web Consortium (W3C) has published a standard for how applications can exchange and process characters from languages all over the world.

The standards body wrote "Character Model for the World Wide Web 1.0: Fundamentals" to give specification authors, software programmers and content developers a common guide for publishing interoperable text on the Web. IBM, Microsoft, Sun Microsystems, webMethods, Siemens, BBC and Boeing are working to develop the character model.

The W3C said the goal of the character model is to "facilitate use of the Web by all people, regardless of their language, script, writing system, and cultural conventions, in accordance with the W3C goal of universal access."

Such a model is important because the number of applications and users that leverage the Web to conduct transactions or perform tasks is increasing.

The Computer Industry Almanac estimates there are roughly 934 million users online globally. The group expects the number to top 1 billion people by the end of 2005.

The Universal Character Set (UCS) forms the core of the model. It lets Web technologies support text in the world's scripts, and lets that text be exchanged and searched by Web users around the world, even on disparate platforms.

The character model builds on the UCS furnished by the Unicode Standard, which offers a widely adopted way of referencing characters independently of how the text is encoded. W3C adopted Unicode as the document character set for HTML in HTML 4.0.
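
To illustrate what referencing a character independently of its encoding means, here is a minimal Python sketch. It is not taken from the W3C document; the helper name describe and the sample characters are assumptions chosen for illustration. It shows that a character keeps a single code point and name while its byte form differs between encodings.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Return the code point, Unicode name and two byte encodings of one character.

    Hypothetical helper for illustration only.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",           # identity of the character
        "name": unicodedata.name(ch),               # name from the Unicode database
        "utf8": ch.encode("utf-8").hex(),           # bytes when stored as UTF-8
        "utf16be": ch.encode("utf-16-be").hex(),    # bytes when stored as UTF-16 (big-endian)
    }


for sample in ("é", "€"):
    print(describe(sample))

# Different byte sequences decode back to the same character.
same = bytes.fromhex("c3a9").decode("utf-8") == bytes.fromhex("00e9").decode("utf-16-be")
print(same)
```

The two encodings produce different bytes for each sample, yet both decode to the same code point, which is the property that lets applications on disparate platforms agree on what text they are exchanging.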

The Fundamentals release is the first of three character model documents W3C is planning. "Character Model for the World Wide Web 1.0: Normalization" and "Character Model for the World Wide Web 1.0: Resource Identifiers" are in the works.