RealTime IT News
Semantic Web: Out of the Theory Realm
By Michael Singer
September 12, 2003

Beyond the great wall of data on the Internet lies a goldmine for enterprises called the Semantic Web.

Based on standards pioneered by the W3C, the Massachusetts Institute of Technology, Hewlett-Packard and a network of grassroots communities, the Semantic Web uses the Resource Description Framework (RDF) to piece together a variety of applications using XML for syntax and URIs for naming.

Simply put, the Semantic Web is changing how you interact with data and how you do business on the Web -- by giving data more definition, or "meaning."

The idea of the Semantic Web is to work with the Web as we now know it in order to make connections between metadata -- the data about the data -- in documents that can be related to other documents, which may be in different formats such as a PDF file and an XML spreadsheet.

Beyond basic point-to-point object linking, the Semantic Web also allows for recommendations based on people, places and concepts and lets applications automate, integrate and reuse the data. Some call the movement a vision, while others note that the pieces are all there, but someone has to assemble the puzzle and agree on standards to make sure all the pieces fit together.

The W3C's recent designation of the Web Ontology Language (OWL) as a candidate recommendation provides a major stepping stone toward that goal, because it lets developers begin designing the Semantic Web's future.

Even though there are about 20 more potential standards left to sift through, the minds behind the movement say the Semantic Web is here now. Some predict it's on the brink of mainstream adoption, which could happen as early as the beginning of next year. Not convinced? Supporters say look no further than the blogging community.

W3C Semantic Web Activity Lead Eric Miller, who is spearheading the project, says bloggers are some of the first end users immersed in the social network of the Semantic Web.

"Some of the tools here are things like TrackBack and syndication," Miller told internetnews.com. "If you use any consistent blogging system, that system is available to RDF, you can leverage RSS tools and ask questions like 'show me all the people who are talking about grid technology.' What you get back is a more relevant response regardless of the data set. It's an interesting effect because it's tying those rants together in a cohesive way."
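The kind of question Miller describes can be sketched as a pattern match over RDF-style statements, each a (subject, predicate, object) triple named with URIs. The sketch below is a minimal illustration in Python, not how any particular blogging tool or RDF toolkit works: the example.org URIs, the sample posts and the `match` and `authors_writing_about` helpers are all assumptions made for this example. Only the Dublin Core `creator` and `subject` terms are real vocabulary.

```python
# Illustrative only: RDF statements modeled as plain (subject, predicate, object) tuples.
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
EX = "http://example.org/"                # hypothetical namespace for this sketch

# Hypothetical metadata describing three blog posts.
triples = [
    (EX + "posts/1", DC + "creator", EX + "people/alice"),
    (EX + "posts/1", DC + "subject", "grid technology"),
    (EX + "posts/2", DC + "creator", EX + "people/bob"),
    (EX + "posts/2", DC + "subject", "weblogs"),
    (EX + "posts/3", DC + "creator", EX + "people/carol"),
    (EX + "posts/3", DC + "subject", "grid technology"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def authors_writing_about(store, topic):
    """Find posts tagged with the topic, then look up who created them."""
    posts = [s for s, _, _ in match(store, p=DC + "subject", o=topic)]
    return sorted(
        {o for post in posts for _, _, o in match(store, s=post, p=DC + "creator")}
    )

print(authors_writing_about(triples, "grid technology"))
```

Because every post is described with the same predicates, the query does not need to know which blogging system produced the data, which is the point Miller makes about consistent metadata.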

Miller says there are also P2P JXTA framework systems such as Edgetella that take advantage of Semantic Web properties.

Automated phone systems are another area that is quickly adopting the Semantic Web, according to Miller. Companies are designing networked systems that are trying to eliminate the current maze of options.

"They are moving from a series of prompt requests -- 'If you would like to do this press 1 or if you would like to do this press 2' -- to making them more effective by setting up preferences and profiles to get more streamlined."

Miller says the bioinformatics and life sciences communities are also among the early adopters of the linking technology because it allows researchers to reference documents more quickly.

"People are just now beginning to understand it," said Web Ontology WG co-chair Jim Hendler. "Companies that are just now getting their first taste will really get excited once they find out one Semantic Web Internet can talk to another Semantic Web Internet. My personal belief is my grandmother will be able to take advantage of the Semantic Web but like HTTP, it will run in the background so she will hardly notice it."

Tim Berners-Lee, who along with Hendler and Ora Lassila first mentioned the Semantic Web back in May 2001, told W3C members earlier this year that the Semantic Web is going to be very powerful, and fun. There is going to be constant tension between fragmentation and integration but ultimately, it works over the entire scale.

"Once these services begin to participate in the Semantic Web, the network effect is really exciting," Miller said. "When it really hooks on, there'll be no way of stopping it."

Major Players Line Up
As the Semantic Web begins to form, top-tier players are already plotting their strategies for taking part.

Hewlett-Packard seems to have the best jump on the pack with its Jena project. The company released the second version of its open source, Java-based framework in late August. The company is also working on HP Haystack, a flexible universal information client that uses a combination of individual preferences to interpret metadata.

Sun Microsystems has submitted its Global Knowledge Engineering (GKE) framework (code-named SwoRDFish).

IBM recently released a best practices paper and a demo for its own metadata interpreter, Semio Tagger. The addition to the Semantic Web linked a billion documents and an annotation server; clients can then subscribe to the server, get its metadata and flip it around to show related documents.

"Right now it's a pretty amicable environment. That will change," Miller said.

Other major players in the Semantic Web space include Mitch Kapor of Lotus 1-2-3 fame and his Open Source Applications Foundation (OSAF), MovableType.org, Teknowledge and CreativeCommons.org, which is focused on protecting copyrights and privacy.

"That is an interesting and developing issue," said Miller. "There are potentials of misuse. We are helping people articulate this and their use in policies in addition to the data that we're sharing."

With blogs, for example, Miller says searching for related topics also reveals metadata about the authors, along with the links between "friend of a friend" recommendations.

"It works very much like six degrees of separation," Miller said.
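The "six degrees" idea can be sketched as a shortest-path search over "knows" links of the kind that friend-of-a-friend metadata records. This is a minimal sketch in Python under stated assumptions: the people, the links and the `degrees_of_separation` helper are invented for illustration and are not taken from any system mentioned in the article.

```python
from collections import deque

# Hypothetical "knows" links, in the spirit of friend-of-a-friend metadata.
knows = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
    "erin": [],
}

def degrees_of_separation(graph, start, goal):
    """Breadth-first search; returns the number of hops, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in graph.get(person, []):
            if friend == goal:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None

print(degrees_of_separation(knows, "alice", "dave"))
```

Breadth-first search visits people in order of distance, so the first time it reaches the goal it has found the shortest chain of acquaintances.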

The copyright issue is being addressed by setting up rules.

"You as a writer or me as an artist is copyrighted except if we allow that data to be used elsewhere," Hendler said. "In some instances you could reuse it but it requires attribution. When you type in a search in Google, for example, you could in the future ask to show all the images that can be reused for free."
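The search Hendler imagines amounts to filtering results on machine-readable licensing metadata. The following is a minimal sketch in Python, assuming each image carries a license label. The field names, the label values and the `reusable_for_free` helper are hypothetical and do not reflect any real search engine or rights vocabulary.

```python
# Hypothetical catalog: field names and license labels are assumptions for this sketch.
images = [
    {"url": "http://example.org/img/sunset.jpg", "license": "all-rights-reserved"},
    {"url": "http://example.org/img/bridge.jpg", "license": "free-with-attribution"},
    {"url": "http://example.org/img/map.png", "license": "public-domain"},
]

# Labels this sketch treats as allowing reuse at no cost.
FREE_LICENSES = {"public-domain", "free-with-attribution"}

def reusable_for_free(items, allow_attribution=True):
    """Return URLs of items whose license metadata permits free reuse."""
    allowed = set(FREE_LICENSES)
    if not allow_attribution:
        # Exclude works that may be reused only when the author is credited.
        allowed.discard("free-with-attribution")
    return [item["url"] for item in items if item.get("license") in allowed]

print(reusable_for_free(images))
```

Items with no license metadata are left out of the results, which mirrors the default Hendler describes: a work is copyrighted unless its owner allows reuse.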

So if it can be used for protecting copyrighted images, could the Semantic Web be a "silver bullet" that could potentially save online digital music? Miller says no.

"Not a silver bullet, but useful gunpowder," he said. "The systems for Disney are going to be different than for libraries. The lowest common denominator will be through digital rights description. Then there are the generalizations of licenses and through rules in languages."

In one field test, Hendler and his team at the University of Maryland prototyped a Semantic Web application using a pager.

"A student wore a pager that would only let certain pages go through," Hendler said. "We designed it so that professors could page her and others couldn't. In a much wider sense, these could be applied to metadata that identifies a query. Not just an IP address but a certain type of criteria to let me know this person is who they say they are."