Beyond the great wall of data on the Internet lies a goldmine for
enterprises called the Semantic Web.
Based on standards pioneered by the W3C, the Massachusetts Institute of Technology, Hewlett-Packard and a network of grassroots communities, the Semantic Web uses the Resource Description Framework (RDF) to piece together a variety of applications using XML for syntax and URIs for naming.
Simply put, the Semantic Web is changing how you interact with data and how you do business on the Web — by giving data more definition, or “meaning.”
The idea of the Semantic Web is to work with the Web as we now know it in order to make connections between metadata
in different formats such as a PDF file and an XML spreadsheet.
Beyond basic point-to-point object linking, the Semantic Web also allows for
recommendations based on people, places and concepts and lets applications
automate, integrate and reuse the data. Some call the movement a vision, while others
note that the pieces are all there, but someone has to assemble the puzzle and agree on standards to make sure all the pieces fit together.
Recently, the W3C’s addition of the Web Ontology Language (OWL) as a candidate recommendation provided a major stepping stone toward that goal, because it lets developers begin designing the Semantic Web’s future.
Even though there are about 20 more potential standards left to sift
through, the minds behind the movement say the Semantic Web is here now.
Some predict it’s on the brink of mainstream adoption, which could happen as
early as the beginning of next year. Not convinced? Supporters say to look no
further than blogging.
W3C Semantic Web Activity Lead Eric Miller, who is spearheading the
project, says bloggers are some of the first end users immersed in the
social network of the Semantic Web.
“Some of the tools here are things like TrackBack and syndication,”
Miller told internetnews.com. “If you use any consistent blogging
system, that system is available to RDF, you can leverage RSS tools and ask
questions like ‘show me all the people who are talking about grid
technology.’ What you get back is a more relevant response regardless of the
data set. It’s an interesting effect because it’s tying those rants together
in a cohesive way.”
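Miller's example query can be sketched in a few lines of Python. This is only an illustration of the idea, assuming blog posts are described as RDF-style (subject, predicate, object) statements; the URIs, property names and sample data are hypothetical, and plain tuples stand in for a real RDF store.

```python
# Hypothetical blog metadata as (subject, predicate, object) triples.
# All URIs, property names and authors below are invented for illustration.
TRIPLES = [
    ("http://example.org/blog/a/post1", "creator", "Alice"),
    ("http://example.org/blog/a/post1", "subject", "grid technology"),
    ("http://example.org/blog/b/post7", "creator", "Bob"),
    ("http://example.org/blog/b/post7", "subject", "web services"),
    ("http://example.org/blog/c/post3", "creator", "Carol"),
    ("http://example.org/blog/c/post3", "subject", "grid technology"),
]


def people_talking_about(triples, topic):
    """Return the sorted names of creators whose posts carry the given subject."""
    # Step 1: find every post whose subject matches the topic.
    posts = {s for (s, p, o) in triples if p == "subject" and o == topic}
    # Step 2: look up who created each of those posts.
    return sorted({o for (s, p, o) in triples if p == "creator" and s in posts})


print(people_talking_about(TRIPLES, "grid technology"))  # ['Alice', 'Carol']
```

Because every blog describes its posts with the same kind of statement, the same two-step lookup works no matter which blogging system produced the data.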
Miller says there are also peer-to-peer (P2P) JXTA applications
that take advantage of Semantic Web properties.
Automated phone systems are another area that is quickly adopting the
Semantic Web, according to Miller. Companies are designing networked systems
that are trying to eliminate the current maze of options.
“They are moving from a series of prompt requests — ‘If you would like
to do this press 1 or if you would like to do this press 2’ — to making
them more effective by setting up preferences and profiles to get more
streamlined.”
Miller says the bioinformatics and life sciences communities are
also among the early adopters of the linking technology because it allows
researchers to reference documents more quickly.
“People are just now beginning to understand it,” said Web Ontology WG
co-chair Jim Hendler. “Companies that are just now getting their first taste
will really get excited once they find out one Semantic Web Internet can
talk to another Semantic Web Internet. My personal belief is my grandmother
will be able to take advantage of the Semantic Web but like HTTP, it will
run in the background so she will hardly notice it.”
Tim Berners-Lee, who along with Hendler and Ora Lassila first mentioned
the Semantic Web back in May 2001, told W3C members earlier this year that
the Semantic Web is going to be very powerful, and fun. There is going to be
constant tension between fragmentation and integration, he said, but
ultimately it works over the entire scale.
“Once these services begin to participate in the Semantic Web, the
network effect is really exciting,” Miller said. “When it really hooks on,
there’ll be no way of stopping it.”
Major Players Line Up
As the Semantic Web begins to form, top-tier players are already
plotting their strategies for taking part.
Hewlett-Packard seems to have the best jump on the pack with its Jena
project. The company released the second version of its open source Java-based
framework in late August. The company also is working on HP
Haystack, a flexible universal information client, which is a combination of
individual preferences used to interpret metadata.
Sun Microsystems has submitted its Global Knowledge
Engineering (GKE) framework (code-named SwoRDFish).
IBM recently released a best practices paper and a demo for its
own metadata interpreter, Semio Tagger. The addition to the Semantic Web linked a billion documents and an
annotation server, which can then subscribe and get metadata from the server
to flip it around and show related documents.
“Right now it’s a pretty amicable environment. That will change,” Miller said.
Other major players in the Semantic Web space include Mitch Kapor of
Lotus 1-2-3 fame and his Open Source Applications Foundation (OSAF),
MovableType.org, Teknowledge and CreativeCommons.org,
which is focused on protecting copyrights and privacy.
“That is an interesting and developing issue,” said Miller. “There are
potentials of misuse. We are helping people articulate this and their use in
policies in addition to the data that we’re sharing.”
With blogs, for example, Miller says searching for related topics also
reveals metadata about the authors, as well as the links behind
“friend of a friend” recommendations.
“It works very much like six degrees of separation,” Miller said.
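The six-degrees comparison can be made concrete with a short Python sketch. It assumes each author publishes a list of people he or she knows, in the spirit of “friend of a friend” data; the names and links below are invented, and a breadth-first search is just one simple way to count the hops between two people.

```python
from collections import deque

# Hypothetical "knows" links between blog authors; all names are invented.
KNOWS = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Carol"],
    "Carol": ["Bob", "Dave"],
    "Dave": ["Carol"],
    "Erin": [],
}


def degrees_of_separation(knows, start, target):
    """Breadth-first search: count 'knows' hops from start to target, or None."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in knows.get(person, []):
            if friend == target:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None  # no chain of acquaintances connects the two people


print(degrees_of_separation(KNOWS, "Alice", "Dave"))  # 3
print(degrees_of_separation(KNOWS, "Alice", "Erin"))  # None
```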
The copyright issue is being addressed by setting up rules.
“You as a writer or me as an artist is copyrighted except if we allow
that data to be used elsewhere,” Hendler said. “In some instances you could
reuse it but it requires attribution. When you type in a search in Google,
for example, you could in the future ask to show all the images that can be
reused for free.”
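The kind of search Hendler describes could look something like the following Python sketch. It assumes each image carries machine-readable licensing metadata; the field names, URLs and catalog are hypothetical and do not reflect any actual search engine or licensing vocabulary.

```python
from dataclasses import dataclass


@dataclass
class Image:
    url: str
    # Default mirrors Hendler's point: work is protected unless reuse is granted.
    reuse_allowed: bool = False
    attribution_required: bool = False


# Hypothetical catalog of images with licensing metadata attached.
CATALOG = [
    Image("http://example.org/img/sunset.jpg", reuse_allowed=True),
    Image("http://example.org/img/harbor.jpg", reuse_allowed=True,
          attribution_required=True),
    Image("http://example.org/img/portrait.jpg"),  # no reuse permission granted
]


def reusable_images(images, without_attribution=False):
    """Return URLs of images whose metadata permits reuse.

    If without_attribution is True, also exclude images that require credit.
    """
    return [
        img.url
        for img in images
        if img.reuse_allowed
        and not (without_attribution and img.attribution_required)
    ]


print(reusable_images(CATALOG))
print(reusable_images(CATALOG, without_attribution=True))
```

The first call lists both images that may be reused, including the one that requires attribution; the second narrows the result to the single image that needs no credit.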
So if it can be used for protecting copyrighted images, could the
Semantic Web be a “silver bullet” that could potentially save online digital
music? Miller says no.
“Not a silver bullet, but useful gunpowder,” he said. “The systems for
Disney are going to be different than for libraries. The lowest common
denominator will be through digital rights description. Then there are the
generalizations of licenses and through rules in languages.”
In one field test, Hendler and his team at the University of Maryland
prototyped a Semantic Web application using a pager.
“A student wore a pager that would only let certain pages go through,”
Hendler said. “We designed it so that professors could page her and others
couldn’t. In a much wider sense, these could be applied to metadata that
identifies a query. Not just an IP address but a certain type of criteria to
let me know this person is who they say they are.”
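The pager rule Hendler describes amounts to filtering on metadata about the sender. The Python sketch below shows that idea under loud assumptions: the sender identifiers, roles and allowed list are all invented, and a hard-coded dictionary stands in for descriptions that a real system would gather from the Web.

```python
# Hypothetical metadata describing who each sender is; identifiers are invented.
SENDER_ROLES = {
    "prof_a": "professor",
    "prof_b": "professor",
    "classmate_c": "student",
}

# The student's (assumed) preference: only professors may page her.
ALLOWED_ROLES = {"professor"}


def should_deliver(sender, roles=SENDER_ROLES, allowed=ALLOWED_ROLES):
    """Deliver a page only if the sender's described role is on the allowed list."""
    # Unknown senders have no role on record, so they are rejected.
    return roles.get(sender) in allowed


print(should_deliver("prof_a"))       # True
print(should_deliver("classmate_c"))  # False
print(should_deliver("stranger"))     # False
```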