Yahoo Wants Sites to Play Ball on Semantic Web

NEW YORK — Looking at the development of the search industry over the
last 10 years, it’s remarkable both how much has changed and how much has
stayed the same.

In his keynote address here on the final day of the Search Engine
Strategies conference, Andrew Tomkins, chief scientist of Yahoo’s (NASDAQ:
YHOO) search division, described an industry at a tipping point. The search
engines are only now beginning to adapt to the explosion of content and the
increasing complexity of the tasks people perform on the Internet, he said.

“We’ve been doing a lot of thinking at Yahoo about where search is
going,” Tomkins said. “There’s a lot more change now than, really, since the
early days of the mid-’90s.”

Tomkins expounded on the goals behind Yahoo’s recent pledge of support
for several Semantic Web standards and open APIs for developers to
annotate search results.

With such efforts, Yahoo seeks to address the fact that Internet
searchers increasingly have more advanced needs, and a simple results page
of ten blue text links often doesn’t cut it, Tomkins said.

Yahoo’s vision is for its search engine to be able to understand the task
that the searcher is trying to accomplish.

Achieving task-focused search, Tomkins said, will require search engines
to work with Web publishers to index more of the content buried deep within
their sites. Yet Web publishers also stand to benefit from providing more
detailed, actionable search results.

“Content consumption is fragmenting,” Tomkins said. “Nobody owns more
than 10 percent of page views; 10 percent happen on Yahoo, and that’s the
top.”

Yahoo’s immediate plans call for presenting search results with
abstracts — instead of a plain link with a sometimes vague, often truncated
page title. This way, users will have an idea of what to expect from a page
before they click on a link, he said.

Tomkins described how an abstract might help in searching for a
restaurant. Instead of simply displaying the restaurant’s name and link on
the results page — along with nine other Web sites with similar page
names — an abstract might display the restaurant’s phone number, address, a
photo and a user rating of one to five stars.

Such types of results might already be familiar to users of the major
search engines. Google’s Universal Search initiative yields results that
include images, video and other formats, and regularly adds addresses, phone
numbers and maps to its listings.

MSN Live has also broadened the results it produces — the query “digital camera,” for instance, will yield images and starred reviews, in addition to standard text links.

But working with publishers to add metadata to content could yield even deeper information, Tomkins said.

Newspaper publishers could tag articles with content synopses that would
appear in the abstract on a results page. Searching for a medical term might
produce an abstract containing summary information from WebMD, for instance.

Yahoo has already partnered with LinkedIn, so the information contained
in a public profile can turn up in the abstracts that show up on a people
search.

“This is the naïve beginning of trying to nibble into task-focused
search,” Tomkins said, describing the mission of what Yahoo calls its Open
Search Platform.

Simplifying the task of understanding a user’s intentions while searching
and delivering deep, relevant information ties into Yahoo’s Semantic Web
efforts.

“A lot of the value of the ecosystem comes when we have an understanding
of the semantics of the content,” Tomkins said.

In working with publishers to change how search results are presented,
Yahoo has explored several ways to index the depth of content buried within
the so-called “long tail” of the Internet.

One way would be for publishers to embed Resource Description Framework
(RDF) markup directly into their sites, using either the W3C’s RDFa
approach or eRDF, an alternative HTML syntax. Such markup would tag content
items like user reviews so that the search engine can find them.
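To make the idea concrete, here is a minimal sketch of how a crawler might pull tagged values out of a page annotated in the RDFa style. The sample markup, the "rev:" vocabulary prefix and the property names are assumptions chosen for illustration; they are not taken from Yahoo's documentation, and the parser handles only flat, non-nested elements.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: a user review annotated with RDFa-style
# "property" attributes. The vocabulary and property names are
# illustrative assumptions only.
SAMPLE_HTML = """
<div xmlns:rev="http://example.org/review#" typeof="rev:Review">
  <span property="rev:itemName">Luigi's Trattoria</span>
  <span property="rev:rating">4</span>
  <span property="rev:text">Great pasta, slow service.</span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text of elements that carry a 'property' attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop is not None:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_properties(html):
    """Return a dict mapping property names to their stripped text."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_HTML).items():
        print(name, "=", value)
```

A full RDFa processor would also resolve prefixes to URIs and track subjects and nesting; this sketch shows only the basic step of turning annotated markup into name-value pairs that a results page could display.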

Yahoo is also supporting several microformats, as well as Semantic Web
vocabularies such as Dublin Core and Creative
Commons. Then, too, publishers can set up RSS or Atom feeds to automatically
package and send out their metadata to Yahoo’s search engine for indexing.
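As a rough illustration of the feed route, the sketch below reads an Atom feed whose entries carry Dublin Core elements and collects the metadata a search engine could index. The feed contents, the example.com URL and the entry_metadata helper are hypothetical; the article does not describe the format Yahoo actually expects.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for Atom and the Dublin Core element set.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical publisher feed, invented for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Example Gazette</title>
  <entry>
    <title>City Council Approves Budget</title>
    <link href="http://example.com/news/budget"/>
    <summary>The council voted 7-2 to approve next year's budget.</summary>
    <dc:creator>Jane Doe</dc:creator>
    <dc:subject>Local government</dc:subject>
  </entry>
</feed>
"""


def entry_metadata(feed_xml):
    """Return one metadata dict per Atom entry in the feed."""
    root = ET.fromstring(feed_xml)
    records = []
    for entry in root.findall("atom:entry", NS):
        link = entry.find("atom:link", NS)
        records.append({
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "url": link.get("href") if link is not None else None,
            "summary": entry.findtext("atom:summary", default="", namespaces=NS),
            "creator": entry.findtext("dc:creator", default="", namespaces=NS),
            "subject": entry.findtext("dc:subject", default="", namespaces=NS),
        })
    return records


if __name__ == "__main__":
    for record in entry_metadata(SAMPLE_FEED):
        print(record)
```

The summary field here plays the role of the content synopsis a newspaper publisher might supply for display in a search abstract.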

If Yahoo is able to entice Web publishers into following its lead, the effort could help both parties catch up with the new ways people have come to depend on the Internet, Tomkins said.

“The complexity of the tasks we take on is just through the roof,” he
said. “We see exposing semantics as being the low-hanging fruit of unlocking
that value.”
