RealTime IT News

Yahoo Wants Sites to Play Ball on Semantic Web

NEW YORK -- Looking at the development of the search industry over the last 10 years, it's remarkable both how much has changed and how much has stayed the same.

In his keynote address here on the final day of the Search Engine Strategies conference, Andrew Tomkins, chief scientist of Yahoo's (NASDAQ: YHOO) search division, described an industry at a tipping point. The search engines are only now beginning to adapt to the explosion of content and the increasing complexity of the tasks people perform on the Internet, he said.

"We've been doing a lot of thinking at Yahoo about where search is going," Tomkins said. "There's a lot more change now than, really, since the early days of the mid-'90s."

Tomkins expounded on the goals behind Yahoo's recent pledge of support for several Semantic Web standards and open APIs for developers to annotate search results.

With such efforts, Yahoo seeks to address the fact that Internet searchers increasingly have more advanced needs, and a simple results page of ten blue text links often doesn't cut it, Tomkins said.

Yahoo's vision is for its search engine to be able to understand the task that the searcher is trying to accomplish.

Achieving task-focused search, Tomkins said, will require search engines to work with Web publishers to index more of the content buried deep within their sites. Yet Web publishers also stand to benefit from providing more detailed, actionable search results.

"Content consumption is fragmenting," Tomkins said. "Nobody owns more than 10 percent of page views; 10 percent happen on Yahoo, and that's the top."

Yahoo's immediate plans call for presenting search results with abstracts -- instead of a plain link with a sometimes vague, often truncated page title. This way, users will have an idea of what to expect from a page before they click on a link, he said.

Tomkins described how an abstract might help in searching for a restaurant. Instead of simply displaying the restaurant's name and link on the results page -- along with nine other Web sites with similar page names -- an abstract might display the restaurant's phone number, address, a photo and a user rating of one to five stars.

Such types of results might already be familiar to users of the major search engines. Google's Universal Search initiative yields results that include images, video and other formats, and regularly adds addresses, phone numbers and maps to its listings.

MSN Live has also broadened the results it produces -- the query "digital camera," for instance, will yield images and starred reviews, in addition to standard text links.

But working with publishers to add metadata to content could yield even deeper information, Tomkins said.

Newspaper publishers could tag articles with content synopses that would appear in the abstract on a results page. Searching for a medical term might produce an abstract containing summary information from WebMD, for instance.

Yahoo has already partnered with LinkedIn, so the information contained in a public profile can turn up in the abstracts that show up on a people search.

"This is the naïve beginning of trying to nibble into task-focused search," Tomkins said, describing the mission of what Yahoo calls its Open Search Platform.

Both goals, understanding what a searcher intends and delivering deep, relevant information, tie into Yahoo's Semantic Web efforts.

"A lot of the value of the ecosystem comes when we have an understanding of the semantics of the content," Tomkins said.

In working with publishers to change how search results are presented, Yahoo has explored several ways to index the depth of content buried within the so-called "long tail" of the Internet.

One way would be for publishers to embed Resource Description Framework markup, also called RDF, directly into their sites -- using either W3C's RDFa approach, or eRDF, an alternative HTML syntax. Such markup would tag content items like user reviews so that the search engine can find and interpret them.
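To make the idea of embedded markup concrete, here is a minimal sketch in Python of how a crawler might read RDFa-style `property` attributes out of a page. Everything specific in it is an assumption made for illustration: the `ex:` vocabulary, the restaurant data and the `extract_properties` helper are hypothetical, and the code handles only the simplest attribute patterns rather than the full RDFa processing rules. It does not represent how Yahoo's crawler works.

```python
from html.parser import HTMLParser

# Hypothetical review markup. The "ex:" vocabulary and its URL are made up
# for illustration and are not taken from the article or from Yahoo.
REVIEW_HTML = """
<div xmlns:ex="http://example.org/vocab#" typeof="ex:Review">
  <span property="ex:itemName">Luigi's Trattoria</span>
  <span property="ex:rating" content="4">four stars</span>
  <span property="ex:phone">555-0100</span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects property/value pairs from RDFa-style attributes.

    This is a simplification, not a conforming RDFa processor.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._pending = None  # property name still waiting for its text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("property")
        if name is None:
            return
        if "content" in attrs:
            # An explicit content attribute overrides the visible text.
            self.properties[name] = attrs["content"]
        else:
            self._pending = name

    def handle_data(self, data):
        text = data.strip()
        if self._pending is not None and text:
            self.properties[self._pending] = text
            self._pending = None

    def handle_endtag(self, tag):
        # Never carry a pending property past the element that declared it.
        self._pending = None


def extract_properties(html):
    """Return a dict of property name -> value found in the HTML string."""
    parser = PropertyExtractor()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_properties(REVIEW_HTML).items():
        print(f"{name}: {value}")
```

A search engine holding these pairs would have the raw material for the kind of restaurant abstract Tomkins described: a name, a phone number and a star rating, rather than a bare page title.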

Yahoo is also supporting several microformats, as well as Semantic Web vocabularies such as Dublin Core and Creative Commons. Then, too, publishers can set up RSS or Atom feeds to automatically package and send out their metadata to Yahoo's search engine for indexing.
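The feed route can be sketched in a similar way. The Python example below builds a small Atom feed whose entries carry a Dublin Core `dc:creator` element alongside the standard Atom fields. The Atom and Dublin Core namespace URIs are the published ones; the `build_feed` helper, the article data and the example.org addresses are hypothetical, and the article does not specify the exact feed layout Yahoo expects, so treat this as one plausible shape only.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DC = "http://purl.org/dc/elements/1.1/"

# Serialize Atom as the default namespace and Dublin Core under "dc".
ET.register_namespace("", ATOM)
ET.register_namespace("dc", DC)


def build_feed(articles):
    """Return an Atom feed (as a string) describing the given articles.

    Each article is a dict with the keys id, title, updated, creator and
    summary. This layout is an illustrative assumption.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = "Example Gazette metadata"
    ET.SubElement(feed, f"{{{ATOM}}}id").text = "http://example.org/feed"
    # Timestamps in one uniform UTC format sort correctly as plain strings.
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = max(
        a["updated"] for a in articles
    )

    for article in articles:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}id").text = article["id"]
        ET.SubElement(entry, f"{{{ATOM}}}title").text = article["title"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = article["updated"]
        ET.SubElement(entry, f"{{{ATOM}}}summary").text = article["summary"]
        # Dublin Core element naming who wrote the article.
        ET.SubElement(entry, f"{{{DC}}}creator").text = article["creator"]

    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    sample = [
        {
            "id": "http://example.org/articles/1",
            "title": "City Council Approves Budget",
            "updated": "2008-01-15T09:30:00Z",
            "creator": "A. Reporter",
            "summary": "A short synopsis a results page could display.",
        }
    ]
    print(build_feed(sample))
```

The `summary` field here plays the role of the content synopsis that a newspaper publisher might want shown in a search abstract.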

If Yahoo is able to entice Web publishers into following its lead, the effort could help both parties catch up with the new ways people have come to depend on the Internet, Tomkins said.

"The complexity of the tasks we take on is just through the roof," he said. "We see exposing semantics as being the low-hanging fruit of unlocking that value."