SANTA CLARA, Calif. — The promise of the Semantic Web, where users can make
natural-language queries and get accurate, timely results, is starting to be
realized, insiders say.
But while that’s good news considering the buzz long surrounding the Semantic Web,
Scott Prevost, the principal development manager for Microsoft’s
(NASDAQ: MSFT) Bing search engine, said publishers have seen mixed results from being forced to “semanticize” content, structuring it to make it more amenable to search engines. Steps include favoring keywords and making other changes in the way content is presented to make it more attractive to search engines.
But the process can sometimes detract from meaning, flow and readability, he said.
“It’s ridiculous how some content is chock-full of keywords,” Prevost said in
his keynote here at the Web 3.0
conference. “It’s become easier to find and harder to understand.”
But now, pointing to Bing and the efforts of others, Prevost said technology is primed to meet publishers more than halfway. He said sophisticated search engines like Bing are doing a better job of understanding structure and of using context and other factors to grasp the meaning behind words, recognizing, to use a simple example, when a search for “Apple” refers to the computer company and not the fruit.
Another key area of research is the presentation of results. Google (NASDAQ: GOOG), Bing, Yahoo (NASDAQ: YHOO) and a host of others are all scrambling to move beyond the classic “ten blue links” of search results on a page to provide more relevant information and entry points, reducing the number of clicks users have to go through before they’ve gotten the information they seek.
Part of that comes down to providing visual results — like a selection of cameras and pricing — in response to someone shopping for a camera. But even providing the right amount of text in the teaser or caption that accompanies a link result requires a lot of science.
Prevost noted, for example, that a result for a health-related search might show
the text: “85 percent of the population suffers from omega 3 fish oil …”
But “deficiency,” the one extra word dropped from the abbreviated search result, changes the sentence’s whole meaning. He noted that in many Bing results, users have the option to hover their mouse pointer over the line to see more of the content, like an extra paragraph and additional relevant links from that page, without
having to click through to it.
Near term, Prevost said Bing is making the most progress related to semantic search, working with vertical areas that have a lot of structure already. For example, last week Bing announced that it’s pulled together results from a number of popular recipe Web sites to better address recipe-related queries.
“People need to see the benefit, and then we can get to critical mass. I think we’re just about there,” Prevost said. In the case of recipes, he said the Web site publishers were willing to share how they structure their data, “because it really pays off to have their results presented in a more coherent way. Publishers are willing to do more to make the ecosystem work.”
Prevost said that while search engines like Bing and others will do the “heavy lifting” of identifying the best results, the work will remain a joint effort with Web sites that curate information and structure content in ways that help the search engine.
On the other hand, Bing and others are moving quickly to provide real-time results from social networks and microblogging sites like Twitter, where the search engine definitely has to do most of the heavy lifting.
“It’s an area where semantic understanding is really more important because you don’t get much of a signal from a 140-character message,” Prevost said.
A Kosmix solution
Kosmix is one of many search players that offer new approaches to presenting results. Like Bing with its recipe effort, Kosmix has been categorizing types of results for quite a while.
“We’ve been going after the Semantic Web for a couple of years ahead of others,” Abhishek Gattani, a principal architect at Kosmix, told InternetNews.com during a session break. “The Semantic Web isn’t coming with one big bang. It’ll be up to companies like us to understand content and take the onus off the publishers of having to describe all their data.”
While search has traditionally been about ranking documents, Gattani said
it’s shifting to be more “about users and what they publish and consume, and
how authoritative their information is.”
Currently, well-funded content providers that can afford to invest in
search engine optimization and similar technology have an edge when it comes
to a presence on search engines. But Prevost said that as search engines get
smarter in what they extract, it will level the playing field, opening up
more opportunity to smaller publishers and content providers.