RealTime IT News

Future Search Will Eschew The Spider For The 'Ant'

LAKE BUENA VISTA, Fla. -- For years, enterprise search has relied on spiders crawling the Web in search of links to feed people information.

But business workers are hungry for more, said Gartner analyst Whit Andrews in a session on the future of search here Monday.

"How are we going to deal with the issue of getting access to the data that resides in the unusual, essentially dynamic sources such as business applications, such as frequently updated... databases," Andrews said.

"We have to look beyond the spider."

The limitation of the spider is one reason the Web will be colonized by Internet "ants." These ants, said Andrews, represent a change from the traditional web of information that spiders weave.

Spiders are static -- they collect an image and it's stuck.

But an ant, Andrews explained, represents a sort of colonized intelligence. Ants know the "proper path to travel to collect a fresh version of what may likely be relevant data."

Rather than crawl across a document set and store a pattern of what it finds, the ant finds the freshest morsel of data and returns it to the colony, where it can be merged with other morsels.

Andrews said this requires the search engine to spy on business applications to make determinations about where it must be prepared to go at the moment that a query is posted.

Ants will be important as the search sector moves forward -- imagine being able to use ants to extract information from your enterprise resource planning (ERP) or supply chain management (SCM) applications.

There are pros and cons, of course.

Spiders render stable but stale information. Ants yield fresh results but can also overload the applications they query, which will bog down systems.
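The trade-off can be made concrete with a small sketch. It is only an illustration of the idea as Andrews describes it, not any vendor's implementation: the `Spider` and `Ant` classes, the `inventory` dictionary and the `read_inventory` function are all hypothetical names invented here. The spider answers from a copy taken at crawl time, while the ant keeps the path to the source and calls the application on every query.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a frequently updated business application.
inventory: Dict[str, int] = {"widgets": 10}


def read_inventory(item: str) -> int:
    """Hypothetical live lookup against the application."""
    return inventory[item]


class Spider:
    """Crawls once and answers every query from the stored snapshot."""

    def __init__(self, source: Dict[str, int]) -> None:
        self.snapshot = dict(source)  # copy taken at crawl time

    def query(self, item: str) -> int:
        return self.snapshot[item]


class Ant:
    """Remembers the path to the data and fetches it at query time."""

    def __init__(self, path: Callable[[str], int]) -> None:
        self.path = path
        self.fetches = 0  # every query costs the application one call

    def query(self, item: str) -> int:
        self.fetches += 1
        return self.path(item)


spider = Spider(inventory)
ant = Ant(read_inventory)

inventory["widgets"] = 7  # the application changes after the crawl

stale = spider.query("widgets")  # answered from the old snapshot
fresh = ant.query("widgets")     # answered from the live application
print(stale, fresh, ant.fetches)
```

The `fetches` counter stands in for the load concern: every ant query lands on the application itself, so heavy query traffic becomes heavy application traffic.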

The query model is changing, too.

While the model for search query interpretation has historically been to take a query, apply a relevancy method to it and then return results, queries will eventually be expanded to multi-stage processes in which relevant information can be fed back into the system to provide refined or expanded results.
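A rough sketch of what a multi-stage query might look like follows. It assumes a toy relevancy method (counting shared terms) and a simple feedback rule (add the terms of the top result to the query); the document set and the function names `score`, `search` and `multi_stage_search` are hypothetical, and real engines use far richer relevancy models.

```python
from typing import List, Set

# Hypothetical document set for illustration.
DOCS: List[str] = [
    "supplier invoice overdue payment",
    "invoice payment terms net thirty",
    "warehouse shipment delayed",
]


def score(terms: Set[str], doc: str) -> int:
    """Toy relevancy method: number of query terms found in the document."""
    return len(terms & set(doc.split()))


def search(terms: Set[str], docs: List[str]) -> List[str]:
    """Single-stage model: apply relevancy to the query and return matches."""
    ranked = sorted(docs, key=lambda d: score(terms, d), reverse=True)
    return [d for d in ranked if score(terms, d) > 0]


def multi_stage_search(query: str, docs: List[str], stages: int = 2) -> List[str]:
    """Feed terms from the top result back into the query, then search again."""
    terms = set(query.split())
    results = search(terms, docs)
    for _ in range(stages - 1):
        if not results:
            break
        terms |= set(results[0].split())  # feedback step
        results = search(terms, docs)
    return results


single = search({"overdue"}, DOCS)
multi = multi_stage_search("overdue", DOCS)
print(len(single), len(multi))
```

In this toy data the single-stage query for "overdue" matches only the first document, while the second stage also surfaces the payment-terms document because the first result contributes "invoice" and "payment" to the expanded query.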

This multi-stage relevancy approach dovetails with corporate vendors' current path for service-oriented architectures (SOA), Andrews explained.

Enterprise search vendors lighting the path to innovation include Autonomy, Endeca and Fast.

Autonomy likes to offer holistic views of content objects; Endeca's strength is navigation; and Fast boasts a simple architecture and broad relevancy.

But the analyst cautioned us not to count out Google, whose OneBox relevancy algorithm currently operates in a sealed box with limited profiling and a rigid API.

Google's Search Appliance could make some interesting advances if it allows relevancy alteration modules to touch its OneBox API, Andrews said.

Google will eventually allow enterprises to incorporate a wide variety of dynamic data sets, including "live" sources that have been previously unavailable to Google.

For now, Google's bread and butter is clearly still serving the consumer.

The search giant Monday agreed to acquire online video purveyor YouTube for $1.6 billion in stock.