Enterprise Search And Destroy

Michael Hickins

Reporter’s Notebook: New government regulations often spawn whole new markets. A far-reaching reform of the Federal Rules of Civil Procedure (FRCP) is proving to be no exception.

The reform means that electronic documents in all forms, including e-mail,
instant messages and even transcripts of video conference and VoIP
calls, are fair game for litigants during the discovery
phase of a lawsuit. This has given enterprise search application vendors
significant new momentum.

However, allowing customers to find relevant documents among
billions of pages is no mean feat: The CEO of one vendor said he told a
prospective customer that it would take between 200 and 300 servers (in
addition to the ones already in place) to index some 300 terabytes of data.
He was informed that his competition had quoted somewhere in the range of
1,000 servers.

According to IDC analyst Sue Feldman, the field of electronic document
search and retrieval has “gone from practically zilch three years ago” to
achieving 30 percent growth over the past year, in large part thanks to the
changes to the FRCP.


“Text mining is growing at an enormous rate,” she told me.

No wonder that established enterprise search application vendors like
Autonomy, Recommind and Exalead have introduced new products or tweaked
existing software to meet demand for the efficient storage and retrieval of
electronic communications.

The reform has also given new life to specialized
vendors, such as Toronto-based Nstein, Palo Alto, Calif.-based Attensity,
Attenex, based in Seattle, and Nexidia, based in Atlanta.

The reform, which took effect on Dec. 1, has also spawned a rash
of new partnerships and acquisitions between traditional enterprise storage
companies, such as IBM, EDS, EMC and CA on the one hand, and the document
retrieval specialists on the other.

For instance, Exalead CEO Alain Heurtebise told me that his Paris-based company closed a deal with EDS’s Italian subsidiary in 2006, and is being considered by IBM as an OEM partner for a storage and e-discovery application.

E-mail storage vendor Zantaz also recently bolstered its feature set with
the acquisition of data-classification vendor Singlecast.

Paris-based business intelligence vendor Business Objects picked Attensity to be its search partner in November,
while Nstein has recently signed deals with Cognos and Computer Sciences Corp.

E-discovery application vendors promote their own special
sauce for allowing corporate lawyers to sift through reams of data while
de-duping and otherwise reducing irrelevant search results.

The challenge with searches in this context is that it isn’t easy to know
what you’re looking for. For instance, legal beagles wouldn’t have known to
search for terms such as “manipulate the California energy market” at Enron
without the benefit of hindsight.

According to Heurtebise, Exalead addresses this problem through what he
called “serendipitous search,” which allows customers to refine or redirect
their searches based on an initial set of results. “The initial result sets
could give you intelligence and insight about the right question to put. If
you’re ignorant about what you’re looking for, you’re obliged to go by
serendipity,” he said.

Autonomy, based in San Francisco and Cambridge, England, has taken a more
holistic tack, and has married its enterprise search application with a
corporate-compliance module that analyzes enterprise information
repositories in real time, flags high-risk content and behaviors that may
violate compliance policies, and provides an audit trail.

San Francisco-based Recommind uses a technology called probabilistic latent
semantic analysis, which looks at words in relationship to each other in
context. The newest version of its search platform includes the ability for
customers to lock down given documents so that they cannot be edited or
destroyed.

The company has also made the application more attractive to potential
partners by improving its interoperability with other document-management
application vendors.

“It’s OEM-ready,” in the words of Craig Carpenter, the company’s vice
president of marketing.

A more specialized application from transcription software vendor Nexidia
lets users search voice files (from VoIP calls, for instance) using
phonemics, which recognizes differences in regional accents (like the Boston
accent that drops the “r” in car).

The market for these products is growing so quickly because of the key role
played by discovery in the U.S. legal system: It forces plaintiffs and
defendants to put their cards out on the table, forcing a settlement one way
or the other. As a result of this process, fewer than 2 percent of federal
cases ever go to trial, according to Cliff Shnier, vice president of the
litigation group at Aon Consulting.

Companies that fail to provide these documents in a timely manner face the
wrath of the courts: Morgan Stanley was saddled with a $1.45
billion judgment, in large part because the judge in the case instructed
the jury to take the broker’s failure to provide electronic documents as
proof that the missing information was damaging to its case.

Several courts had already issued rulings with regard to electronic
documents, but awards of the magnitude of the Morgan Stanley case prompted
Congress to amend the FRCP to give those precedents force of law.

Changing the FRCP isn’t something that happens very often; “since 1938 [when
the FRCP was initially passed], the number of times it has been amended can
be counted on the fingers of both hands,” Shnier said.

The extent to which these rules apply to a company depends in large part
on the industry in which it operates. For instance, brokers and other
financial institutions are required to store voice transcripts, while
construction companies are not.

Shnier added that companies are being encouraged to use technology in order
to reduce the costs associated with discovery, and to make searching more
productive.

“The courts have smiled on the idea of using search technology to cull this
down,” he said.

Moreover, companies don’t have to maintain all their documents forever.
According to Shnier, “it’s highly recommended that
companies have records retention policies” mandating the purging of
documents after a given number of days, unless they are deemed potentially
discoverable.

“When there’s a potential for litigation, the duty to preserve does arise,”
he noted. That said, “an advisable retention policy gets rid of all inactive
documents.”
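
For readers who want to see the rule Shnier describes in concrete terms, here is a minimal sketch in Python. It assumes a 90-day retention window and a simple per-document legal-hold flag; the `Document` class, its field names and the `documents_to_purge` function are all hypothetical and are not drawn from any vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    """Hypothetical record; field names are illustrative assumptions."""
    doc_id: str
    last_active: date            # last day the document was in active use
    on_legal_hold: bool = False  # True if deemed potentially discoverable


def documents_to_purge(docs, today, retention_days):
    """Return documents that are past the retention window and not on hold."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in docs if d.last_active < cutoff and not d.on_legal_hold]


docs = [
    Document("memo-1", date(2006, 1, 10)),                      # inactive, no hold
    Document("memo-2", date(2006, 1, 10), on_legal_hold=True),  # inactive, but held
    Document("memo-3", date(2006, 11, 20)),                     # still recent
]

# The 90-day window is an assumed value, not a legal recommendation.
purge = documents_to_purge(docs, today=date(2006, 12, 1), retention_days=90)
print([d.doc_id for d in purge])  # ['memo-1']
```

The point of the hold flag is that the duty to preserve overrides the purge schedule: an old document stays put once it has been flagged as potentially discoverable.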
