Scrubbing Content Metadata

Officials at content collaboration software maker Workshare unveiled an
update to the company's metadata scrubbing software Monday, promising an end
to embarrassing leaks of private information.

Content metadata bears only a passing resemblance to the type of metadata
most people are aware of. Web metadata consists of keywords associated with a
Web site. By contrast, content metadata tracks a wide variety of information
about documents: previous revisions, hidden text, user e-mail addresses,
server names and routers. In all, there are about 18 to 25 different types of
metadata found in most Microsoft consumer and business products, particularly
its Office software suite.

According to Workshare officials, collaboration via e-mail is becoming one
of the biggest new trends in the workplace. When a group
effort is required on a document, many employees will e-mail the latest revisions
rather than use a corporate portal.

Workshare officials pointed to figures from research firms that back up
their contention. Gartner Group predicts collaboration in the enterprise
will grow from 20 percent to 70 percent. IDC showed that 3.47 trillion
business e-mails were sent in 2003, a figure that will grow 40 percent yearly.

“It’s a dramatic shift, where 10 years ago we sat down and wrote documents
by ourselves,” said Amy Millard, Workshare vice president of marketing.
“Today, the main method of communication for documents is e-mail,
and, over time, what is contained in those documents grows.”

The problem, Millard said, is when companies send those documents outside
the workplace, into the hands of people who might not have the best
interests of the company in mind.

Take, for example, the law firm Boies, Schiller & Flexner, which
represents the SCO Group in its $5 billion lawsuit against IBM.

When the law firm announced that other companies were being named in a
related lawsuit, Millard said, a reporter was able to look through the
content metadata and find the names of other companies against which the firm
was intending to file a lawsuit. That wouldn’t have happened, she said, if the
law firm had been using Protect 3.0, which allows network administrators to
set guidelines on what content metadata found in Microsoft Word, Excel and
PowerPoint can leave the corporate intranet.

The same technology also removes metadata found in the Microsoft Outlook,
Lotus Notes and Novell GroupWise e-mail clients. Protect 3.0 also includes a
risk report feature, which scans a document and shows the user what metadata
it contains.

It should come as no surprise that Workshare’s software is particularly
popular in the legal community. According to officials, the company’s
software is used by 98 percent of the Top 250 law firms in the United States,
as well as by many major corporations, such as Microsoft, Wells Fargo Bank,
KPMG, Ingersoll-Rand and Nokia.

With pre-set policies in place governed by outgoing e-mail rules established
by the administrator, all Word, Excel and PowerPoint metadata is
automatically “scrubbed” from the document before it leaves the company.
The administrator can also set the policies so that every outgoing document
is converted to a read-only PDF file, which will remove the possibility of
anyone reading the metadata in a company document.
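
As a rough illustration of the policy-driven scrubbing described above, the
following Python sketch models a document as body text plus a dictionary of
metadata, and a policy as the set of metadata types allowed to leave the
company. It is not Workshare's implementation or API. The names Policy,
Document, risk_report and scrub, along with the metadata field names, are
hypothetical and chosen only to mirror the examples in this article.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy: metadata types that may leave the company.
    # Anything not listed here is removed from outgoing documents.
    allowed: frozenset = frozenset()


@dataclass
class Document:
    # Hypothetical document model: visible text plus named metadata fields.
    body: str
    metadata: dict = field(default_factory=dict)


def risk_report(doc):
    """Return the sorted names of non-empty metadata types in the document."""
    return sorted(name for name, value in doc.metadata.items() if value)


def scrub(doc, policy):
    """Return a copy of the document holding only policy-approved metadata."""
    kept = {k: v for k, v in doc.metadata.items() if k in policy.allowed}
    return Document(body=doc.body, metadata=kept)


# Illustrative data only; field names follow the metadata types named above.
draft = Document(
    body="Quarterly summary",
    metadata={
        "previous_revisions": ["draft 1", "draft 2"],
        "hidden_text": "internal note",
        "author_email": "user@example.com",
        "server_name": "",
    },
)
policy = Policy(allowed=frozenset({"author_email"}))
outgoing = scrub(draft, policy)
```

Because the policy is applied by a function rather than by the author, the
sender does not have to remember to remove anything, which is the design
point Millard makes below. The original draft is left unchanged, so the
internal copy keeps its revision history.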

“It’s essentially security software, and to be successful it has to not ask
the end user to remember to do something,” Millard said. “People have the
best of intentions, but it’s a burden to require your end users to enact
your metadata policies.”

The company also launched a resource site for companies and individuals
looking for vendor-neutral information on metadata security, called
Metadatarisk.org, featuring white papers, best practices and the latest news
on the topic.
