The continuing growth of the backup market coupled with the need to archive massive e-mail repositories has given rise to a new branch of storage — deduplication.
With messaging storage growing at 25 percent a year, and storage costs declining by about 20 percent a year, “We are heading towards the concept of ‘bottomless e-mail,’ with no hard constraints on mailbox size or requirements to purge older data,” says Donald Leaman, a consultant at GlassHouse Technologies. “Due to these factors, combined with the ability to greatly reduce the volume of messages backed up each week, the deduplication space should boom over the next one to three years and beyond.”
Already, there are numerous vendors offering products in this area. In addition to the usual suspects such as EMC, CA and Symantec, companies such as ADIC (in the process of being acquired by Quantum), Zantaz, Data Domain, Asigra, Diligent, Sepaton, FalconStor, Avamar, Atempo (courtesy of the acquisition of Storactive) and TimeSpring are also active in deduplication.
GlassHouse divides them into two broad categories: targets for anybody’s backup software, such as Data Domain, Diligent, Sepaton and FalconStor; and replacements for backup software, such as Avamar, Atempo and TimeSpring.
“With more than a dozen players in the vendor space, determining the right fit for your organization is a daunting task when each vendor insists that ‘we do it better than the other guys,’” says Leaman.
Weighing in with some advice to further differentiate the various players is GlassHouse’s resident backup expert, Curtis Preston. He suggests that Data Domain has been in the market the longest and has the most real customers (just under 400 worldwide), though Avamar is the oldest in the second category. That said, he stresses that Sepaton is the most scalable and probably the fastest, while FalconStor has become everybody’s virtual tape library (VTL) by virtue of the fact that it is OEM’d by just about everyone.
Data Domain co-founder and vice president of product management Brian Biles says his company stands out by offering an appliance array target system for traditional enterprise backup software — with VTL and replication (for WAN vaulting) software options. He says it is the simplest and most aggressive dedupe tool, as well as being the least disruptive to current infrastructure.
“Vendors like Diligent, Avamar and Asigra all break data into smallish, variable-sized segments and compare these for redundancy,” says Biles. “Other vendors now entering the space use a less-aggressive framework with more constraints, so they will tend to use more disk.”
He says some of these tools offer only modest benefits, not much more than are already offered by conventional backup software. Since Data Domain uses a smaller average segment than most, he says, it achieves the maximum possible redundancy elimination benefit.
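The segment-and-compare approach Biles describes can be illustrated with a minimal sketch. It is not any vendor's implementation: the function names (split_segments, store_backup, restore), the 16-byte window, and the use of CRC32 as a boundary fingerprint are all assumptions chosen for illustration. Boundaries are picked from the content itself, so an insertion in the middle of a backup disturbs only the segments near it, and each unique segment is stored once under its SHA-256 digest.

```python
import hashlib
import random
import zlib
from typing import Dict, List


def split_segments(data: bytes, avg_size: int = 64, window: int = 16) -> List[bytes]:
    """Cut data into variable-sized segments at content-defined boundaries."""
    segments = []
    start = 0
    for end in range(window, len(data) + 1):
        if end - start < window:
            continue  # enforce a minimum segment length of `window` bytes
        # Fingerprint of the last `window` bytes. Recomputed at each position
        # for clarity; a real system would likely use a rolling hash instead.
        fingerprint = zlib.crc32(data[end - window:end])
        if fingerprint % avg_size == 0:  # roughly one boundary per avg_size bytes
            segments.append(data[start:end])
            start = end
    if start < len(data):
        segments.append(data[start:])  # trailing bytes after the last boundary
    return segments


def store_backup(data: bytes, store: Dict[str, bytes], avg_size: int = 64) -> List[str]:
    """Add a backup to the shared store; return the digests needed to rebuild it."""
    recipe = []
    for segment in split_segments(data, avg_size):
        digest = hashlib.sha256(segment).hexdigest()
        store.setdefault(digest, segment)  # a segment seen before is not stored again
        recipe.append(digest)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild a backup by concatenating its segments in order."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Hypothetical data: two weekly backups that differ by one inserted message.
    rng = random.Random(0)
    week1 = bytes(rng.getrandbits(8) for _ in range(20000))
    week2 = week1[:10000] + b"one new message" + week1[10000:]
    for avg_size in (64, 1024):
        store: Dict[str, bytes] = {}
        recipes = [store_backup(backup, store, avg_size) for backup in (week1, week2)]
        assert restore(recipes[1], store) == week2
        stored = sum(len(segment) for segment in store.values())
        print(avg_size, len(week1) + len(week2), stored)
```

Running the script prints, for each segment-size setting, the total bytes backed up and the bytes actually kept in the store. A smaller average segment usually confines a change to fewer bytes, which is the benefit Biles claims, though it also means more segments and digests to index.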
While Data Domain may have the early lead, it had better not rest on its laurels. Sepaton, for example, recently announced new deduplication technology it says can reduce storage capacity needs by a factor of 25 without affecting backup performance. This function has been added to Sepaton’s VTL appliances.
Meanwhile, Asigra has incorporated deduplication technology into its Televaulting Remote Office/Branch Office backup software.
“It is both local and global deduplication, and it is agentless, which makes it unique in the marketplace,” says Eran Farajun, senior executive vice president at Asigra.
What does he think of Data Domain and others that focus on dedupe as a major product?
“While various companies tout generic deduping products for both primary and secondary storage, Asigra believes that deduping is merely a feature in a product, and not a product in and of itself,” says Farajun. “These ‘one-feature-wonder’ companies’ technologies will be acquired and incorporated into an existing product to improve it.”
He reckons these acquisitions will occur within the next 12 to 18 months. He points to Rocksoft as the first casualty (acquired by ADIC), and claims continuous data protection (CDP) specialists will suffer a similar fate.
Data Domain’s Biles counters by classifying Asigra as merely clientless backup software with an added dedupe function available for servers. He suggests that Asigra deals with a smaller amount of data than normal backup software and is really for remote offices only. Further, he believes it requires augmentation or replacement of the existing infrastructure.
Not for Everyone
According to Biles, the dedupe market remains relatively small at less than $100 million this year. His expectation is that it will exceed $1 billion by the end of the decade.
“Dedupe is needed most in datacenters that want to minimize tape infrastructure, restore from disk, and vault over a WAN instead of trucking tapes,” says Biles. “A variant of this is remote offices; here, admin is unreliable and WAN bandwidth is small. Dedupe allows use of much smaller disk arrays and WANs, so business cases can be made against tape on a level footing.”
On the downside, he admits that this technology would not be of much value in any environment that is random or changing constantly, such as storing video feeds from surveillance cameras.
“Dedupe does not help conserve disk space here,” says Biles. “Similarly, GIS images from satellites are primarily non-repeating, so even though it’s a lot of data, this technology is not a good fit.”
GlassHouse’s Leaman agrees with Biles. He thinks most medium-to-large organizations should be considered prime candidates for e-mail archiving and deduplication.
“Significant administrative and logistical overhead are associated with large Exchange database stores, including timely backup and recovery, and the ability to defragment/compact the database at regular intervals,” Leaman says. “Certainly, any organization that is subject to regulatory compliance would be hard-pressed to find a reason not to include such technology as a part of their overall storage strategy.”
Preston suggests that the transition from tape to disk requires dedupe.
“We have many customers who have gone, or are going, to completely disk-based solutions for both backup and DR,” Preston says. “Anyone who wants to use disk needs deduplication. It’s essential for taking the cost out.”