July 8, 2009 – Data reduction, or capacity optimization, has succeeded in the backup/archive space (i.e., secondary storage), but applying data reduction techniques such as deduplication and/or compression to primary storage is a horse of a different color. This is why the leading vendors in data deduplication for secondary storage (e.g., Data Domain, EMC, IBM, FalconStor) are not the same players we find in the market for data reduction on primary storage.
A lot of articles have been written about primary storage optimization (as the Taneja Group consulting firm refers to it), but most of them focus on the advantages while ignoring the ‘gotchas’ associated with the technology. InfoStor (me, in particular) has been guilty of this (see “Consider data reduction for primary storage”).
In that article, I focused on the advantages of data reduction for primary storage, and introduced the key players (NetApp, EMC, Ocarina, Storwize, Hifn/Exar, and greenBytes) and their different approaches to capacity optimization. But I didn’t get into the drawbacks.
In a recent blog post, Wikibon.org president and founder Dave Vellante drills into the drawbacks associated with data reduction on primary storage (which Wikibon refers to broadly as “online or primary data compression”).
Vellante divides the market into three approaches:
--“Data deduplication light” approaches such as those used by NetApp and EMC
--Host-managed data reduction (e.g., Ocarina Networks)
--In-line data compression (e.g., Storwize)
All of these approaches have the same benefits (reduced capacity and costs), but each has a few drawbacks. Recommended reading: “Pitfalls of compressing online storage.”
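For readers new to the underlying mechanics, the capacity savings all three approaches chase come from two basic operations: storing each unique block of data only once (deduplication) and shrinking what remains (compression). Here is a minimal Python sketch of that idea; the fixed 4KB block size, SHA-256 hashing, and zlib compression are illustrative assumptions on my part, and real products differ considerably (e.g., variable-length blocks, in-line vs. post-process operation).

```python
import hashlib
import zlib

def dedupe_and_compress(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks, keep each unique block once
    (deduplication), and compress the unique blocks (compression).
    Block size and algorithms here are illustrative, not any vendor's."""
    store = {}   # block hash -> compressed block (stored once per unique block)
    recipe = []  # ordered list of block hashes needed to rebuild the data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return store, recipe

def rebuild(store, recipe) -> bytes:
    """Reverse the process: decompress and reassemble blocks in order."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)
```

With highly redundant data (say, three 4KB blocks where two are identical), the store holds only two unique compressed blocks while the recipe still references all three, which is where the capacity savings come from. The drawbacks Vellante discusses (CPU overhead, latency on the read/rebuild path) are also visible in this sketch: every read requires a lookup and a decompress.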