Object-based storage firm Cleversafe Wednesday announced the availability of the world's first 10 Exabyte storage system, which it said would allow companies to store "immense" volumes of unstructured data for Big Data analytics.
One Exabyte is the equivalent of 1 million Terabytes or 1 billion Gigabytes. One Terabyte can hold about 300 hours of video.
"Internet traffic volumes are increasing at a rate of 32 percent globally each year," said Russ Kennedy, vice president of product strategy, marketing and customer solutions for Cleversafe. "It's not unrealistic to think companies looking to mine that data would need to effectively analyze 80 Exabytes of data per month by 2015. To any company, data is a priceless component. However, it's only valuable if a company can effectively look across that data over time for trends or to analyze behavior and do it cost effectively. In its true sense, Cleversafe's limitless data storage solution is a critical foundational enable to Big Data analytics."
The storage system configuration is built with the same object-based dispersed storage technology that Chicago, Ill.-based Cleversafe has been developing since 2004. Unlike traditional RAID or replication-based storage, dispersed storage runs data through an information dispersal algorithm that breaks data into a number of pieces, or slices, each of which is then stored in a different place.
"One property is you break that data up into some number of pieces, each of which by itself is mathematically useless, so they're inherently private and secure," Chris Gladwin, Cleversafe founder, chairman and chief technology officer, explained in a video about the technology in 2007. "And then what you do is you store each of those pieces in a different location, typically on a different server. And if you want, you can put those servers in different locations. The other interesting property about information dispersal is you can perfectly recreate the data from just a portion of the slices. You don't necessarily need all those pieces of data."
A traditional storage system that relies on RAID arrays and mirroring stores multiple copies of each piece of data to ensure integrity and availability. One piece of data stored and mirrored can inflate to require more than five times its size in storage. Gladwin explained that the information dispersal technology uses a single instance of data with minimal expansion to ensure integrity and availability. According to the company, this allows customers to save up to 90 percent of their storage costs.
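The two properties Gladwin describes, slices stored separately and data rebuilt from only a portion of them, can be illustrated with a small sketch. This is not Cleversafe's algorithm, which the announcement does not detail. It is a generic polynomial (Reed-Solomon-style) dispersal over the prime field of size 257, and the names `disperse`, `recover`, `k` and `n` are hypothetical. It shows only the rebuild-from-a-subset property and the resulting expansion factor of n/k; it makes no attempt at the privacy property mentioned in the quote.

```python
from itertools import combinations

P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def disperse(data, k, n):
    """Split `data` into n slices, any k of which are enough to rebuild it."""
    if not (1 <= k <= n and n + k < P):
        raise ValueError("need 1 <= k <= n and n + k < 257")
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    data_xs = range(n + 1, n + k + 1)      # x positions that carry the data
    slices = {x: [] for x in range(1, n + 1)}
    for start in range(0, len(padded), k):
        points = list(zip(data_xs, padded[start:start + k]))
        for x in slices:
            slices[x].append(_interpolate(points, x))
    return slices


def recover(subset, k, n, length):
    """Rebuild the original bytes from any k slices (slice id -> values)."""
    if len(subset) < k:
        raise ValueError("not enough slices to rebuild the data")
    chosen = list(subset.items())[:k]
    out = bytearray()
    for pos in range(len(chosen[0][1])):
        points = [(sid, values[pos]) for sid, values in chosen]
        out.extend(_interpolate(points, dx) for dx in range(n + 1, n + k + 1))
    return bytes(out[:length])


if __name__ == "__main__":
    message = b"unstructured data"
    k, n = 3, 5  # illustrative values, not Cleversafe's configuration
    slices = disperse(message, k, n)
    # Every possible choice of 3 slices out of 5 rebuilds the message.
    for ids in combinations(slices, k):
        assert recover({i: slices[i] for i in ids}, k, n, len(message)) == message
    print(f"dispersal expansion: {n / k:.2f}x, versus 5x for five full copies")
```

With these illustrative parameters, any two of the five slices can be lost without losing data, while the stored volume is about 1.67 times the original rather than the fivefold inflation cited for mirroring.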
The 10 Exabyte data storage system configuration uses that technology to allow storage capacity and performance to scale independently through something Cleversafe calls a Portable Datacenter (PD), a collection of storage and network racks that can be easily deployed or moved. Cleversafe said that in this particular configuration, each PD contains 21 racks with 189 Storage Nodes per PD and 45 3TB drives per Storage Node. Cleversafe explained that its geographically distributed PD model allows for rapid scale and mobility and is optimized for site failure tolerance and high availability.
The company's current configuration includes 16 sites across the US with 35 PDs per site and hundreds of simultaneous readers/writers.
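The configuration figures above can be multiplied out to check the headline number. The short calculation below uses only the counts quoted in the article and the decimal units defined earlier (1 Exabyte = 1 million Terabytes); the variable names are illustrative.

```python
TB_PER_EB = 1_000_000  # decimal units, as defined in the article

nodes_per_pd = 189
drives_per_node = 45
tb_per_drive = 3
pds_per_site = 35
sites = 16

drives_per_pd = nodes_per_pd * drives_per_node   # 8,505 drives per PD
raw_tb_per_pd = drives_per_pd * tb_per_drive     # 25,515 TB per PD
total_pds = pds_per_site * sites                 # 560 PDs in total
raw_tb_total = raw_tb_per_pd * total_pds         # 14,288,400 TB raw
raw_eb_total = raw_tb_total / TB_PER_EB

print(f"raw capacity: {raw_eb_total:.2f} EB across {total_pds} PDs")
# prints "raw capacity: 14.29 EB across 560 PDs"
```

The raw total of about 14.3 Exabytes is larger than the advertised 10 Exabytes. If the 10 Exabyte figure refers to usable capacity, the gap would correspond to an expansion factor of roughly 1.4, but that is an inference from the numbers rather than something the announcement states.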
"The exponential growth of unstructured data has brought the storage industry to a turning point," said David Reinsel, IDC group vice president of the Storage, Semiconductor, Pricing and GRC Infrastructure Group. "In order for companies to continue to protect their data assets and to extract value from their vast vaults of stored data, they must begin looking at technology alternatives beyond RAID in order to scale without limits. Cleversafe's ability to deliver a system capable of storing Exabyte levels of data today is a good example of the undertaking required to move forward and to solve the growing global issue of unstructured data."