By Kevin Komiega
A number of open source storage projects have popped up in recent months, most with a focus on developing heterogeneous storage management software. But the latest group to come on the scene is not aimed at building out the SMI-S storage management specification or creating low-cost backup and recovery tools. The Cleversafe project is challenging the conventions of traditional data storage with an entirely new approach to how companies and individuals store, encrypt, and manage information.
Cleversafe’s officials make some fairly impressive claims. The technology under development enables the use of low-cost, commodity hardware to store data; capacity scaling is theoretically limitless; and data loss due to outages is a near impossibility. So how do they do it, and why hasn’t anyone thought of it before?
Greg Rudin, Cleversafe vice president of sales and marketing, says the rapid adoption of high-speed Internet connectivity makes the concept of dispersed storage a viable alternative to traditional storage architectures.
“There is an antiquated approach being applied to storage today. The current methodology is based on storing multiple copies of the same data,” says Rudin. “You need ubiquitous adoption of high-speed connections to leverage the public Internet to move and store massive amounts of data. We are just now reaching that technology threshold.”
The Cleversafe project is based on the use of information dispersal algorithms (IDAs) and a grid architecture to divide data into encrypted “slices” instead of making multiple copies of the same information.
In its current state Cleversafe’s technology uses basic algebra to carve data into 11 encrypted pieces, each containing less than 10% of the original data. Each piece is then stored at a different remote data center. Rudin says this technique is inherently secure and reliable.
“This technology has been around since the 1970s and has been used to store things as sensitive as weapons launch codes. We’re utilizing that methodology on a mass scale to provide cost-effective storage,” Rudin claims. “The information is stored on 11 different servers in as many locations. We could lose up to five of those locations simultaneously and still recover the data.”
He adds that the data is secure even in the event of a security breach. A hacker accessing a subset or slice of data at one location would not have visibility into the rest of the data slices at the other 10 remote sites.
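The arithmetic Rudin describes, 11 slices of which any five can be lost, can be illustrated with a small threshold-dispersal sketch. The code below is not Cleversafe's implementation, which the article does not detail. It is a minimal Rabin-style example that assumes a threshold of six slices (11 minus the five that may be lost), works over the integers modulo 257, and leaves out the encryption step entirely. The names `disperse`, `recover`, and `solve_mod` are hypothetical and chosen only for illustration.

```python
# Illustrative 6-of-11 information dispersal sketch (not Cleversafe's code).
# Assumption: arithmetic is done modulo the prime 257 so that every byte
# value (0..255) is a valid field element. Encryption is omitted.

PRIME = 257
N_SLICES = 11   # number of storage sites, per the article
THRESHOLD = 6   # slices needed to rebuild (11 minus 5 tolerated losses)


def disperse(data: bytes, n: int = N_SLICES, k: int = THRESHOLD):
    """Split data into n slices; any k of them can rebuild it."""
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    slices = [[] for _ in range(n)]
    for start in range(0, len(padded), k):
        coeffs = padded[start:start + k]   # k bytes = polynomial coefficients
        for j in range(n):
            x = j + 1                      # each site gets a distinct x
            y = 0
            for c in reversed(coeffs):     # Horner evaluation of the polynomial
                y = (y * x + c) % PRIME
            slices[j].append(y)
    return slices


def solve_mod(matrix, rhs):
    """Solve a k-by-k linear system modulo PRIME by Gauss-Jordan elimination."""
    k = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] % PRIME)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, PRIME)          # modular inverse (Python 3.8+)
        aug[col] = [v * inv % PRIME for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % PRIME for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def recover(available: dict, length: int, k: int = THRESHOLD) -> bytes:
    """Rebuild the data from a dict {slice_index: slice_values}."""
    if len(available) < k:
        raise ValueError("not enough slices to recover the data")
    chosen = sorted(available)[:k]
    # Vandermonde matrix for the x values of the surviving slices.
    vander = [[pow(j + 1, e, PRIME) for e in range(k)] for j in chosen]
    out = bytearray()
    for block in range(len(available[chosen[0]])):
        rhs = [available[j][block] for j in chosen]
        out.extend(solve_mod(vander, rhs))
    return bytes(out[:length])


message = b"dispersed storage demo"
slices = disperse(message)
# Simulate losing five of the eleven sites at once.
survivors = {j: slices[j] for j in (0, 2, 5, 7, 8, 10)}
assert recover(survivors, len(message)) == message
```

In this sketch the 11 slices together hold 11/6 (about 1.8 times) as many values as the original has bytes, which is the kind of saving over keeping several full copies that a dispersal scheme aims for. A production system would also need to encrypt slices and track which slice belongs to which site.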
Cleversafe launched its initial test grid last April with remote locations across the US and Canada and plans to expand the project globally in the coming months. The ultimate goal is to create a “storage Internet.”
The Cleversafe Open Source Storage Community’s core of 20 full-time software developers is located on the campus of the Illinois Institute of Technology (IIT) in Chicago.
The commercial arm of the Cleversafe project hopes to begin offering software support and services to companies later this year, but the basic IDA technology for building and operating a small storage grid is available for free under the GNU General Public License (GPL) at Cleversafe.org.
Depending on end-user interest, Cleversafe could revive the concept of the storage service provider (SSP). The first wave of SSP start-ups began to recede about the time the dot.com bubble burst. Some of them went out of business while others attempted to morph into storage management software companies.
“The Cleversafe concept could become a very disruptive model for distributing storage management software,” says John Webster, principal IT advisor at the Illuminata research and consulting firm.
Webster says Cleversafe’s grid architecture creates a virtual cloud of storage that evolves and changes over time through the addition and replacement of hardware. As long as the data is preserved, he says, the only thing end users have to worry about is whether they can read the bits.
Cleversafe does face an uphill battle. One hurdle is the healthy paranoia that leads companies to want total local control over all of their corporate data. Also, the grid architecture is not necessarily a fit for transaction-based data, where performance is paramount. But, according to Webster, Cleversafe’s technology could make hay as an alternative for long-term data retention and backup. “If you’re going to commit your data to the cloud and it essentially fulfills backup and archiving needs, why do you need to buy backup and archiving software? It seems to me that you don’t,” says Webster.