Archival Data vs. Archive

There needs to be more emphasis on a tier of storage that is rarely accessed but must be constantly available. Archival data is data that you need to keep but might not access for long periods of time, or ever. This type of data exists in many fields, from Sarbanes-Oxley compliance data to weather forecasting, Landsat satellite images, medical records, and my childhood pictures. None of this data is likely to be needed every day or even every week, but we still need to keep it.

It would be nice if there were a way to keep this data safe and secure (whether I am a home user, a company, a large site, or a government site) that scales, is easy to manage, and has a reasonable price. Archives today require people with HSM (hierarchical storage management) skills and a significant investment in hardware and software. But there is NO guarantee your data is safe, as no one I am aware of calculates the reliability of the data in the archive, which I know from firsthand experience is not easy to determine. I believe this is a problem ripe for solving. There are, of course, a number of problems that need to be addressed, such as reliability, which is really a question of liability. Lawyers will have a field day in case of data loss, and there will be some data loss. No storage medium is 100% reliable, even with multiple copies.
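To make the point about multiple copies concrete, here is a rough back-of-the-envelope sketch. It assumes each copy is lost independently with the same probability over the retention period, which real archives rarely satisfy (shared sites, shared firmware, and shared operators all correlate failures). The function names and the numbers in the usage lines are hypothetical, chosen only for illustration.

```python
def prob_all_copies_lost(p_loss_per_copy: float, copies: int) -> float:
    """Probability that every copy of one object is lost.

    Assumption: copies fail independently with the same probability
    over the retention period. Correlated failures make this optimistic.
    """
    if not 0.0 <= p_loss_per_copy <= 1.0:
        raise ValueError("p_loss_per_copy must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_loss_per_copy ** copies


def expected_objects_lost(n_objects: int, p_loss_per_copy: float, copies: int) -> float:
    """Expected number of objects with no surviving copy, under the same assumption."""
    return n_objects * prob_all_copies_lost(p_loss_per_copy, copies)


# Hypothetical figures: a 1% chance of losing any single copy, three copies,
# and an archive holding one billion objects.
per_object = prob_all_copies_lost(0.01, 3)
expected = expected_objects_lost(1_000_000_000, 0.01, 3)
print(f"per-object loss probability: {per_object:.2e}")
print(f"expected objects lost: {expected:.0f}")
```

Even under these generous assumptions the loss probability is small but never zero, and across a large archive a small per-object probability still adds up to some lost objects.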

The more I think about this problem, the more I think that we need to reset expectations for those who archive data and expect, even demand, to get it back from the archive.


Labels: archive,data storage

posted by: Henry Newman

Henry Newman, InfoStor Blogger

Henry Newman is CEO and CTO of Instrumental Inc. and has worked in HPC and large storage environments for 29 years. The outspoken Mr. Newman initially went to school to become a diplomat, but was firmly told during his first year that he might be better suited for a career that didn't require diplomatic skills. Diplomacy's loss was HPC's gain.
