My friend David Rosenthal just put together an interesting blog post to get ready for a meeting on archival storage at the Library of Congress (LOC) that we'll both attend this week (this year's presentations from Designing Storage Architectures will be up next week).
It seems that some people in the state of Utah were finding that their optical archive costs were far greater than using AWS. David clearly and succinctly states that using Amazon for long-term archives is a high-risk proposition. It is kind of sad that David has to write about why this is a bad idea, as it seems obvious to me and to most of the people I deal with who have long-term archival data. Archives have totally different requirements than systems designed for regular data access. As I said in my column this month, the issue with archives is not hardware limitations or design; software is the biggest problem for the long term.
The storage conference I help with at LOC focuses on different topics each year, but it has been a long time since the focus of the conference was hardware. Yes, we get some presentations from a variety of hardware vendors, but the challenges that seem to be the biggest problems are not hardware but software: things like file formats, data fixity and, in the past, hierarchical storage management systems. Last year and this year the new focus is object storage, both from an interface perspective and, of course, from a long-term reliability and management perspective.
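For readers unfamiliar with data fixity, the idea is simply to prove, years later, that the bits you stored are the bits you got back. Below is a minimal sketch of how such a check might work: record a SHA-256 manifest for an archive directory, then re-verify it later to detect silent corruption. The directory path, manifest filename and use of SHA-256 are my own illustrative assumptions, not any particular archive's implementation.

```python
# Hedged sketch of a fixity check: build a SHA-256 manifest for an
# archive directory, then re-verify it to catch silent corruption.
# The paths and hash choice here are hypothetical examples.
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archive objects fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's relative path to its current SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths whose digests no longer match or are missing."""
    current = build_manifest(root)
    return [name for name, digest in manifest.items()
            if current.get(name) != digest]


if __name__ == "__main__":
    archive = Path("archive")            # hypothetical archive directory
    manifest_file = Path("manifest.json")
    if not manifest_file.exists():
        # First run: record the baseline digests.
        manifest_file.write_text(json.dumps(build_manifest(archive), indent=2))
    else:
        # Later runs: compare current digests against the baseline.
        failed = verify_manifest(archive, json.loads(manifest_file.read_text()))
        print("fixity OK" if not failed else f"fixity failures: {failed}")
```

In practice, of course, the hard part is not the hashing but running checks like this reliably over decades, across media migrations and software changes, which is exactly why software is the long-term problem.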
The focus for future archives is not going to be the storage itself but meeting the requirements of the archive. AWS, as David points out, is not designed to address archive requirements, and, as he also said, it is a mystery why it would be used. We already have tiers of storage with different qualities of service, so I suspect it will not be long until we have distinct tiers for active storage and archival storage.
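To make the active-versus-archival split concrete, here is a toy sketch of the kind of policy a tiering system might apply: assign each object to a tier based on how long it has gone unaccessed. The 90-day threshold and tier names are assumptions for illustration, not any vendor's actual policy.

```python
# Toy sketch of a tiering policy: recently accessed objects stay on the
# active tier, long-untouched ones move to the archival tier.
# The threshold and tier names are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredObject:
    name: str
    last_access: datetime


ARCHIVE_AFTER = timedelta(days=90)  # assumed policy threshold


def tier_for(obj: StoredObject, now: datetime) -> str:
    """Active tier for recently touched data, archival tier otherwise."""
    return "archival" if now - obj.last_access > ARCHIVE_AFTER else "active"


now = datetime.now()
objects = [
    StoredObject("quarterly-report.pdf", now - timedelta(days=3)),
    StoredObject("census-scans-1970.tar", now - timedelta(days=400)),
]
for obj in objects:
    print(f"{obj.name}: {tier_for(obj, now)} tier")
```

The real design question, as the archive community keeps finding, is what quality of service each tier must guarantee over decades, not just where the bits sit today.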