Amazon, which helped popularize cloud computing, is now turning its attention to long-term cloud archiving with the company’s new Glacier service.
As its name suggests, Glacier is not meant for quick retrieval of files and backups. “Amazon Glacier is designed for data that is infrequently accessed, yet still important to retain for future reference, and for which retrieval times of several hours are suitable,” said the company in a brief announcement on the Amazon Web Services (AWS) site.
The company is not kidding about hours-long retrieval times.
According to Amazon, retrieval jobs can take three to five hours, and the data remains available for download for 24 hours. Glacier archives are protected by the 256-bit Advanced Encryption Standard (AES-256).
Storing data on Glacier costs $0.01 per gigabyte per month. Individual archives are capped at 40 TB. Although Amazon does not overtly market them as such, there are penalties for retrieving data too frequently from Glacier’s vaults or for deleting data less than three months old.
Each month, the service allows AWS account holders to retrieve 5 percent of their Glacier data for free, but with a catch. “The allowance is pro-rated daily,” explains Amazon. Retrieval fees above the daily allowance start at $0.01 per gigabyte. Additionally, Amazon charges $0.03 per GB for data deleted before it has been stored for 90 days.
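To make the pricing concrete, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. It is not Amazon's billing formula: the function names are made up for illustration, the 30-day month is an assumption, "pro-rated daily" is read as an even split of the monthly allowance across the days of the month, and the early-deletion charge is treated as a flat per-gigabyte fee.

```python
# Rough cost sketch based on the prices quoted in the article.
# All function names are hypothetical; this is not Amazon's billing logic.

STORAGE_PER_GB_MONTH = 0.01     # USD per GB per month (from the article)
EARLY_DELETE_PER_GB = 0.03      # USD per GB deleted before 90 days (from the article)
FREE_RETRIEVAL_FRACTION = 0.05  # 5 percent of stored data per month (from the article)


def monthly_storage_cost(stored_gb):
    """Monthly storage bill in USD for the given amount of stored data."""
    return stored_gb * STORAGE_PER_GB_MONTH


def daily_free_retrieval_gb(stored_gb, days_in_month=30):
    """Free retrieval allowance per day, in GB.

    Assumption: the monthly 5 percent allowance is split evenly
    across the days of the month.
    """
    return stored_gb * FREE_RETRIEVAL_FRACTION / days_in_month


def early_deletion_fee(deleted_gb):
    """Fee in USD for deleting data before 90 days (assumed flat per GB)."""
    return deleted_gb * EARLY_DELETE_PER_GB


stored = 12_000  # GB, a hypothetical 12 TB archive
print("Storage per month (USD):", round(monthly_storage_cost(stored), 2))
print("Free retrieval per day (GB):", round(daily_free_retrieval_gb(stored), 2))
print("Fee for deleting 500 GB early (USD):", round(early_deletion_fee(500), 2))
```

Under these assumptions, the daily allowance is a small fraction of the archive, so pulling back a large backup in a single day would exceed the free tier even if nothing else was retrieved that month.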
As expected, users can upload data over the Internet. For large archives, Amazon suggests its AWS Import/Export cloud-seeding service, a sneakernet alternative that involves shipping a portable storage device to Amazon.
As The Tape Whirs?
While Amazon is keeping quiet about the technology powering Glacier, some industry watchers like EMC data center guru Mark Twomey suspect that tape libraries are doing the heavy lifting.
Twomey writes in his Storagezilla blog, “When you’re waiting for your 3 to 5 hour time to first byte window to pass that’s a robot picker waiting for a tape drive or drives to free up so it can load the required tape volumes and start reading them off to a disk staging area from where you then download your data.”
Twomey concedes that an all-disk infrastructure is possible and that the lengthy data retrieval times could instead be attributed to aggressive compression and decryption. Despite Amazon’s silence on the issue, other cloud providers are moving aggressively to make tape systems a common sight in cloud archiving infrastructures.
In March, Fujifilm launched Permivault, a cloud archiving service where data resides on LTO5 tape drives and libraries. It’s a data-keeping strategy made viable by the Linear Tape File System (LTFS), an open tape storage format, and storage systems like Crossroads Systems’ StrongBox.
StrongBox is an appliance that sits atop tape libraries, providing buffering and caching to boost data ingest and retrieval performance. LTFS support allows storage administrators and end users to employ non-proprietary data management techniques, which eases file access and promotes data portability.