Securing and protecting archival data

How do you secure data “at rest?” Step one: Don’t confuse archiving with backup.

By George Hall

Much has been written about digital network security and securing data in transit. However, relatively little has been written about securing data at rest, specifically archival data. In addition, there is confusion surrounding the difference between archival data and backup data. Equating the two is a common misconception that can lead to difficulties in business operations or in a litigation environment.

Archival data sets continue to grow rapidly, and the requirements for easy, yet controlled and secure, access to these records have increased in complexity. New operating requirements have emerged that, if properly understood and implemented, will allow these data sets to grow exponentially without creating administrative or security nightmares.

Backup data is a subset of archival data and consists only of those items that are available to be backed up at the time each backup is accomplished. If data is deleted prior to a backup, then it is not backed up. Archival data, on the other hand, is a collection of all business records, including those that might have been deleted prior to a backup.
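The distinction can be illustrated with a small simulation. This is only a sketch: the event log, the file names, and the 23:00 backup time are hypothetical, and the "archive" here simply captures every document at creation time.

```python
# Hypothetical one-day event log: (hour, action, document name).
EVENTS = [
    (9, "create", "contract.doc"),
    (10, "create", "memo.eml"),
    (15, "delete", "memo.eml"),      # deleted before the nightly backup runs
    (16, "create", "invoice.xls"),
]


def run_day(events, backup_hour=23):
    """Return (backup, archive) as sets of document names.

    backup_hour is an assumed nightly backup time. The backup copies only
    what still exists when it runs; the archive keeps everything created.
    """
    live = set()      # documents currently on the production system
    archive = set()   # every document ever created
    for hour, action, doc in sorted(events):
        if hour >= backup_hour:
            break                     # events after the backup are ignored
        if action == "create":
            live.add(doc)
            archive.add(doc)
        elif action == "delete":
            live.discard(doc)         # the archived copy is retained
    return set(live), archive


backup, archive = run_day(EVENTS)
print("backup :", sorted(backup))
print("archive:", sorted(archive))
print("only in archive:", sorted(archive - backup))
```

In this toy model the backup is always a subset of the archive, and the deleted message survives only in the archive.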

An increasing number of business litigation cases are demonstrating the negative financial consequences of failing to operate a secure and effective archiving system. This is particularly true today, given the record-retention requirements of regulations such as Sarbanes-Oxley. Even one e-mail message that a plaintiff has, and that the defendant cannot produce, can change the outcome of the litigation. The ideal archival system should operate in the background, automatically capturing, cataloging, indexing, and securing every business document and communication without the involvement of users.

Storing and securing data

The simplest way to store and secure archival data is to segment it according to the company’s organizational structure, preserving and updating in the archival records the data relationships that exist operationally in the business. For example, each employee could be considered to be an archival unit. Many automated systems, particularly e-mail and instant messaging archiving systems, are tailored to this metric. However, this level of granularity in an overall business records archival system is likely to be cumbersome.

Beyond individual employee records, the best way to organize archival records is according to the “organizational tree” of the business. Storing digital data archives by department is the simplest way to demonstrate this.

For instance, the accounting department has a separate archive from the human resources department, but both departments report to the CFO. The best way to think about how to organize your archival data along departmental lines is to examine security requirements associated with viewing, recovering, restoring, or retrieving archival data.

For example, each department manager would hold separate data encryption keys. The accounting department manager owns key A, the human resources department manager owns key B, and the CFO owns both keys A and B because the accounting and HR managers report directly to the CFO. In some instances a system will produce a “master encryption key” for the CFO that opens all the doors under the CFO’s organizational responsibility. This is not advisable for the simple reason that encryption key management should be self-supporting. This means that two people (and no more than two people) should have access to any one key for any given data set. Fortunately, encryption keys can be easily managed, and arrangements can be made to hold them securely in a number of ways.
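The key arrangement described above can be sketched as a simple lookup. The role names, key identifiers, and helper functions below are hypothetical and exist only for illustration; no real key material or key-management product is involved.

```python
# Hypothetical org chart: each manager maps to a direct superior.
REPORTS_TO = {
    "accounting_manager": "cfo",
    "hr_manager": "cfo",
}

# One key per departmental archive, owned by that department's manager.
KEY_OWNER = {
    "key_A": "accounting_manager",
    "key_B": "hr_manager",
}


def key_holders(key_id):
    """The owner of a key plus the person the owner reports to."""
    owner = KEY_OWNER[key_id]
    return {owner, REPORTS_TO[owner]}


def keys_held_by(person):
    """All keys a given person can use, in sorted order."""
    return sorted(k for k in KEY_OWNER if person in key_holders(k))


def two_holder_rule_ok():
    """True if every key is held by exactly two people."""
    return all(len(key_holders(k)) == 2 for k in KEY_OWNER)


print("cfo:", keys_held_by("cfo"))
print("accounting_manager:", keys_held_by("accounting_manager"))
print("two-holder rule satisfied:", two_holder_rule_ok())
```

Note that the CFO holds each departmental key individually rather than a single master key, which keeps the two-holder rule checkable per key.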

This is an example of straightforward organization and management of archival data sets using encryption key management overlaid on an organizational structure. However, it fails to address the single biggest issue in the growing business litigation industry: the chain of custody. In legal circles, this term refers to the ability of an organization to demonstrate that the data requested and produced has not been tampered with from the time it was created until the time it was delivered. The fact that no employee with access to an encrypted archive data set has tampered with it is, in many cases, no defense against an aggressive attorney intent on casting doubt on the integrity of material provided from a digital archive. The purposes most often cited that require a demonstrable chain of custody include legal discovery and government reporting requirements.
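One way to make tampering detectable is a chained hash manifest, in which each entry's digest depends on every record before it. This technique is not described in this article and is offered only as a minimal sketch of the idea of demonstrable integrity. The function names are hypothetical, and a real system would also need to protect the manifest itself.

```python
import hashlib


def chain_digest(prev_digest: str, record: bytes) -> str:
    """SHA-256 of the previous digest concatenated with this record."""
    return hashlib.sha256(prev_digest.encode("ascii") + record).hexdigest()


def build_manifest(records):
    """Return one chained digest per record, in order."""
    manifest, prev = [], ""
    for rec in records:
        prev = chain_digest(prev, rec)
        manifest.append(prev)
    return manifest


def first_mismatch(records, manifest):
    """Index of the first record that fails verification, or None."""
    prev = ""
    for i, (rec, expected) in enumerate(zip(records, manifest)):
        prev = chain_digest(prev, rec)
        if prev != expected:
            return i
    if len(records) != len(manifest):
        # A record was appended or removed after the manifest was built.
        return min(len(records), len(manifest))
    return None


original = [b"invoice 1001", b"e-mail to counsel", b"board minutes"]
manifest = build_manifest(original)

altered = [b"invoice 1001", b"e-mail to counsel (edited)", b"board minutes"]
print("original verifies:", first_mismatch(original, manifest) is None)
print("altered fails at index:", first_mismatch(altered, manifest))
```

A manifest like this only detects changes. It does not prevent them, which is the stronger property the following paragraphs discuss.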

The best way to demonstrate that archival data has not been tampered with is to build digital data archives in such a way that they cannot be tampered with by any employee. This would necessarily include both the owners and creators of the data as well as the IT staff with access to the data archive. There are several methods of ensuring archival data is not modified once it has been written.

One method of ensuring that archival data is never modified is to write the data to non-modifiable media, either optical or magnetic. Optical media has several attributes that should be considered when designing an effective digital archival system. First is shelf life. Most types of optical discs have been tested extensively for shelf life and bit-error rates (BERs) over time. Some manufacturers claim a shelf life of more than 100 years, but that’s just a marketing claim. Any media that can support archival data for at least 10 years is adequate.

The other issue to consider when evaluating optical technology is performance. Optical is not as fast as magnetic media, but speed will probably not be a significant issue.
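The bit-error rates mentioned above can be put in perspective with simple arithmetic. The sketch below treats the BER as the probability that any given bit is corrupted over the retention period, with errors independent of one another. This is a simplifying assumption, and the BER value and file sizes are hypothetical, not vendor figures.

```python
import math


def expected_bit_errors(size_bytes: int, ber: float) -> float:
    """Expected number of corrupted bits in size_bytes of data."""
    return size_bytes * 8 * ber


def prob_at_least_one_error(size_bytes: int, ber: float) -> float:
    """1 - (1 - ber)^bits, computed in a numerically stable way."""
    bits = size_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))


ASSUMED_BER = 1e-12                  # hypothetical rate, for illustration
WORD_DOC_BYTES = 100 * 1024          # assumed 100 KB document
XRAY_BYTES = 50 * 1024 * 1024        # assumed 50 MB image
ARCHIVE_BYTES = 10**12               # assumed 1 TB archive

print("expected bad bits in archive:",
      expected_bit_errors(ARCHIVE_BYTES, ASSUMED_BER))
print("P(error) in one document:",
      prob_at_least_one_error(WORD_DOC_BYTES, ASSUMED_BER))
print("P(error) in one X-ray:",
      prob_at_least_one_error(XRAY_BYTES, ASSUMED_BER))
```

Under these assumptions, larger files are proportionally more likely to contain at least one bad bit. The sketch says nothing about how visible the damage would be.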

Another method of delivering unalterable media is through the use of technologies that allow magnetic disk or tape drives to behave like write-once, read-many (WORM) drives. Write-blocking, or adaptive write-blocking, blocks any further write commands to a sector on a disk drive once that sector has been written. Some products do this with software, while others do it through hardware.
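The write-once behavior can be modeled in a few lines. This is an in-memory sketch of the concept only, not a description of how any particular write-blocking product works, and the class and method names are hypothetical.

```python
class WriteOnceStore:
    """Toy model of a device that rejects rewrites of written sectors."""

    def __init__(self, sector_count: int):
        # None marks a sector that has never been written.
        self._sectors = [None] * sector_count

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < len(self._sectors):
            raise IndexError(f"sector {index} is out of range")
        if self._sectors[index] is not None:
            # The sector was already written, so the write is blocked.
            raise PermissionError(f"sector {index} is read-only")
        self._sectors[index] = bytes(data)

    def read(self, index: int):
        """Return the stored bytes, or None if the sector is unwritten."""
        return self._sectors[index]


store = WriteOnceStore(sector_count=4)
store.write(0, b"original record")

try:
    store.write(0, b"altered record")
    overwrite_blocked = False
except PermissionError:
    overwrite_blocked = True

print("overwrite blocked:", overwrite_blocked)
print("sector 0 still holds:", store.read(0))
```

The point of enforcing this in the device or driver, rather than in application code, is that neither the data's owners nor IT staff can bypass it.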

These write-blocking technologies have been submitted to the SEC for validation as an acceptable means of protecting digital archival data in conformance with the requirements of Sarbanes-Oxley legislation. Some of the technologies have been certified, and others are still pending.

BERs in magnetic/optical media over time can be a factor in the long-term quality of a digital archive. The degree to which this is an issue depends on what digital data is being archived. For instance, the loss of a bit or a byte of data in a word processing document may produce no noticeable changes in that document. However, the loss of a few bits or bytes in a high-resolution digital X-ray archive can have a much greater impact.

In addition to the technical methods available for securing and protecting archival data, there are physical means of accomplishing this that should be considered as well, such as outsourcing online, automated archiving across a WAN to an external service provider.

There are several benefits to outsourcing archival to a third party:

The elimination of the need to build and support internally an automated archival system that is separate from your backup systems;

The ability to retain control over access to the data sets through encryption, even if they are no longer in your physical possession; and

The ability of third-party archiving providers to produce records for compliance reporting, discovery requests, etc., which eliminates the costs associated with doing this internally.

An effective plan for protecting and securing archival data should include an examination of a range of issues that are different from those traditionally associated with backup and restoration of digital data. Access authority, encryption, physical location, media type, and even the content types and volumes of the archival data will all have an impact. Each business has unique archiving needs, and developing a security and protection plan for archival data includes forward thinking about a business’s future requirements.

George Hall is a member of Ridge Partners LLC, a consulting firm that specializes in storage, networking, and support services.

This article was originally published on May 01, 2006