Welcome to the wide wide world of data archiving, life beyond compliance.
With the wide wide world of archiving, life beyond compliance, the focus expands to general-purpose use across all applications – not just regulatory, healthcare or financial. For some, this may be a déjà vu moment, recalling a time when archiving was used as a general-purpose tool for managing storage space to support growth while reducing backup and data protection time (and costs). Others may see this expanded focus of archiving as a new and powerful technique that is part of data footprint reduction (DFR).
DFR is a collection of tools, technologies and techniques for reducing the impacts of an expanding data footprint in various locations, across different applications and data types. Some examples of DFR technologies, tools and techniques include archive, compression, consolidation, data management and dedupe among others.
Historically, data archiving has been used as a primary tool for reducing the amount of data in active or primary storage that needs to be preserved and protected. Today, many think of DFR as being just about dedupe, which is important; however, keep in mind that not all data can be deduped.
Additional archiving perspectives:
• Traditional archive targets, or destination media and systems, include tape and optical. Optical is still around, although its use is declining. Tape remains relevant on-site as well as off-site and within cloud services. Disk-based archives have increased in popularity, including block, file and object access. Object access includes AWS S3, CDMI, DICOM, HL7 and REST, among other API and programmatic bindings.
Some examples of solutions, spanning disk, tape and cloud, include Amazon Web Services (AWS) Glacier, Amplidata, Caringo, DDN, Dell, EMC, HDS, HP, IBM, NetApp, Oracle, Quantum and Spectra among others. Software tools include those from DataDynamics, EMC, HP, IBM, SGI (FileTek and DMF) and Symantec among many others.
• Speaking of applications and software tools, look into how various archiving solutions plug into your applications, from database to email, along with SharePoint, among others. Expand your focus to include insight tools, also known as storage resource management (SRM), to learn what you have and how it is being used as part of a broad discovery process (not to be confused with eDiscovery or legal search).
There are simple operating system or application tools, as well as free or low-cost solutions, that allow you to gain basic insight. In some cases these basic tools can also help justify the need for acquiring more extensive and expensive tools to dig further and help identify what to archive and when.
• In addition to saving your data, how about saving the software or applications along with their settings? Even today, data written to tape a decade or more ago can be restored and placed onto disk, yet current software or applications may not be able to read it. So save the applications as well; a relatively easy approach is to virtualize a system with the software, then save the virtual machine.
• Determine how, when, where, why and with what, if applicable, cloud services will be used in your environment. This also means identifying your concerns and then assessing whether they can be addressed, can be worked around, or are a barrier to cloud archiving.
• Investigate how virtual machines (VMs) or virtual servers can be used as a means to capture, protect and preserve the context of the application, including software, settings and data. This means that, in addition to saving data, you also save VMs such as VMware VMDKs or OVA/OVFs for later restoration of the system, application software, settings and data. Granted, there may be separate archives for different periods of time-based data.
• Find an opportunity to start archiving that can produce a success to build upon for a bigger, broader initiative (the next phase). There is a temptation to try to do everything (i.e., boil the ocean); however, walk before you run, and take the time to build success and experience, not to mention the support of management and others.
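As an illustration of the simple, free insight tools mentioned above, here is a minimal Python sketch that walks a directory tree and totals file sizes by how long ago each file was last modified. It is only a starting point under stated assumptions: the function name summarize_by_age and the 30/180/365-day buckets are hypothetical choices for this example, and last-modified time is used as a rough proxy for activity (it says nothing about how often a file is read).

```python
import os
import time
from collections import defaultdict


def summarize_by_age(root, now=None, buckets_days=(30, 180, 365)):
    """Total file sizes under `root`, grouped by days since last modification.

    Returns a dict mapping a label such as "<= 30 days" or "> 365 days"
    to the total number of bytes in that age bucket.
    The bucket limits are illustrative assumptions, not recommendations.
    """
    now = time.time() if now is None else now
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - st.st_mtime) / 86400
            # Pick the first bucket whose limit covers this file's age.
            for limit in buckets_days:
                if age_days <= limit:
                    label = f"<= {limit} days"
                    break
            else:
                label = f"> {buckets_days[-1]} days"
            totals[label] += st.st_size
    return dict(totals)
```

Calling summarize_by_age on a file share path and printing the result gives a rough view of how much capacity holds data that has not changed in over a year, which may be enough to make the case for a first archiving project or for a more capable SRM tool.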
Here is a key point: while archiving is a powerful and effective technique and should be part of your data footprint reduction strategy, it does take time and effort to implement. However, that time and effort can lead to big savings that also make your other downstream data protection and reduction efforts more effective. Not all data can be archived, yet most can. The question is when, how, with what and where to implement a strategy that goes beyond a regulatory compliance focus. There is another benefit as well: reducing the footprint of your active data can also make it smaller and more portable in case you need to travel to the cloud, or elsewhere.