DPM tools have evolved from narrow reporting on backup jobs to expansive control of the data-protection environment.
By Christine Taylor
August 13, 2008 -- Managing an effective enterprise backup environment can be an extreme challenge, and data-protection management (DPM) tools are necessary to meet it. DPM software sits in a layer above the backup applications and storage devices, overlaying centralized process control on the backup-and-recovery infrastructure. From this vantage point it collects, integrates, and reports backup data, enabling IT to verify service levels and optimize the backup process across heterogeneous systems and multiple backup applications.
DPM software first hit the scene in 2005 with basic backup reporting. Even though these tools were limited in scope to a single backup application and domain, they delivered immediate value by providing visibility into backup-and-recovery failures and slowdowns. That visibility helped relieve acute operational backup problems and made DPM one of the fastest-growing segments of the multi-billion-dollar data-protection market.
Fast-forward several years. Data growth continues unabated, and IT is taxed to match this huge volume of data to the business applications it should support. Data-protection technologies have grown right along with the data, and many enterprises have adopted virtual tape libraries (VTLs), multiple backup systems, native array-based protection, and replication that operates across systems and hosts. The consequences include backups that run longer than their allotted windows, a non-optimized storage infrastructure, and outright backup-and-recovery failures.
This is where the next evolution of DPM comes in. DPM 2.0 offers the ability to proactively optimize data-protection total cost of ownership, to set and maintain service-level-driven protection policies, and to provide comprehensive protection management matched to business value. First, consider the challenges that are driving this evolution:
Virtual tape libraries: VTLs are growing in popularity as a way to present a tape-based interface to the backup application with the speed of disk on the back end. But this very popularity has led to an urgent problem: an unmanageable sprawl of VTLs throughout the enterprise, which complicates data protection and makes it difficult for backup administrators to manage VTL-based backup processes. The consequences are serious, including narrow backup bandwidth, increased IT overhead, poor manageability, and multiple points of failure.
Numerous backup systems: Today’s enterprise is characterized by multiple backup systems. For example, there might be seven regional centers and 100 or more remote offices/branch offices (ROBOs) in the distributed enterprise, with no two environments running identical backup systems. Even within a single building or campus, workgroups and departments commonly have their own backup systems, while the data center alone might host dozens of them.
Array-based protection mechanisms: Some storage arrays ship with extensive native data-protection capabilities. Unfortunately, there is little way to manage that intelligence within the context of the entire data center. Array-based backup and replication may report operations to the administrator, but there is no centralized view or way to centrally manage backup across the data center. Instead, administrators must review reports from multiple backup systems and arrays.
Cross-system and cross-host replication: An even thornier problem is replication that occurs across domains. This type of reporting is well out of the realm of DPM 1.0 tools, which were suited to reporting traditional backup application operations within single domains. Yet replication is often reserved for critical data that must be kept immediately available, and IT has little way of providing a matching level of visibility and control.
System complexity: Even single backup systems can be highly complex, with multiple hosts, storage targets, connections, and data-protection technologies such as continuous data protection (CDP), snapshots, or replication. When an inefficiency or failure point is introduced into just one element, the entire system can slow down or fail. This plays havoc with service levels and with IT resources, as it can be very difficult to pinpoint the cause of the service interruption in a large and complex backup environment. And even when IT locates and fixes the problem, there is no systematic method to lessen risk.
To meet these new challenges, DPM must evolve into a next-generation toolset offering new levels of cross-domain visibility, granular control, and predictive analysis. That evolution has begun.
Enter DPM 2.0
The purpose of DPM 2.0 is to alleviate the operational issues associated with the backup process. DPM 2.0 has not lost its backup reporting roots; providing visibility into the backup process remains a fundamental capability of all DPM products. But reporting is no longer enough. DPM 2.0 adds wider and deeper visibility, along with strong levels of control. Predictive analysis, cross-domain functionality, and data protection beyond backup all provide customized protection based on service levels and application priority.
In its most basic configuration, DPM backup reporting automatically collects operational data from backup application databases and logs or, more rarely, through server-based agents. The data is stored in databases such as Oracle or MySQL for historical reporting and presentation interfaces. Views and reports enable IT to assess data-protection levels and to mitigate poor performance or outright failure.
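As a rough illustration of this collect-store-report loop (the log format, field names, and applications below are invented for the example, not any vendor's actual schema), the basic model might look like:

```python
# Illustrative sketch of the basic DPM reporting model: collect job records
# from backup application logs, store them in a database for historical
# reporting, and report on failures. All names and records are hypothetical.
import sqlite3

RAW_LOG = """\
2008-08-12 22:00,netbackup,host-db01,full,success
2008-08-12 22:05,netbackup,host-web02,incremental,failed
2008-08-12 23:10,tsm,host-mail01,full,success
2008-08-13 01:30,tsm,host-db01,incremental,failed
"""

db = sqlite3.connect(":memory:")  # a production tool would use Oracle or MySQL
db.execute("""CREATE TABLE jobs
              (started TEXT, app TEXT, host TEXT, kind TEXT, status TEXT)""")
rows = [line.split(",") for line in RAW_LOG.strip().splitlines()]
db.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?, ?)", rows)

# Historical report: failure counts per backup application.
report = dict(db.execute(
    "SELECT app, COUNT(*) FROM jobs WHERE status = 'failed' GROUP BY app"))
print(report)
```

A real DPM product adds scheduled collection, per-domain agents, and a presentation layer on top of this loop, but the core flow is the same.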
DPM 2.0 includes this basic model but improves on it by cross-correlating information from multiple backup applications and systems and by enabling predictive analysis. These features deliver better performance across an entire storage infrastructure and enable backup operations to fulfill different service levels. The ability to maintain and verify service levels not only benefits business processes, but also helps meet regulatory requirements. We will take a closer look at three primary DPM 2.0 capabilities: cross-domain insight, cross-domain correlation, and predictive analysis.
Cross-domain insight enables transparent discovery of devices, operations, and workflows across the technology stack, and provides centralized views and control of heterogeneous storage environments. This capability rescues IT from trying to manage data-protection domains in discrete silos. Instead, DPM correlates discovery and presents it as a workflow-centric and holistic view. Analysis and trending functions allow IT to act on this intelligence to match service levels to applications, to lessen risk throughout the storage infrastructure, and to optimize tape-based archives and VTLs along with disk-based storage.
Reporting on real-time storage capacity across multiple storage systems -- whether disk, tape, or VTL -- gives IT an accurate picture of the capacity of the entire backup-and-recovery environment. The real-time nature of cross-domain insight allows IT to quickly assess storage capacity and forecast usage.
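The cross-domain capacity view can be sketched as a simple roll-up across heterogeneous targets; the device names and capacity figures below are invented for illustration:

```python
# Minimal sketch of cross-domain capacity insight: roll up used and total
# capacity from heterogeneous backup targets (disk, VTL, tape) into one view.
# Device names and numbers are hypothetical.
from collections import defaultdict

targets = [
    {"name": "array-01", "type": "disk", "used_tb": 18.0, "total_tb": 24.0},
    {"name": "vtl-01",   "type": "vtl",  "used_tb": 30.0, "total_tb": 40.0},
    {"name": "lib-01",   "type": "tape", "used_tb": 55.0, "total_tb": 80.0},
]

# Subtotals per target type, plus an environment-wide total.
by_type = defaultdict(lambda: {"used_tb": 0.0, "total_tb": 0.0})
for t in targets:
    by_type[t["type"]]["used_tb"] += t["used_tb"]
    by_type[t["type"]]["total_tb"] += t["total_tb"]

total_used = sum(t["used_tb"] for t in targets)
total_cap = sum(t["total_tb"] for t in targets)
print(f"environment: {total_used}/{total_cap} TB "
      f"({100 * total_used / total_cap:.0f}% used)")
```

In a real deployment the per-target figures would come from the discovery layer rather than a static list, but the aggregation itself is this simple.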
If the latter point appears to skirt the boundaries of storage resource management (SRM), that is because it does. DPM and SRM are by no means the same thing: SRM is primarily a policy enforcer for disk-based performance and capacity management, while DPM provides strong levels of visibility and control across the entire data-protection domain. Certainly DPM 2.0 can enforce policies, controlling activities such as event completions, maintaining utilization levels, and demonstrating compliance with recoverability objectives. But unlike SRM, DPM can be a much broader application that centrally manages data protection across the entire storage infrastructure. DPM 2.0 can embrace the entire data-protection technology stack, including tape and disk, as well as both block and file. DPM 2.0 cross-domain insight enables IT to closely monitor secondary data copy operations whether they occur on disk or tape, and across multiple methods, such as backup, snapshots, or CDP.
Cross-domain insight enables IT to understand whether a backup has succeeded or a data set is recoverable -- and if there is an endemic problem, to clarify the solution. It also exposes risks to data so they can be mitigated at the outset, before real damage is done. This level of insight into the entire backup-and-recovery environment dramatically lowers the business's risk exposure.
Storage complexity is huge at the enterprise level, and maintaining visibility into resource usage and optimization is exceptionally challenging. Manually correlating and analyzing resource interactions is no longer feasible on an ongoing basis.
DPM 2.0 offers a solution in the form of cross-domain correlation: the ability to correlate and control relationships and workflow between multiple domains in the data center. Cross-domain correlation federates information on physical and virtual resources and their relationships in heterogeneous storage infrastructures. This provides the foundation for presenting data from multiple sources and layers in a centralized view, and correlating discrete pieces of information into a unified reporting structure.
The scope of cross-domain correlation includes not only traditional servers and storage, but also related network connections, applications, and data-protection services. Cross-domain correlation visualizes the relationships between physical and virtual resources and connections, enabling IT to resolve misaligned, conflicting, or over-capacity resources throughout the application data path.
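One way to picture this correlation is joining per-domain records into end-to-end workflow paths and flagging shared resources. The sketch below is a toy model under that assumption; every application, host, backup system, and target name is hypothetical:

```python
# Hedged sketch of cross-domain correlation: join per-domain inventories into
# workflow paths (application -> host -> backup app -> storage target) and
# flag targets that multiple workflows converge on. All names are invented.
from collections import Counter

apps   = {"erp": "host-db01", "mail": "host-mail01", "crm": "host-db01"}
backup = {"host-db01": "netbackup", "host-mail01": "tsm"}
target = {"netbackup": "vtl-01", "tsm": "lib-01"}

# Correlate the three domains into unified end-to-end paths.
paths = [(a, h, backup[h], target[backup[h]]) for a, h in apps.items()]

# Count workflows converging on each storage target; an over-shared target
# is a candidate narrow-bandwidth link and single point of failure.
load = Counter(p[3] for p in paths)
hot = [t for t, n in load.items() if n > 1]
print(hot)
```

The value of real cross-domain correlation is doing this federation continuously, across live discovery data rather than static tables.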
As opposed to device-centric reporting mechanisms, DPM 2.0 tools focus on system-wide data-protection workflows. This level of integration enables IT to visualize and fine-tune resource interaction, supports real-time insights, and grants the ability to do current and future predictive analysis across the technology stack.
Cross-domain correlation then goes beyond reporting by adding the basis for predictive analysis across the entire data-protection environment, from host to business applications to backup applications to network to storage -- and back again for recovery purposes. It is crucial to identifying how servers, applications, network resources, and storage are interacting with the backup applications. Without this level of command and control, it is impossible to analyze and mediate backup-and-recovery operations across domains and technology stacks.
Predictive analysis delivers two major capabilities: forecasting and service-level protection. Forecasting enables IT to accurately project future storage requirements from current usage levels and the rate of data growth. The predictive analysis engine collects information from across domains, correlates it, and turns it into actionable intelligence.
The best of the DPM predictive analysis engines are highly flexible. This allows them to work on individual devices or components for specific troubleshooting purposes, but also to optimize entire systems by expanding across multiple events and systems. This level of predictive analysis grants IT the level of insight required to optimize the full backup-and-recovery environment to protect service levels and lower risk across applications and infrastructure.
DPM 2.0’s predictive analysis can detect both problems and inefficiencies in the storage infrastructure. It goes far beyond the reporting function by alerting on both actual and predicted issues. This capability is invaluable in reducing risk and avoiding the costs of downtime or missed service levels.
This level of predictive analysis automatically alerts administrators to performance problems, service-level risks and violations, non-optimized backup environments, and poorly realized recovery-time objectives. Some predictive analysis engines sense possible problems before they become serious, giving IT the chance to correct issues before they impact the business. Because potential problems are identified early, administrators can intervene before service levels are affected -- without maintaining a large staff for manual root-cause analysis and repetitive system checks.
This level of predictive analysis is a key trend in data protection and represents a critical shift away from traditional historical analysis tools: predictive foresight is far superior to after-the-fact historical analysis.
DPM is changing and evolving. Data collection and reporting remain a DPM foundation, but DPM 2.0 has moved beyond basic reporting into cross-domain insight, cross-domain correlation, and predictive analysis. Whatever the specific implementation, DPM 2.0 tools allow IT to manage each backup process so that it is specifically aligned to the business service it protects. The ability to deliver cross-domain insight, cross-domain correlation, and predictive analysis is the hallmark of this level of DPM.
Christine Taylor is a research analyst with the Taneja Group consulting firm.
Representative DPM vendors
(Note: Not all of these vendors’ products qualify as DPM 2.0.)
Akorri, Aptare, Bocada, EMC (WysDM), Microsoft, ServerGraph, Symantec, Tek-Tools