By Dave Simpson
Large enterprises typically have solid disaster-recovery (DR) and business continuity (BC) plans in place (although they generally don’t test them frequently enough). However, that is not always the case with small to medium-sized businesses (SMBs), which often don’t have adequate budgets for BC/DR.
For this report, we gathered BC/DR advice and tips for SMBs from a range of vendors. One of the most common pieces of advice is simple enough: Test your DR plans.
“SMBs should test and practice their disaster-recovery plans regularly to strengthen their skills, determine more efficient logistics, and work out kinks in the system,” says Mike Inkrott, Symantec’s senior product manager for Backup Exec. “SMBs should also test the backup itself [i.e., recover data] to ensure that critical data is available. If SMBs neglect to back up their data, their disaster recovery plan is useless. By simply practicing the plan, assessing the critical data that needs to be backed up and testing the backup system, SMBs can be confident that the plan will work should they need to use it during an actual disaster.”
“If I had to give one piece of advice to SMBs on BC and DR, it would be to practice what might take place if a disaster were to occur,” says Ellen Rome, vice president of sales and marketing at STORServer. “Too often, SMBs wait until the disaster takes place and then find they are not at all prepared with a fully laid-out plan and a practiced approach to data recovery.” Rome also advises IT organizations to determine which servers, applications, and storage resources are most critical, and to determine the order in which they need to be recovered.
In a business continuity survey sponsored by Stratus Technologies, only 45% of the respondents with BC plans tested them more than once a year, 35% tested yearly, and 20% never tested their BC plans.
Recommendations on how often companies should test their DR/BC plans vary widely but, generally, vendors and analysts recommend testing at least quarterly.
In terms of actual BC/DR implementation, perhaps the best news for SMBs is that they are no longer restricted to the expensive “vendor lock-in” solutions that characterized BC/DR options in the past. Not surprisingly, hardware-independent vendors stress low-cost alternatives for budget-strapped SMBs.
“DR solutions should be open and flexible, easily fitting into an organization’s existing IT infrastructure, and should minimize risk, implementation time, and cost,” says Fadi Albatal, director of marketing at FalconStor Software. Albatal advocates hardware independence, virtualization (with an emphasis on heterogeneous array support), and resource consolidation. He also notes that IT organizations are no longer required to have the same types of storage systems at the primary and secondary (DR) sites; users can keep expensive, high-performance systems at their primary site, but deploy less expensive arrays at their remote sites.
Although most IT managers view virtualization primarily as a way to lower costs, it can also be used as a basis for a disaster-recovery program. “If you virtualize your systems and storage, your primary and backup data centers can run disparate hardware, with the virtualization layer hiding the differences,” says Barry Phillips, group vice president and general manager in Citrix Systems’ advanced solutions group. “Through the use of clustered computing, load balancing, replication, and remote access technologies, your downtime can be brought to zero and your data loss minimized.”
FalconStor’s Albatal stresses the importance of BC/DR technologies that provide full integration of the physical and virtual environments to enable DR process automation.
“The ability for virtual machines to move between servers in the event of a failure greatly simplifies and lowers the cost of application high availability and business continuance,” says Chris McCall, director of product marketing at LeftHand Networks. “Combine this with storage systems that present a single volume in multiple sites, so that when a failure occurs and virtual machines migrate over, they remain connected to their volumes. Applications and storage remain online, with no data loss or manual intervention.”
Software-as-a-Service (SaaS) is another cost-saving approach that might appeal to SMBs. “Managed and hosted SaaS solutions are a cost-effective alternative with limited up-front investment and IT management responsibilities,” says Frank Jablonski, senior director of product marketing at CA. For in-house BC/DR implementations, Jablonski advocates evaluating virtualization technology, which can lower costs for recovery management. He also advises a multi-level approach to BC/DR, which saves money by matching the level of protection applied to each class of data to that data’s value to the business.
Similarly, remote on-demand data-protection services can help defray BC/DR costs. According to Brian Reagan, director of strategy in IBM’s BCRS division, remote services eliminate the need for capital expenditures and can save 20% to 60% versus in-house BC/DR implementations, in part because hosted services are typically based on a “pay-as-you-use” subscription model. In addition, subscribers can more easily define and execute specific time-based data retention policies that match their business requirements.
Before actually embarking on a BC/DR implementation, SMBs should perform a business risk and business impact analysis, according to Kyle Fitze, director of marketing for SAN products in HP’s StorageWorks division. Business impact can be measured in both direct (such as lost revenue) and indirect (e.g., productivity impact) dimensions. The metrics should also be measured in quantitative (revenue, costs) and qualitative (customer satisfaction, brand reputation) dimensions, according to Fitze. With this data in hand, SMBs can decide how much downtime they can tolerate for a given application or system, and how much data loss is acceptable, which in turn will determine the best technologies to use for the BC/DR infrastructure.
Edgar Jimenez, director of managed services for the EVault data-protection business unit of i365 (a Seagate company), recommends the following actions during the business impact analysis (BIA) phase:
- Define critical success factors that will support and enable the BIA;
- Establish application restoration priorities (e.g., critical vs. non-critical apps);
- Identify tasks required to resume 100% normal operation (also known as business resumption); and
- Define data recovery and backup management procedures.
After identifying risk factors (e.g., natural disasters, system failures, legal/regulatory action) and business impact (e.g., loss of productivity, revenues, or legal liabilities), users should identify the criticality of various applications.
“Conduct a session with all business managers where business applications are charted on a matrix with ‘acceptable data loss’ on the Y-axis and ‘acceptable interruption’ on the X-axis,” suggests Bruce Caswell, director of marketing communications at Xiotech. “Both axes are divided into three layers, labeled ‘minutes,’ ‘hours,’ and ‘days.’ Each application is mapped into the appropriate section of the matrix, with discussion of the consequences involved for each application.” Caswell adds that the current level of protection from existing systems can be mapped on the same matrix to expose gaps.
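The matrix exercise Caswell describes can also be kept in a small script once the session is over. The following Python sketch is only an illustration: the application names, the tolerances, and the band cutoffs (under an hour counts as “minutes,” under a day as “hours”) are assumptions made for the example, not figures from Xiotech.

```python
from collections import defaultdict

# The three layers on each axis of the matrix, ordered from strictest to loosest.
BANDS = ("minutes", "hours", "days")


def band(minutes):
    """Classify a tolerance (given in minutes) into one of the three layers.

    The cutoffs are an assumption for this sketch: under 1 hour is "minutes",
    under 24 hours is "hours", anything longer is "days".
    """
    if minutes < 60:
        return "minutes"
    if minutes < 24 * 60:
        return "hours"
    return "days"


def build_matrix(apps):
    """Map (data-loss band, interruption band) to the list of apps in that cell."""
    matrix = defaultdict(list)
    for name, (loss_min, outage_min) in apps.items():
        matrix[(band(loss_min), band(outage_min))].append(name)
    return dict(matrix)


def find_gaps(required, current):
    """Return apps whose current protection falls in a looser band than required."""
    gaps = []
    for name, (req_loss, req_outage) in required.items():
        cur_loss, cur_outage = current[name]
        loss_gap = BANDS.index(band(cur_loss)) > BANDS.index(band(req_loss))
        outage_gap = BANDS.index(band(cur_outage)) > BANDS.index(band(req_outage))
        if loss_gap or outage_gap:
            gaps.append(name)
    return sorted(gaps)


# Hypothetical inputs: (acceptable data loss, acceptable interruption) in minutes.
required = {
    "email": (15, 30),
    "order_entry": (5, 120),
    "file_archive": (2880, 4320),
}
# Hypothetical current protection, in the same units.
current = {
    "email": (1440, 240),        # e.g. nightly backup, restore takes about 4 hours
    "order_entry": (30, 90),
    "file_archive": (1440, 2880),
}

matrix = build_matrix(required)
gaps = find_gaps(required, current)
print(matrix)
print("Gaps:", gaps)
```

With these made-up inputs, only e-mail is flagged: the business wants it in the “minutes” layer on both axes, but a nightly backup puts its data loss in the “days” layer.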
Most vendors agree that assessing the criticality, or business value, of data and applications is a crucial step in organizing a BC/DR strategy. “Profile your applications and rank them in terms of value to the business, and assess the impact to your business if there were any downtime,” advises Jonathan Buckley, vice president of outbound marketing at Asempra. “And do similar exercises for data: What data is critical, and what data is not?”
Buckley adds, “We recommend bifurcating your BC/DR technologies depending on your total data requirements and your RPO [recovery point objective] and RTO [recovery time objective] requirements. The servers holding second- and third-tier data can take hours, or in some instances days, to recover without financial impact.”
Most BC/DR implementations today use disk-based technologies to some degree. Steve Whitner, product manager at Quantum, says that SMBs should consider disk-based backup with data de-duplication to minimize costs. However, Whitner notes that, at least for long-term retention of data, tape should also be factored into SMBs’ BC/DR equation because of its lower cost per gigabyte, as well as power and cooling advantages.
“Technologies such as backup-to-disk, de-duplication, and replication enable SMBs to tailor their BC and DR strategies to fit their specific requirements,” says Andrew Wenger, director of the SME segment at CommVault. “By backing up to disk, SMBs can replicate their data, move it to their main data center at headquarters, and implement a DR plan from there, making it easier to manage on an ongoing basis.”
Five steps to an effective DR plan for SMBs
By John Ferraro
There are five key steps that SMBs can follow when developing and implementing a disaster-recovery plan:
Performing a business impact analysis involves prioritizing and assessing the impact of events that threaten data loss, such as software or hardware failure, power disruptions, computer shutdowns due to worms or viruses, and natural or manmade disasters. A business impact analysis will document the impact of these events on critical applications, systems, and business operations. Determine which applications must be available when disaster strikes. For many companies, the “killer app” will be e-mail so that employees can maintain communication.
The next step is to identify the specific servers that need to be protected, and determine the extent to which data is changing for each critical application. This will help determine your RPO, which is the amount of data loss tolerable in the event of a disaster. Although the ideal RPO is zero, most organizations can tolerate losing 15 minutes to two hours of transactions without a major impact to their business. Then use automated tools such as modelers to determine the amount of bandwidth needed to perform the backup to remote servers.
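As a rough back-of-the-envelope check before running a modeling tool, the link between change rate, RPO, and bandwidth can be worked out by hand. The Python sketch below assumes that data changes evenly over the day and ignores compression, de-duplication, and protocol overhead; the function names and the sample figures (36GB of changed data per day, an eight-hour transfer window) are hypothetical.

```python
def required_mbps(changed_gb, window_hours):
    """Average link speed, in megabits per second, needed to ship
    changed_gb (decimal gigabytes) to the remote site within window_hours."""
    bits = changed_gb * 8 * 1000**3      # gigabytes -> bits
    seconds = window_hours * 3600
    return bits / seconds / 1_000_000    # bits per second -> megabits per second


def data_at_risk_gb(changed_gb_per_day, rpo_minutes):
    """Amount of changed data that falls inside one RPO interval,
    assuming changes are spread evenly across 24 hours."""
    return changed_gb_per_day * rpo_minutes / (24 * 60)


# Hypothetical example: 36 GB of changed data per day.
mbps = required_mbps(36, 8)            # ship a day's changes in an 8-hour window
at_risk = data_at_risk_gb(36, 15)      # data exposed with a 15-minute RPO
print(f"Sustained bandwidth needed: {mbps} Mbps")
print(f"Data at risk per RPO interval: {at_risk} GB")
```

Real change rates are bursty, so a measured profile from a modeling tool will usually give a different, and more trustworthy, answer than this average.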
To choose a solution that provides the ideal balance between recovery speed and cost, you should determine how much data you’re willing to lose, and how long you’re willing to have data unavailable. The second key metric related to downtime tolerance, after RPO, is RTO, which is the time it takes for your business to come online and function normally when recovering from a disaster. Once again, the ideal RTO is zero, but an organization may be able to afford to be down for a few minutes to a few days without suffering undue harm.
To test your disaster-recovery plan, use tools that accurately evaluate the scope of the plan, the usability of data and applications to be recovered, and the actual capacity and bandwidth required to meet your RPOs. Evaluate DR software that offloads the burden from production servers and runs in a heterogeneous environment so you will not be locked into a single vendor.
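One simple way to frame that balance is to list the candidate approaches with the RPO and RTO each can deliver, then pick the least expensive one that meets both targets. The Python sketch below does only that; the option names, recovery figures, and relative costs are placeholders invented for illustration, not vendor data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    name: str
    rpo_minutes: float   # worst-case data loss the approach can deliver
    rto_minutes: float   # worst-case time to bring the application back
    annual_cost: float   # relative cost units, placeholder values only


def cheapest_meeting(options: Sequence[Option],
                     max_rpo: float,
                     max_rto: float) -> Optional[Option]:
    """Return the lowest-cost option that satisfies both targets, or None."""
    fits = [o for o in options
            if o.rpo_minutes <= max_rpo and o.rto_minutes <= max_rto]
    return min(fits, key=lambda o: o.annual_cost) if fits else None


# Hypothetical candidates; every number here is an assumption.
options = [
    Option("nightly backup", rpo_minutes=1440, rto_minutes=2880, annual_cost=1),
    Option("hourly replication", rpo_minutes=60, rto_minutes=240, annual_cost=3),
    Option("continuous replication", rpo_minutes=0, rto_minutes=5, annual_cost=10),
]

choice = cheapest_meeting(options, max_rpo=120, max_rto=480)
print(choice.name if choice else "No option meets the targets")
```

If no option fits, the function returns None, which is a signal either to relax the targets for that application or to budget for a faster approach.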
Lastly, it’s important to remember that disaster recovery is part of the bigger picture of business continuity. Make sure your DR strategy includes both local and remote backup and recovery. Incorporate technology that can capture data as it changes and store it in a time-addressable format. This provides for fast local recovery to any point in time, elimination of backup windows, and simplified verification, compliance, and auditing.
John Ferraro is president and CEO of InMage Systems.
Five steps to effective BC planning
By Jim Olson
Many small to medium-sized businesses (SMBs) still have ineffective (or non-existent) business continuity plans. However, it is imperative that SMBs recognize and plan for potential risks. Here are five steps to consider when you are developing a BC plan:
Consider a wide range of possible scenarios. Downtime—whether it is a result of a natural disaster, power outage, hardware failure, or human error—affects your IT infrastructure and ultimately your productivity and bottom line. When developing a disaster-recovery (DR) strategy, consider various possible disaster scenarios and plan accordingly.
Understand your time and data requirements. Understand how much time your company can afford to be down (the recovery time objective, or RTO) as well as how much data you can afford to lose (recovery point objective, or RPO). Armed with this knowledge, you can monetize your risk by balancing recovery requirements, risk tolerance, and how much you are willing to spend on a DR strategy.
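To put a rough dollar figure on that balance, one simple model multiplies the hours of downtime and the hours of lost work by what each costs the business. The Python sketch below is one possible way to do the arithmetic, not a standard formula; the cost-per-hour inputs, the incident frequency, and the function names are assumptions for illustration.

```python
def exposure_per_incident(rto_hours, rpo_hours,
                          downtime_cost_per_hour, rework_cost_per_hour):
    """Estimated cost of one incident: time spent down plus the cost of
    re-creating the work lost since the last recovery point."""
    return rto_hours * downtime_cost_per_hour + rpo_hours * rework_cost_per_hour


def expected_annual_exposure(incidents_per_year, per_incident):
    """Expected yearly cost, to compare against annual DR spending."""
    return incidents_per_year * per_incident


# Hypothetical inputs: 4-hour RTO, 1-hour RPO, $2,000 per hour of downtime,
# $500 per hour of lost work to re-create, one incident every two years.
per_incident = exposure_per_incident(4, 1, 2000, 500)
annual = expected_annual_exposure(0.5, per_incident)
print(f"Per incident: ${per_incident:,.0f}; expected per year: ${annual:,.0f}")
```

Comparing the expected annual figure with the yearly cost of a tighter RTO or RPO gives a starting point for deciding how much to spend.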
Keep your people, systems, and information connected. Your BC strategy should encompass information, systems, people, and processes, as well as the complex interdependencies among them. If your workforce cannot connect to systems and data, there is no business. As part of the planning process, also consider what the best strategy is for your company in managing critical systems and components—either internally, with a managed hosting provider, or a combination of both.
Investigate advanced technologies. DR technologies have changed considerably in recent years, with solutions common in large enterprises—such as replication, vaulting, and virtualization—now becoming more affordable for SMBs. With these technologies, it is now easier for SMBs to achieve greater precision in recovery timeframes and data points.
Plan and test. Planning involves much more than just backing up your data. Many companies think they have an effective DR plan, but unless it is tested, it is only a plan on paper. It is essential to develop and test your plan so the first time it is executed is not during an emergency. The Yankee Group recommends running a DR test every quarter.
When the unexpected occurs—from unplanned downtime to a major disaster—the details and unplanned variables are what always impede a fast recovery. It is important to test the way you recover, and recover the way you test. Practice and experience pay off in a time of crisis.
Jim Olson is vice president of inside sales and solutions engineering at SunGard Availability Services.