Making the case for software-based replication

An ROI analysis of tape backup/restore, hardware-based mirroring, and software-based replication reveals cost/benefit tradeoffs.

By Jason Buffington

"What is your data worth, and how much will it cost to be without it?" That's the fundamental question behind choosing the most appropriate data-protection solution.

This article describes two metrics (recovery point objective and recovery time objective) and compares three data-protection technologies: tape backup/restore, hardware-based mirroring, and software-based replication. The two metrics can be used to better understand the pros and cons of each technology.

In a tape backup scenario, assume that a company does a full backup every weekend and some form of incremental/differential backup every weeknight. A full backup was completed over the weekend, and partial backups were done on Monday and Tuesday evenings. A server fails at 4 PM on Wednesday.

The recovery point objective (RPO) measures how current the data is once a restore operation completes; in other words, how much recent work is lost. Regardless of how long it takes to restore from tape, the data will be in the state captured by Tuesday night's backup. That's the optimistic view. You should also consider the ramifications of a partial failure in the backup process. Allowing for the possibility that a tape might be corrupt, the data might be as it was on Monday night (if the Tuesday tape was bad) or as it was on the previous Thursday (if the weekend tapes were bad). So, RPO for tape-based backup/restore is measured in days of lost data.
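
The data-loss window in this scenario can be worked out with simple date arithmetic. The sketch below is illustrative only: the function name, the calendar dates, and the 11 PM start time for the nightly jobs are assumptions, not details from the scenario. It also ignores the fact that a partial backup usually depends on the full backup before it.

```python
from datetime import datetime, timedelta


def data_loss_window(failure: datetime,
                     backups: list[datetime],
                     bad: frozenset[datetime] = frozenset()) -> timedelta:
    """Return the time between the newest usable backup and the failure."""
    # Only backups taken before the failure and not known to be corrupt count.
    usable = [b for b in backups if b <= failure and b not in bad]
    if not usable:
        raise ValueError("no usable backup before the failure")
    return failure - max(usable)


# Hypothetical timestamps: nightly jobs are assumed to run at 11 PM.
monday = datetime(2003, 6, 2, 23, 0)
tuesday = datetime(2003, 6, 3, 23, 0)
failure = datetime(2003, 6, 4, 16, 0)  # Wednesday, 4 PM

# Optimistic case: Tuesday night's tape is good.
print(data_loss_window(failure, [monday, tuesday]))
# Pessimistic case: Tuesday's tape is corrupt, so Monday's is the newest usable.
print(data_loss_window(failure, [monday, tuesday], bad=frozenset({tuesday})))
```

Every bad tape pushes the recovery point back by at least another day, which is why the RPO for tape is counted in days.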

The recovery time objective (RTO) is the amount of time between a failure and when operations can resume. In the example above, the RTO of tape backup is the amount of time necessary to locate the tapes, mount the tape set, and restore the data. That could take hours, or more, depending on the amount of data.

A potential caveat is the use of off-site courier services. What if the tape with the most current data is on a truck or in a vault? In that case, the RTO also includes the time to request, ship, and receive the tape (plus the time to mount and restore). If the tape is not requested by the end of the workday, that can add an entire day to the recovery time: the tape might arrive Thursday afternoon, with the restore running that evening.
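
To see how much the courier delay matters, the recovery time can be computed as the gap between the failure and the moment the restore finishes. This is a rough sketch: the function name, the dates, and every duration below are hypothetical placeholders, chosen only to contrast an on-site tape with one that arrives the next afternoon.

```python
from datetime import datetime, timedelta


def recovery_time(failure: datetime,
                  tape_on_hand: datetime,
                  mount: timedelta,
                  restore: timedelta) -> timedelta:
    """Return the time from the failure until the restore has finished."""
    if tape_on_hand < failure:
        raise ValueError("tape cannot be requested before the failure")
    resumed = tape_on_hand + mount + restore
    return resumed - failure


failure = datetime(2003, 6, 4, 16, 0)   # Wednesday, 4 PM
mount = timedelta(minutes=30)           # assumed time to mount the tape set
restore = timedelta(hours=5)            # assumed time to restore the data

# Tape is on site and located within an hour (assumed).
on_site = recovery_time(failure, failure + timedelta(hours=1), mount, restore)

# Tape is off site and arrives Thursday at 2 PM (assumed).
off_site = recovery_time(failure, datetime(2003, 6, 5, 14, 0), mount, restore)

print(on_site, off_site)
```

With these placeholder values, the mount and restore steps are identical in both cases; the entire difference comes from waiting for the tape.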

You also have to factor in how frequently failures will occur. If the above scenario were to impact a particular department three times per year, then you calculate the business impact as:

[Formula: BusinessImpact for the tape backup/restore scenario; the original image is not available.]
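
One plausible way to put a number on that impact is sketched below. It is an assumption about how such a calculation could be structured, not the article's own formula: the class, field names, and dollar figures are all hypothetical, and only the figure of three incidents per year comes from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class OutageProfile:
    incidents_per_year: float      # how often the failure is expected
    downtime_hours: float          # hours until operations resume (RTO)
    lost_data_hours: float         # hours of work to be redone (RPO)
    downtime_cost_per_hour: float  # cost of idle staff, lost revenue, etc.
    rework_cost_per_hour: float    # cost of recreating the lost data


def annual_business_impact(p: OutageProfile) -> float:
    """Return the yearly cost of outages under this simple linear model."""
    per_incident = (p.downtime_hours * p.downtime_cost_per_hour
                    + p.lost_data_hours * p.rework_cost_per_hour)
    return p.incidents_per_year * per_incident


# Placeholder figures for illustration only.
tape = OutageProfile(incidents_per_year=3,
                     downtime_hours=8,
                     lost_data_hours=17,
                     downtime_cost_per_hour=1000.0,
                     rework_cost_per_hour=500.0)
print(annual_business_impact(tape))
```

The model keeps the RTO and RPO contributions separate, so a technology that improves only one of the two metrics can be evaluated on its own terms.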


Next, consider how much a tape backup/recovery solution costs. This includes tape hardware, backup software, annual maintenance on the software, tapes and cleaning cartridges, and labor cost (which is probably the most expensive component).

Labor costs cover the personnel who deploy the software, maintain backup jobs, and rotate tapes. Off-site courier fees (if applicable) add to the total.

To determine ROI, divide the BusinessImpact by the CostOfSolution.
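
As a minimal sketch of that division, the snippet below totals the cost components listed above and computes the ratio. BusinessImpact and CostOfSolution are the article's terms; the function names, parameter names, and all dollar amounts are hypothetical.

```python
def cost_of_solution(hardware: float, software: float, maintenance: float,
                     media: float, labor: float, courier: float = 0.0) -> float:
    """Sum the yearly cost components of a data-protection solution."""
    return hardware + software + maintenance + media + labor + courier


def roi(business_impact: float, solution_cost: float) -> float:
    """Return BusinessImpact divided by CostOfSolution."""
    if solution_cost <= 0:
        raise ValueError("solution cost must be positive")
    return business_impact / solution_cost


# Placeholder figures for illustration only.
cost = cost_of_solution(hardware=8000.0, software=3000.0, maintenance=1000.0,
                        media=2000.0, labor=15000.0, courier=1000.0)
print(cost, roi(60000.0, cost))
```

A ratio above 1 means the business impact exceeds what the solution costs; a ratio below 1 means the solution costs more than the problem it addresses.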

Synchronous mirroring

One alternative to tape-based backup/restore is hardware-based synchronous mirroring. Assuming the servers are clustered, this approach is the epitome of high availability.

Hardware-based synchronous mirroring has an RPO of zero (due to mirrored disk arrays) and an RTO of seconds or minutes. However, this approach is very expensive because you need two (often proprietary) disk arrays, mirroring software (usually proprietary), redundant servers, clustering software, and trained personnel.

Going back to our formulas, hardware-based mirroring ensures almost no business impact.

[Formula: BusinessImpact for hardware-based mirroring; the original image is not available.]


However, the cost of this solution may be much higher than the cost of the lost data. In that case, it may actually be cheaper to absorb the outages than to pay for the solution.

Whether the "cost of the solution" outweighs the "cost of the problem" comes back to the question, "What is your data worth?" A telemarketing department that generates significant revenue can justify a much more fault-tolerant (and expensive) solution than, say, the shipping department: a two-day outage in the warehouse has a much smaller business impact than two days of lost sales. In each case, you need to understand the value of the data and the business impact of losing it, and compare that against the cost of potential solutions. The solution with the best ROI is the most appropriate.

Most environments are caught in the middle: they need a better solution than tape, but without the cost of synchronous mirroring. In other words, they need better RPO and RTO, but with a better ROI.

Asynchronous replication

Another alternative for data protection is software-based asynchronous replication, which in terms of cost and level of data protection falls somewhere between tape-based backup/restore and hardware-based synchronous mirroring. Asynchronous replication software provides many of the benefits of controller-based synchronous mirroring, but at a lower cost.

Replication software is installed on all production servers and target servers. The software captures byte-level changes to files as they occur. The process is "asynchronous" because changes are sent to the target as fast as the network and available bandwidth allow, rather than the source waiting for each write to reach the target. In the best-case scenario (where there is adequate bandwidth), the lag between source and target can be close to zero, or a matter of seconds or minutes. Although not as "guarantee-able" as synchronous mirroring, software replication can be adequate for most applications. Unlike hardware-based mirroring, however, replication software can be used with existing (and non-proprietary) hardware. The cost of the software in most cases is on par with that of backup software.
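
The idea of capturing byte-level changes and shipping them asynchronously can be illustrated with a toy model. This is not how any particular replication product is implemented: the class, its method names, and the in-memory buffers standing in for the source and target volumes are all assumptions made for the sketch.

```python
import queue
import threading


class AsyncReplicator:
    """Toy model: byte-level changes are queued and applied to a target later."""

    def __init__(self, size: int) -> None:
        self.source = bytearray(size)   # stands in for the production volume
        self.target = bytearray(size)   # stands in for the target server's copy
        self._pending = queue.Queue()   # changes not yet applied to the target
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write(self, offset: int, data: bytes) -> None:
        """Apply a change to the source and return without waiting for the target."""
        self.source[offset:offset + len(data)] = data
        self._pending.put((offset, data))

    def _drain(self) -> None:
        """Background worker: apply queued changes to the target in order."""
        while True:
            item = self._pending.get()
            if item is None:            # sentinel: stop the worker
                self._pending.task_done()
                break
            offset, data = item
            self.target[offset:offset + len(data)] = data
            self._pending.task_done()

    def flush(self) -> None:
        """Block until every queued change has reached the target."""
        self._pending.join()

    def close(self) -> None:
        self._pending.put(None)
        self._worker.join()


replicator = AsyncReplicator(size=16)
replicator.write(4, b"abcd")    # returns immediately; the target may lag
replicator.flush()              # after flushing, the target has caught up
print(replicator.target == replicator.source)
replicator.close()
```

Whatever is still sitting in the queue when the source fails is the data at risk, which is why the RPO of asynchronous replication depends on available bandwidth.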

In the case of replication software:

[Formula: BusinessImpact for software-based replication; the original image is not available.]


And unlike with most hardware-based mirroring solutions, you don't need identical production and target arrays, server clustering software, or special expertise.

The decision comes down to cost, the level of data protection required, and ROI. If the solution costs less than the business impact of going without it, then it is "good business" to deploy it.

Jason Buffington is director of business continuity at NSI Software, in Hoboken, NJ. He can be contacted at jbuffington@nsisoftware.com.

This article was originally published on June 01, 2003