Even though Symantec NetBackup moved all backup data over the SAN, it also generated a significant amount of overhead traffic over the LAN. As a result, LAN efficiency was a distinct gating factor for NetBackup.
The network efficiency of a backup operation can be represented by the average network throughput over the time taken by the backup process. In particular, the area under the graph of average throughput rate over backup process time gives a measure of the work performed, which equates to the volume of data transferred during the process.
Using this measure of network efficiency, CommVault Simpana moved 1.44X more data than Avamar over the LAN. More importantly, Symantec NetBackup’s overhead traffic alone equaled 0.53X of the total backup and overhead traffic sent by Avamar.
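As a rough illustration of this measure, the sketch below integrates sampled throughput over time with the trapezoidal rule to estimate the volume of data transferred. The function names and the sample values are hypothetical and are not taken from the tests described here.

```python
def data_transferred_mb(samples):
    """Estimate data volume (MB) as the area under a throughput curve.

    samples: list of (time_in_seconds, throughput_in_mb_per_second) pairs,
    ordered by time. Uses the trapezoidal rule between adjacent samples.
    """
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total += (t1 - t0) * (r0 + r1) / 2.0
    return total


def relative_traffic(samples_a, samples_b):
    """Ratio of data moved by process A to data moved by process B."""
    return data_transferred_mb(samples_a) / data_transferred_mb(samples_b)


# Hypothetical throughput samples: (seconds, MB/s)
example = [(0, 10.0), (60, 10.0), (120, 20.0)]
volume = data_transferred_mb(example)
```

Two backup packages can then be compared by the ratio of their areas, which is how a figure such as "1.44X more data" would be obtained from throughput traces.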
With CBT metadata established for the VM, 2% more data (3GB) was added to the Exchange mailbox database. A CBT-based backup, which will be the dominant form of backup for most sites, was then run with each backup solution.
Avamar’s efficiency advantage came to the forefront, as the VM client proxy performed full global data deduplication guided by the Avamar Data Store. In particular, Avamar reduced the total amount of data acquired using VMware’s CBT to less than 500MB for transfer to the Avamar Data Store. In contrast, a CBT-based backup took 10 minutes with Simpana and 14.5 minutes with NetBackup.
Far more important for IT operations, VMware’s CBT makes it relatively easy to create a fast block-based incremental backup regime that is far more efficient than traditional file-based incremental backups. Storing the backup files generated by a full backup without CBT, along with the backup files from subsequent CBT-based backups, in an ordered set creates a traditional forward incremental backup chain.
This approach creates a sequential series of recovery points corresponding to the full backup and subsequent incremental backups. Recovery of a VM to any recovery point requires that the full backup be restored as the starting point, followed by the ordered restoration of the CBT-based incremental backups to roll up to the desired recovery point. Advanced data protection solutions run this rollup process automatically.
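The rollup described above can be sketched in a few lines. This is a minimal model, assuming each backup is represented as a mapping from a block number to block contents; the function name and data layout are hypothetical and do not reflect any vendor's file format.

```python
def restore_to_point(full_backup, incrementals, point):
    """Rebuild a VM image at a given recovery point.

    full_backup:  dict of block_id -> bytes from the full backup.
    incrementals: list of dicts, oldest first, each holding only the
                  blocks that changed since the previous backup.
    point:        0 restores the full backup alone; N applies the first
                  N incrementals on top of it.
    """
    if not 0 <= point <= len(incrementals):
        raise ValueError("recovery point is outside the chain")
    image = dict(full_backup)           # start from the full backup
    for delta in incrementals[:point]:  # apply incrementals in order
        image.update(delta)
    return image


# Hypothetical chain: one full backup and two CBT-based incrementals
full = {0: b"A", 1: b"B"}
chain = [{1: b"B1"}, {0: b"A2", 2: b"C2"}]
latest = restore_to_point(full, chain, 2)
```

The order of application matters: each incremental only makes sense on top of the image produced by everything before it.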
Breaking Incremental Chains
For IT operations, there is a distinct reliability issue associated with storing a forward incremental backup chain: any corrupted or deleted file in the chain will invalidate all follow-on backup files in the chain. It is therefore imperative to keep any series of forward-chained incremental backups short.
To resolve the dependency issues of incremental backups while maintaining minimal backup windows, mid-range backup packages often implement a loosely dubbed “incremental forever” backup scheme that relies on a periodic synthetic full backup process, which is a consolidation process and not a true backup. Without reading data from the client, a synthetic backup takes the last full backup (also probably synthetic) and all subsequent incremental backups to synthesize an ersatz full backup file on which to build a new forward chain.
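The dependency can be made concrete with a small sketch. It assumes a chain is described by a list of flags, one per backup file, where the first entry is the full backup; the function name and the example chain are hypothetical.

```python
def usable_recovery_points(file_is_intact):
    """Return the indexes of recovery points that can still be restored.

    file_is_intact: list of booleans, oldest first; index 0 is the full
    backup and the rest are forward incrementals. The first damaged or
    missing file makes that point and every later point unrecoverable.
    """
    usable = []
    for index, intact in enumerate(file_is_intact):
        if not intact:
            break
        usable.append(index)
    return usable


# Hypothetical chain: the second incremental (index 2) is corrupt
points = usable_recovery_points([True, True, False, True, True])
```

In this example two later files are perfectly intact yet useless, which is why the length of the chain bounds the number of recovery points at risk.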
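A synthetic full can be modeled as a server-side merge that never touches the client. The sketch below is an illustration under the assumption that backups are block maps; the names are hypothetical, and real products use their own on-disk formats. It also shows how a damaged block in the old chain is carried into the new base.

```python
def synthesize_full(last_full, incrementals):
    """Merge the last full backup with its incrementals, oldest first.

    No data is read from the client, so any bad block already present
    in the stored files is copied into the synthesized result.
    """
    synthetic = dict(last_full)
    for delta in incrementals:
        synthetic.update(delta)  # newer block versions replace older ones
    return synthetic


# Hypothetical chain in which block 1 of the first incremental is damaged
last_full = {0: b"A", 1: b"B"}
incrementals = [{1: b"<corrupt>"}, {2: b"C"}]

new_base = synthesize_full(last_full, incrementals)
new_chain = [new_base]  # later incrementals would be appended here
```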
The new synthesized backup is complete and independent of previous backup files; however, it is susceptible to perpetuating a corrupt or missing file, which is why synthetic backups are considered problematic at large enterprise-class IT sites. Given the reliability questions for synthetic backups, both Symantec NetBackup and CommVault Simpana recommend running a full backup without CBT every two weeks to initialize a new backup chain. While this strategy lessens the likelihood of losing a series of recovery points, it is very inefficient at processing each biweekly backup.
In tests with a 200GB VM Server running Exchange, full backups without CBT averaged about 45 minutes. Scheduling 26 full backups of that VM will add about 19.5 hours to backup processing over a year. Worse yet, this scenario needs to be repeated for every VM. Assuming perfect distribution of full backups, a site running 42 VMs that require 20 minutes each for a full backup (which is typical of a VM running a database-driven application with about 75-to-100GB of data) will add about an hour to each daily backup schedule to handle the recommended full backups. As a result, the mandate on IT operations to run periodic full backups adversely impacts backup scalability, which is tied to factors driving ROI for virtualization.
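The arithmetic behind these estimates is simple enough to check directly. The helper names below are hypothetical; the inputs are the figures quoted above, and the 14-day cycle is the biweekly schedule recommended for full backups.

```python
def yearly_full_backup_hours(fulls_per_year, minutes_per_full):
    """Hours per year spent on periodic full backups of a single VM."""
    return fulls_per_year * minutes_per_full / 60.0


def daily_full_backup_minutes(vm_count, minutes_per_full, cycle_days):
    """Minutes added to each day's schedule when full backups are
    spread evenly across the cycle."""
    fulls_per_day = vm_count / cycle_days
    return fulls_per_day * minutes_per_full


# 26 biweekly full backups at 45 minutes each for the 200GB Exchange VM
exchange_hours = yearly_full_backup_hours(26, 45)

# 42 VMs at 20 minutes each, distributed evenly over a 14-day cycle
daily_minutes = daily_full_backup_minutes(42, 20, 14)
```

With 42 VMs on a 14-day cycle, three full backups fall on each day, which is where the extra hour comes from.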
Avamar avoids this problem by never creating incremental backup files from CBT-based backups. Instead of creating a collection of discrete backup files, Avamar creates a virtual block space for the universe of protected systems. In any backup, with or without CBT support, blocks with rich metadata links are saved in Avamar’s global virtual space.
Every Avamar backup functions as a full backup. For every restore operation with Avamar, a full system image can always be navigated within the virtual block space for every CBT-based recovery point of every protected system. There are no incremental backup chains to close. There are no synthetic backup processes that need to run on the server. There are no requirements to run periodic full backups.
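The general idea of a block space in which every backup reads as a full backup can be illustrated with a generic content-addressed store. This is not Avamar's implementation, whose internals are not described here; the class, its methods, and the use of SHA-256 digests are all assumptions made for the sake of the sketch.

```python
import hashlib


class BlockStore:
    """Toy deduplicating store: each recovery point is a complete block map."""

    def __init__(self):
        self.blocks = {}           # digest -> block contents, stored once
        self.recovery_points = []  # each entry maps block_id -> digest

    def backup(self, changed_blocks):
        """Record a recovery point from the changed blocks only.

        Returns (recovery_point_index, number_of_new_blocks_stored).
        """
        previous = self.recovery_points[-1] if self.recovery_points else {}
        block_map = dict(previous)  # unchanged blocks are inherited
        new_blocks = 0
        for block_id, data in changed_blocks.items():
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.blocks:  # deduplicate identical content
                self.blocks[digest] = data
                new_blocks += 1
            block_map[block_id] = digest
        self.recovery_points.append(block_map)
        return len(self.recovery_points) - 1, new_blocks

    def restore(self, point):
        """Return the full image for a recovery point, with no rollup."""
        block_map = self.recovery_points[point]
        return {bid: self.blocks[d] for bid, d in block_map.items()}


store = BlockStore()
p0, stored0 = store.backup({0: b"A", 1: b"B", 2: b"A"})  # initial backup
p1, stored1 = store.backup({1: b"C"})                    # CBT-style backup
```

Because each recovery point holds a complete map, restoring any point is a direct lookup rather than a replay of earlier files, and only previously unseen blocks consume new storage.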
In a comparison test of a CBT-based Avamar backup with a biweekly full backup without CBT, which is recommended as a best practice by CommVault and Symantec, differences in backup times were quite dramatic. In the biweekly cycle, the Avamar continuous CBT-based backup process was 18.2 times faster than CommVault Simpana and 23 times faster than Symantec NetBackup.