Enterprise data centers are usually joined at the hip to high-priced, feature-heavy backup software. For large data centers, this arrangement works quite well: There is money in the budget for costly licensing and support; the enterprise is deeply invested in proprietary hardware and software; CIOs and storage managers need to quote Tier-1 vendor names to executives; and specialized administrators are on hand to support the complex software.

And then there are the other data centers: smaller companies that need some enterprise-level features without the complete feature set, high cost, and proprietary vendor lock-in of the enterprise backup products. Lowering costs and improving flexibility are important to these firms, and some are turning to open-source software to achieve their goals. Now that professionally supported versions of open-source backup software are available, these organizations are increasingly looking to add these programs to their open-source mix.

Enterprise backup software vendors (CA, EMC, CommVault, IBM/Tivoli, and others) are well entrenched, and large companies with big budgets and specialized IT staff are not likely to replace their enterprise-level backup software investments, although their workgroups and remote offices might. But some mid-sized businesses are making the shift to open-source software, driven largely by cost considerations. For businesses needing flexibility and low cost along with some enterprise-level backup features, an open-source application can provide significant benefits. Still, there is resistance and a lingering feeling that “open source” equals “wild, wild West.” That is why the availability of professionally supported versions of open-source backup software is driving end-user adoption.

Widely accepted open-source tools already include databases (MySQL), J2EE environments (JBoss), server virtualization (Xen), content management (Mambo), and security tools (GNU, Metasploit Framework). And the open-source storage segment already has a strong entrant in Amanda, which has a secure foothold in Linux environments. The increasing use of Linux systems in enterprises of all sizes is helping to fuel adoption of open-source software for storage applications.

Open-source backup is a relatively new phenomenon. There are several reasons for this, including end users’ reluctance to adopt open-source software in production environments and a lack of professional support. However, two things have changed: implementation of enterprise features and subscription-based support. These changes are a boon for businesses that need some enterprise-level features without the high cost and inflexible feature sets of proprietary software, and for businesses that prefer professional support for their open-source applications. Linux followed a similar path: Enterprises did not widely adopt Linux-based systems until supported versions became available.

Advantages Of Open-Source Backup

The key benefits of open-source backup software include flexibility, security, and lower cost.

One example of the flexibility benefits of open-source backup software is the ability to recover data using a variety of means, which avoids vendor lock-in. Commercial backup products use proprietary backup algorithms, tools, and data layouts so that the only way to recover data backed up with the product is by using the product itself. In contrast, if the open-source backup application uses common industry-standard tools such as tar or dump, instead of proprietary tools and data layouts, then IT does not require the same backup application to recover the data.

Open-source backup software also allows companies to optimize the application for their own environments. Large enterprises have the financial leverage to work directly with backup vendors on product enhancements. Smaller companies do not enjoy this level of access and can only submit their enhancement requests to their proprietary backup vendor and hope there is enough of a business case to add the enhancements to the next version. With an open-source approach, companies can add enhancements themselves; work with a large community of programmers; and request low-cost enhancements from open-source support vendors.

Avoiding vendor lock-in matters most for long-term recovery. When a business uses a proprietary data format for its backup, it must be able to use that same proprietary software for long-term recovery, which could be many years into the future. In contrast, open-source backup software does not rely on proprietary data formats. Instead, applications such as Amanda use standard operating system utilities such as dump and tar, or use open-source utilities readily available in common operating systems, such as GNU tar, smbtar, and Schily tar. The same archive format exists on the media, so in 10 years or more, when the data needs to be recovered from long-term media, any system with access to these common commands can restore the data.
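The point about standard archive formats can be illustrated with a minimal sketch. It assumes nothing about Amanda's own tooling or on-media layout: it simply writes a gzip-compressed tar archive with Python's standard library and restores it again without any backup application. The function names and file names are hypothetical, chosen only for illustration.

```python
import tarfile
import tempfile
from pathlib import Path


def write_archive(source_dir: Path, archive_path: Path) -> None:
    """Write source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the directory name, not absolute paths.
        archive.add(source_dir, arcname=source_dir.name)


def restore_archive(archive_path: Path, target_dir: Path) -> list[str]:
    """Restore every member of the archive and return the member names."""
    # Use the safer "data" extraction filter where this Python version offers it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        archive.extractall(target_dir, **kwargs)
    return names


def demo() -> str:
    """Back up one file, restore it elsewhere, and return the restored text."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "home"
        source.mkdir()
        (source / "report.txt").write_text("quarterly numbers\n")

        archive_path = root / "backup.tar.gz"
        write_archive(source, archive_path)

        restore_root = root / "restored"
        restore_root.mkdir()
        restore_archive(archive_path, restore_root)
        return (restore_root / "home" / "report.txt").read_text()
```

An archive written this way can also be unpacked from a shell with `tar -xzf backup.tar.gz`, so the restore path does not depend on the program that created it. Real backup media may carry extra headers or be split across volumes, which this sketch does not cover.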

Software such as Amanda is also capable of backing up to a variety of targets, including disk, virtual tape, and physical tape. Disk-based backup is a critical part of today’s backup environments, as is the ability to emulate tape drives and libraries. And physical tape is a fact of life in many corporations for long-term data retention and compliance.

The question of security in open-source backup takes three forms: 1) Can malicious users exploit security holes since the code is open? 2) Does the backup software offer robust security features? 3) Do the security features address data at-rest and data in-transit?

The first question applies to the entire open-source industry, with some proprietary vendors insisting that open-source software is inherently insecure. This conclusion is faulty, because proprietary software can be (and is) reverse-engineered, and poorly coded software, whether open or closed, will have security holes. Well-built code will have far fewer security holes, and in this respect secure open source is no different from secure proprietary code.

Open-source software has the additional advantage of having many eyes in the programmer community looking for security holes and quickly closing them, as well as rapidly updating source code to address new security standards.

The second question raises a related concern: If the program offers security features, could malicious users study the code to break its encryption? The answer is “no.” Encryption security is all about encryption keys. Without the key, hackers can do all the reverse engineering in the world and still not be able to read encrypted information.
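A rough calculation shows why the key, rather than the code, is what matters. The sketch below estimates how long an exhaustive key search would take on average. The 128-bit key length and the rate of one trillion guesses per second are assumptions picked for illustration; neither figure comes from any particular backup product.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_search(key_bits: int, guesses_per_second: float) -> float:
    """Average years needed to find a key by brute force.

    On average an attacker must try half of the key space,
    which is 2 ** (key_bits - 1) candidate keys.
    """
    keys_to_try = 2 ** (key_bits - 1)
    return keys_to_try / guesses_per_second / SECONDS_PER_YEAR


# Assumed figures: a 128-bit key and 10**12 guesses per second.
estimate = years_to_search(128, 1e12)
```

Under these assumptions the estimate comes to roughly 5 x 10^18 years, and each additional key bit doubles it. Knowing the algorithm, which open source makes public, does nothing to shrink that search.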

Open-source software such as Amanda works with Security-Enhanced Linux and with firewalls to secure communications between backup servers and clients. Amanda also uses Linux-based software encryption that has the flexibility to encrypt on the backup media and on the backup server, and to protect both data in-transit and data at-rest. OpenSSH protects data in-transit with strong authentication and authorization; backup media encryption can use symmetric or asymmetric encryption; and data can be encrypted at a highly granular level.

One of the primary drivers behind open-source software is cost, and open-source software is typically free to acquire. However, companies do incur soft costs in staff time for research, installation, maintenance, testing, and customization, and often purchase support or customization services from commercial firms. Yet even with these soft costs, overall costs are significantly lower than those of proprietary applications.

Yearly support subscriptions add a further advantage over high-cost licensing: The company is not tied to paying for vendor support for the life of the product; it can simply choose not to renew its open-source support subscription.

Amanda And Zmanda

Amanda is free, downloadable, Linux-based client/server data-protection software. Amanda’s initial user base was concentrated in universities and labs, although it has more recently made inroads into mainstream businesses. This is particularly the case where applications are deployed on a LAMP (Linux/Apache/MySQL/PHP) stack. Amanda runs on a Linux backup server and works with Windows, Mac OS X, Unix, or Linux clients. Amanda lacked professional support until Zmanda introduced supported versions with enterprise-level features. Zmanda also offers backup protection for open-source MySQL with Zmanda Recovery Manager, including hot backup and point-in-time recovery.

The current release of Zmanda’s version of Amanda allows administrators to set up a single server to back up multiple hosts to a tape- or disk-based storage system over the network. It can back up workstations or servers running various versions of Linux, Unix, Mac OS X, or Windows.

Amanda has command-line and GUI recovery tools and a central management console, and it uses industry-standard backup tools and data layouts so IT can recover data using standard operating system tools. A scheduler optimizes the backup level for different clients so that backup runs stay consistent in size and duration across the backup environment.

These capabilities produce consistent backup cycles, in which full backups of all volumes are completed in shifts throughout a single week. The result is consistent, shorter backup windows and less drain on resources, since the backups stay about the same size throughout the cycle.
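The idea of spreading full backups across a cycle can be sketched as a simple balancing problem. The code below is not Amanda's planner. It is a minimal illustration that assigns each volume's full backup to whichever day of the cycle currently has the least data scheduled, starting with the largest volumes. The function names, volume names, and sizes are hypothetical, and a real scheduler would also have to account for incremental backups on the other days.

```python
import heapq


def plan_full_backups(volume_sizes_gb: dict[str, float], cycle_days: int) -> list[list[str]]:
    """Assign each volume's full backup to one day, balancing daily totals."""
    # Heap of (gigabytes_assigned, day_index); the lightest day is always on top.
    days = [(0.0, day) for day in range(cycle_days)]
    heapq.heapify(days)
    plan: list[list[str]] = [[] for _ in range(cycle_days)]

    # Place the largest volumes first so the small ones can even out the rest.
    by_size = sorted(volume_sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        load, day = heapq.heappop(days)
        plan[day].append(name)
        heapq.heappush(days, (load + size, day))
    return plan


def daily_totals(plan: list[list[str]], volume_sizes_gb: dict[str, float]) -> list[float]:
    """Total gigabytes of full backups scheduled on each day of the cycle."""
    return [sum(volume_sizes_gb[name] for name in day) for day in plan]


# Hypothetical volumes and sizes in gigabytes, used only for illustration.
volumes = {"mail": 400, "db": 350, "home": 300, "web": 150, "src": 100, "logs": 100}
plan = plan_full_backups(volumes, cycle_days=3)
totals = daily_totals(plan, volumes)
```

With these made-up sizes, the three nightly totals land within 50 GB of one another, compared with a single 1,400 GB run if every full backup fell on the same night. That evenness is what keeps the backup window roughly constant across the cycle.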

Zmanda and a large community of programmers are also making constant enhancements to Amanda. Upcoming enhancements include support for continuous data protection and virtual tape libraries, as well as archiving and data classification tools.

Pricing for Amanda is low compared to proprietary enterprise backup software, since the source code is free. Zmanda offers a yearly subscription model for support. Even in its supported Zmanda versions, Amanda is not for everyone. Large data centers will stick with their existing backup vendors and applications as the cost of migrating would be enormous, and there is no compelling reason to make the change. However, open-source backup software could potentially play a significant role in the $3 billion backup-and-recovery market.