By Jame Ervin and Andrew Gilman
Implementing a disaster-recovery (DR) plan requires
(a) a huge budget
(b) lots of new infrastructure
(c) lots of new specialized software and tools
(d) all of the above
(e) none of the above.
If you chose (e), you’re either an early adopter of iSCSI and virtual server technologies, or you’re just plain lucky. If you answered (a), (b), (c), or (d), you might be surprised to learn that you can develop a solid DR strategy using technologies you already have or are evaluating. Using virtual servers and iSCSI can help your organization maintain business operations in the event of hardware problems, an emergency, a disaster, or a power outage.
Since wide area networking has become commonly available, organizations have been looking for a simple way to use the WAN to move their technology and infrastructure to another site to
- Allow remote employees to access information;
- Protect operations in case of an emergency; and
- Provide additional bandwidth and processing power for critical applications.
In the past few years, virtualization has made it possible for organizations to unbundle applications from specific hardware, increasing flexibility, simplifying migrations, and making application servers portable. Virtualization has made significant inroads in servers, networks, and storage. Today’s virtualization-powered systems offer advanced features and speedier deployment times with a virtual layer centralizing management and functionality.
Server virtualization enables organizations to consolidate physical servers into a single system that runs multiple operating systems and serves a variety of applications, increasing system utilization. With virtualization, servers are unbundled from the physical hardware. Each virtual server is represented by a disk image of a particular server, including the operating system, system settings, and applications. That image can be run on another machine or under another hypervisor or virtualization application, which makes servers portable. Once application servers are portable, it is easy to move them from server to server or site to site for availability, redundancy, or recovery.
Storage consolidation, portability
SANs started as a way to centralize all storage, storage management, and storage provisioning on a dedicated, isolated network. SANs eliminated the need for a 1:1 ratio of application servers to storage while freeing administrators from management headaches. They evolved to bring advanced features into the network that make life easier for storage administrators: network-level backup and recovery, snapshots for granular recovery points, and a framework for data migration. Early this decade, a variety of vendors introduced these functions on lower-cost IP networks, making it possible for IT generalists to manage storage.
Conventional SANs were initially limited to Fibre Channel, requiring a dedicated network with dedicated infrastructure and personnel resources. iSCSI-based IP SANs provided more options.
Implementing an iSCSI infrastructure can reduce SAN deployment and management costs, with a lower cost per port than Fibre Channel alternatives. iSCSI-based IP SANs can also deliver many of the advanced storage management and DR tools available on enterprise-class storage arrays at a lower price point, opening advanced storage functionality to all organizations. In addition, iSCSI is supported by all major operating systems and platforms.
IP SANs were designed for flexibility, initially offering standard functionality such as WAN-based replication, snapshots, and other technologies to streamline disaster recovery (DR). As IP-centric technologies and applications such as disk-to-disk backup and server virtualization gain momentum, iSCSI installations continue to take off for a range of organizations. With portable storage and servers, IP SANs make planning and preparing for disasters, supporting emergency situations, deploying new applications, and managing infrastructure easier than ever.
Reduce system downtime
Every virtual server image contains the following key attributes: OS, disk resources, processing resources, allocated memory, network settings, LAN connections, and installed applications. Each component should be reproducible on any LAN or in any location to simplify recovery or migration processes and minimize the number of steps required to restore services during a critical incident.
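As a rough illustration of what "reproducible in any location" can mean in practice, the attributes listed above can be written down as a record and compared with what a recovery site can supply. The sketch below is only one way to model this: the class names, field names, units, and sample values (including the IP address) are assumptions made for the example, not part of any hypervisor's or storage vendor's API.

```python
from dataclasses import dataclass


@dataclass
class VirtualServerImage:
    """Attributes the article lists for a virtual server image.

    Field names and units are illustrative assumptions.
    """
    os: str
    disk_gb: int
    cpu_cores: int
    memory_gb: int
    network_settings: dict
    lan_connections: tuple
    applications: tuple


@dataclass
class SiteCapacity:
    """Hypothetical summary of what a recovery site can provide."""
    supported_os: set
    free_disk_gb: int
    free_cpu_cores: int
    free_memory_gb: int
    available_lans: set


def missing_at_site(image, site):
    """Return the reasons the image could not be brought up at the site."""
    problems = []
    if image.os not in site.supported_os:
        problems.append(f"OS not supported: {image.os}")
    if image.disk_gb > site.free_disk_gb:
        problems.append("not enough disk")
    if image.cpu_cores > site.free_cpu_cores:
        problems.append("not enough CPU")
    if image.memory_gb > site.free_memory_gb:
        problems.append("not enough memory")
    for lan in image.lan_connections:
        if lan not in site.available_lans:
            problems.append(f"LAN not available: {lan}")
    return problems


# Sample values are made up for the example.
mail = VirtualServerImage(
    os="Linux", disk_gb=200, cpu_cores=4, memory_gb=16,
    network_settings={"ip": "10.0.0.5"},
    lan_connections=("prod",), applications=("mail",),
)
secondary = SiteCapacity(
    supported_os={"Linux"}, free_disk_gb=500, free_cpu_cores=8,
    free_memory_gb=12, available_lans={"prod"},
)
print(missing_at_site(mail, secondary))  # ['not enough memory']
```

An empty list means every attribute checked here can be reproduced at the secondary site. Anything returned is a gap to close before an incident rather than during one.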
Networked storage is ideal for virtual servers since application processing is offloaded to the storage system and storage resources are independent from physical hardware, speeding up storage and server provisioning. The storage system manages data access, moving data from location to location and freeing up CPU cycles so that more servers can be consolidated onto a single server, server cluster, or blade server. This becomes critical when planning the allocation of resources to a secondary site: The less physical hardware required, the easier it is to set up the site and the lower the implementation expenses. Using built-in storage features such as mirroring and snapshots allows administrators to create virtual server images quickly, reduces planning overhead, and decreases the need to rely on tape backups to recover data.
Many organizations are reluctant to devote significant time and energy to DR planning, since it seems highly unlikely that a critical situation or disaster will take place. However, have any of the following situations happened to your organization?
- A system upgrade resulted in unexpected downtime;
- A power outage in your main data center (or headquarters) made all IT infrastructure inaccessible;
- A system bug took longer than expected to resolve, delaying the implementation of a new system, or causing a substantial decrease in your typical level of service;
- For some reason, your sales team lost access to your Website for a significant amount of time;
- Your team was unable to send or receive e-mail due to a server upgrade, network outage, etc.
Although none of these events would qualify as a disaster, each could be disruptive, or even catastrophic, to business operations. With a proper plan in place, your team will know how to handle situations such as these, as well as more serious issues.
An effective DR plan addresses the areas where downtime, unexpected technical difficulties, or other potential issues can arise, and offers an alternative process or procedure in these situations. This plan should not be limited to just technology and infrastructure variables, but should also encompass technical resources, availability of people, and infrastructure. Building an appropriate plan requires IT, operations, and facilities teams to work together. Consider improving your plan by testing it against a different set of assumptions.
For example, what if the primary IT administrator is unavailable for a few hours? How can you make it easier for less experienced IT personnel to implement your DR plan? Do you have an instruction manual for non-technical people to put the plan in motion? Alternatively, suppose the team from your secondary site is responsible for getting the new systems online, with minimal assistance from the primary location team. Does your team in the secondary location have remote access? What happens if they do not? Will they still be able to proceed? What do they need to offer at least a minimum level of temporary service? Can your vendor(s) offer assistance in an emergency? Will you need your facilities team to provide access to the server room, building, or office? Do you have all of the potential team members on your approved access list or on the emergency list?
By answering some of these questions, acting out potential scenarios, and brainstorming other critical points, you can confirm that you have accounted for the critical technical and non-technical aspects of the plan, and improve the odds that it will run smoothly.
Here are a few ideas on how to use virtual server and IP storage technologies to improve your infrastructure:
- Use high-availability features of hypervisors to build an alternate real-time fail-over site for your primary infrastructure at a second location in your LAN or campus;
- Move critical data from the primary location to the secondary location—applications, systems, and disk images—to have a usable alternate version and copy of all application servers; and
- Mirror all critical data to a secondary SAN volume, and clone physical servers to virtual-server images at the remote location to reduce system downtime during upgrades, maintenance, or emergencies.
Disaster recovery and business continuity planning are no longer reserved for large organizations with huge IT budgets. Organizations of all sizes can now be prepared to maintain 24×7 business operations.
Jame Ervin is the SNIA IP Storage Forum (IPSF) Education Committee Chair and product marketing manager at StoneFly.
Andrew Gilman is an IPSF Marketing Committee Member and solutions marketing manager at Dell.
SNIA’s IP Storage Forum at-a-glance
The IP Storage Forum (IPSF) focuses on fueling adoption of IP-based SAN solutions by
- Promoting the features and benefits of IP storage;
- Enabling vendors to collectively educate users, channel partners, and other partners on IP storage technologies, applications, and solutions; and
- Providing a vendor-neutral venue to demonstrate IP storage solutions in real-world contexts.
The IPSF executes its charter through education and promotional events around the world, with a primary focus on seminars, tutorials, IT industry events, and hands-on demonstrations.
In 2008 and 2009, the IPSF expects to see continued growth in IP storage technologies driven by
- Replacement of direct-attached storage in business-critical Windows environments;
- Solutions for Linux, Unix, and virtual server environments; and
- 10Gbps Ethernet solutions for high-performance environments.
For more information about the IPSF, visit www.snia.org/ipstorage.