Evaluating options for Windows clusters

Options range from basic high-availability and load-balancing clusters to virtual machines and data-sharing clusters.

By Brad O’Neill

Nearly every IT manager has a horror story revolving around server or storage clustering: deployment complexity, integration issues, failed high-availability schemes, over-provisioning of server and storage resources, and poor resource utilization. Whether the goal is a simple four-node server cluster for high availability and load balancing or a high-node-count “scale-out” data-sharing cluster on a SAN fabric, clustering has been a sport for the strong-stomached.

The challenges associated with enterprise-level clustering have been particularly acute in Windows environments. However, this situation is improving at both the server and storage layers. Over the past year, we have seen a significant increase in the number of users deploying advanced Windows-based clustering technologies. This trend, in turn, can be traced back to three other trends:

  • The use of Windows for mission-critical applications;
  • Intel servers driving “scale-out” architectures; and
  • Consolidation of Windows platforms.

This article takes a brief look at each of these clustering drivers and then examines various approaches to Windows-based clustering.

Windows for mission-critical apps

Windows clustering solutions are predicated on an abundance of Windows server deployments in the enterprise, a trend that is very strong in medium-sized enterprises. The primary reasons for Windows deployment are price, ease of use, and a “multiple solutions focus” (e.g., MS Storage Server, Exchange, SQL Server). With more than half of all data-center servers now running Windows for mission-critical applications, it is no surprise that end users are evaluating options for more-complex Windows clustering solutions.

‘Scaling out’

Increased interest in advanced Windows clustering correlates directly to the adoption of Intel-based servers. Users are migrating from “scale-up” SMP Unix architectures to “scale-out” Intel-based server clusters that accomplish the same tasks across many machines. The primary applications are scalable databases and application server farms.

In some cases, the hardware costs for Intel-based clusters are one-tenth of the alternative for equivalent computing power. However, Intel servers running Windows do not inherently constitute a horizontal “scale-out” replacement for high-end SMP servers. They don’t even come close.

While the term “scaling out” encourages cool marketing images of thin servers and blade architectures replacing an aging SMP machine, the real engine for change resides in tightly coupled systems software, specifically the various forms of clustering software that weave Intel servers together into a computing platform for enterprise applications.

Windows consolidation

After years of departmental file serving and general-purpose computing, many Windows shops now find that they are paying a high price for poor utilization. Viewed from both a compute and a capacity standpoint, utilization rates typically begin to erode as more Windows servers are added. The answer? Consolidate the Windows environment to recoup return on investment and to gain the benefits of unified management.

The consolidation path often leads to increased interest in advanced block-level clustering solutions that can share all of the data and provide scalability and unified management.

However, there are a number of different ways to cluster Windows servers and storage systems.

Users commonly build Windows clusters to support high availability and to load balance the workload across a group of servers. This requires that all the servers be able to access the other servers’ data and be managed as a single entity. High availability and load-balancing clusters are a prerequisite to more-advanced clustering architectures. These basic clusters are characterized by smaller node counts (typically two to eight servers) and local (non-networked) storage. High availability and load balancing can be applied to any kind of Windows server application, but they are most commonly deployed in support of application servers, e-mail servers, and small-node-count databases.
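The behavior of such a basic cluster can be illustrated with a minimal sketch. It is not how MSCS, VCS, or any other product works internally; it is a toy model, assuming a round-robin policy and a hypothetical `BasicCluster` class with invented node names, that shows requests being spread across nodes and routed around a failed one.

```python
class BasicCluster:
    """Toy model of a small high-availability, load-balancing cluster.

    All names here are hypothetical and exist only for illustration.
    """

    def __init__(self, nodes):
        self.nodes = list(nodes)        # fixed membership, e.g. two to eight servers
        self.healthy = set(self.nodes)  # nodes currently able to take work
        self._next = 0                  # round-robin position

    def fail(self, node):
        """Mark a node as down so that no new work is routed to it."""
        self.healthy.discard(node)

    def recover(self, node):
        """Return a known node to service."""
        if node in self.nodes:
            self.healthy.add(node)

    def route(self):
        """Return the next healthy node in round-robin order."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


cluster = BasicCluster(["node1", "node2", "node3", "node4"])
first_pass = [cluster.route() for _ in range(4)]   # load spread over all four nodes
cluster.fail("node2")                              # simulate a server failure
second_pass = [cluster.route() for _ in range(4)]  # node2 is skipped
print(first_pass)
print(second_pass)
```

The surviving nodes absorb the failed node's share of requests, which is why clusters of this kind are sized with spare capacity.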

The two leading products for this type of basic Windows clustering are Microsoft Cluster Server (MSCS) and Veritas Cluster Server (VCS). Until recently, most IT executives would not give Microsoft serious consideration versus Veritas for basic cluster services, but Microsoft has steadily expanded the functionality of MSCS. Many users now find MSCS more than adequate for basic clustering requirements, although Veritas maintains a functionality edge in both the breadth of its operating systems support and node-count scalability.

With all clustering solutions, users face an inherently expensive, redundant exercise: Deploy and federate “more than enough” servers to ensure availability and performance. But there are ways to offset these costs.

One method is to leverage virtual machine technology to “virtually consolidate” server clusters on a smaller number of physical machines, with all virtual assets running the same server clustering software. For example, an IT shop might deploy two higher-tier physical Windows servers, each with four virtual servers running inside, all of them cross-connected via the same clustering software from, say, Microsoft or Veritas. The leading provider of this enabling technology is VMware (an EMC company) with its GSX Server application. For enterprises considering consolidation of a large number of Windows servers, leveraging virtual machine technologies such as VMware’s might make strong sense from an ROI perspective.

The downside to deploying virtual Windows clusters is that all server assets, virtual or otherwise, reside in a smaller number of (higher-quality) physical servers. This is a strategy that may run counter to some organizations’ availability practices and the trend toward higher numbers of lower-cost assets.
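A rough calculation makes the trade-off concrete. The sketch below uses the example of two physical servers hosting four virtual servers each and compares it with a hypothetical layout of eight separate low-cost machines; the function name and the eight-machine comparison are assumptions made for illustration, and real availability planning involves far more than this single ratio.

```python
def fraction_of_nodes_lost(physical_hosts, nodes_per_host, failed_hosts=1):
    """Fraction of cluster nodes that go down when physical hosts fail.

    Assumes every node on a failed host is lost and that nodes are
    spread evenly across hosts (a simplification for illustration).
    """
    total_nodes = physical_hosts * nodes_per_host
    lost_nodes = min(failed_hosts, physical_hosts) * nodes_per_host
    return lost_nodes / total_nodes


# Two higher-tier servers, each running four virtual cluster nodes.
consolidated = fraction_of_nodes_lost(physical_hosts=2, nodes_per_host=4)

# Hypothetical alternative: eight low-cost servers, one cluster node each.
spread_out = fraction_of_nodes_lost(physical_hosts=8, nodes_per_host=1)

print(f"consolidated: {consolidated:.1%} of nodes lost per host failure")
print(f"spread out:   {spread_out:.1%} of nodes lost per host failure")
```

Under these assumptions, one hardware failure removes half of the consolidated cluster but only one-eighth of the spread-out one, which is the availability concern some organizations raise.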

The newest types of Windows clusters are sometimes referred to as “data-sharing clusters,” which are similar to the data-sharing clusters used in Linux environments. Data-sharing clusters enable Windows servers to dynamically share a single pool of data on a networked storage fabric (“storage-aware clustering”). A key architectural necessity for this type of clustering is a common file access method across all Windows server nodes. Today, this means reliance on a cluster file system (CFS) deployed on each server, which is the architectural inverse of how NAS provides data sharing by consolidating requests for file access on a single filer.

With a CFS, a data-sharing cluster creates a compute environment where any server has equal access to any data element, providing the ultimate insurance for availability and scalable performance. According to Jesse Correll, IT director at MetLife Investors, “There are a number of reasons to look hard at data-sharing clusters for Windows. The economic gains of server and storage consolidation are well proven, and it is far easier to manage storage and servers as one networked grid than as dozens of independent assets.”

This class of Windows cluster still incorporates the high availability and load-balancing functionality of “basic” Windows clusters, but makes it possible to completely decouple the physical relationship between server and storage resources and to scale them independently.

The economic impact of a data-sharing cluster can be significant because all Windows servers can share a single copy of the data resident on a SAN fabric. For this reason, it can become a powerful tool for Windows consolidation, reducing the number of servers required to support applications. Data-sharing clusters also enable the full range of “scale-out” architectures to be deployed for a range of application and database servers.

PolyServe is one company that is benefiting from the trend toward advanced Windows clustering. The company’s Matrix Server for Windows application incorporates both high availability and load balancing with a Windows-based CFS for data sharing on a SAN. (Hewlett-Packard resells PolyServe’s solution.)

IBM also addresses Windows clustering with its SAN File System, which enables heterogeneous (multiple types of operating systems) data sharing on a storage fabric.

To date, EMC has not announced an integrated data-sharing cluster solution for Windows. Perhaps most importantly, Microsoft has not yet introduced a data-sharing cluster solution.

Powerful clustering technologies now give Windows shops both economic gains and performance benefits, along with greater flexibility. It has been a while coming, but serious Windows clustering technologies are finally here.

Brad O’Neill is a senior analyst with the Taneja Group consulting firm (www.tanejagroup.com).

This article was originally published on March 01, 2005