Windows NT clusters: the basics

Clusters offer clear advantages over server mirroring, and Fibre Channel may be the best interconnect.

Frank W. Poole

With ever-increasing requirements for continuous streams of information and zero downtime, server clustering has become the focus of more attention. Simply put, a cluster extends RAID technology into the server arena by creating a configuration of redundant servers.

Clusters are part of a logical evolution. It used to be sufficient for system administrators to keep a current backup of pertinent data. When a server went down, it was pulled out of the rack, a replacement hard drive was ordered, the part replaced, the network operating system reinstalled, network protocols reconfigured, and data restored from tape. A systems administrator was lucky to lose only two days' sleep.

Then, redundancy came along. Mirrored hard drives reduced downtime significantly. Soon, though, system administrators started demanding better performance and expanded capacity. Thus, RAID technology evolved.

With "hot-swap" capability, SAF-TE enclosures, redundant power supplies, and the like, downtime is eliminated. Or is it? What about the other, non-redundant components inside servers and networks? What if they fail?

With RAID, several independent hard disks are combined to form one large logical array. Data and "redundancy information" are stored on the array. The redundancy information can be the data itself, as in RAID 1 (mirroring), or parity information, as in RAID 5. The operating system no longer deals with separate drives, but with the array as one logical drive. RAID prevents downtime in the event of failures, but it does not protect data from events such as user-deleted files or catastrophes such as theft or fire.
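The parity scheme mentioned above can be illustrated with a short sketch. This is not any vendor's implementation, just a minimal demonstration of the XOR principle behind RAID 5: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-sized blocks byte-by-byte to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild a single lost block: XOR parity with the surviving data."""
    return parity(surviving_blocks + [parity_block])

data = [b"AAAA", b"BBBB", b"CCCC"]   # stripes on three data drives
p = parity(data)                      # stored on the parity drive

lost = data.pop(1)                    # simulate one drive failing
assert reconstruct(data, p) == lost   # rebuilt from parity + survivors
```

Because XOR is its own inverse, reconstruction is the same operation as parity generation, which is why a RAID 5 array survives exactly one drive failure.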

Redundancy also applies to servers. The two most common approaches are mirroring and clustering. While mirroring is not new, server clustering is relatively new in NT environments, though it has been available in mainframe and Unix markets for some time.

With server mirroring, duplicate servers are attached to the same network. This is just RAID 1 applied to servers. One server does all the work while the other receives duplicate (mirrored) data and waits its turn. When one server goes down, the other takes over in a matter of minutes. This seems like an ideal situation because the machines can be placed in separate rooms, buildings, or even different geographic locations if a fast enough network connection is provided. (There are a few hardware restrictions on type of cabling, controller, or drives used.)

However, costs can get prohibitive in this situation because it requires a second non-productive server. There are now two servers to upgrade or replace. Also, degradation of the primary server is a concern with server mirroring. The primary server not only does all the file transfers for users on the network, but it also performs additional I/Os as it passes information along to the mirror server. Since this is done primarily with software, there can be additional processor overhead if system usage is heavy.


Server clustering, on the other hand, lets each server act on its own while using a common mass storage device. In a basic cluster, two servers share one RAID array. When one server goes down, the second takes over while still maintaining its processing load. This is known as load sharing.

In such instances, the failover time is greatly reduced because each server is already attached to the same data. There is also a lesser chance that data will be lost during a system failure because data doesn't have to be sent to a backup system; it's already there.

In addition, costs are reduced because a separate RAID array isn't needed for each server, and the backup server isn't lying idle waiting for something to do; it is an active production server on the network.

Clustering is accomplished through polling. Each server in the cluster continually polls every other server to ensure everything is operational. Polling requires each server to be connected not only to the network and the common mass storage device, but also to the other servers through some sort of interconnect, usually a secondary network card. This interconnection, or private network, is merely a heartbeat connection.

For a failover to occur, all three connections get involved. If the failed server is no longer available over the heartbeat connection, it is polled on the LAN. If this also gets no reply, the polling server takes over the failed server's job of connecting users to the common data.
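The two-step check just described reduces to a small decision rule. The sketch below is only a schematic of that logic; `heartbeat_alive` and `lan_alive` are hypothetical probe results standing in for the private-network and LAN polls.

```python
def should_take_over(heartbeat_alive, lan_alive):
    """Decide whether the polling server must assume the peer's duty
    of connecting users to the shared storage.

    heartbeat_alive: did the peer answer on the private network?
    lan_alive:       did the peer answer when polled over the LAN?
    """
    if heartbeat_alive:
        return False   # peer is healthy; nothing to do
    if lan_alive:
        return False   # heartbeat cable may be cut, but the peer
                       # is still serving clients on the LAN
    return True        # peer is unreachable everywhere: fail over

# Takeover happens only when both checks fail.
assert should_take_over(True, True) is False
assert should_take_over(False, True) is False
assert should_take_over(False, False) is True
```

Checking the LAN before failing over prevents a "split-brain" takeover when only the heartbeat cable has failed.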

One downside to this type of redundancy is that the heartbeat connection requires another cable run. If the servers are located in the same room, that's not a problem.

Connection types

But, what if you want to keep each server in a separate room and a RAID device in a third room? With SCSI, this is only possible if your rooms are very small and very close together. In fact, critical limitations of SCSI-based clusters include cable length and data transfer rates. SCSI specifications dictate that if four devices (controller and three drives) are connected to an Ultra SCSI bus, the cable length is limited to 3 meters. Connect eight devices and you're down to 1.5 meters. Wide Ultra SCSI is capable of 40MBps transfer rates, but all devices attached to the cable have to share that bandwidth.

Low-voltage differential signaling (LVDS), or Ultra2 SCSI, eases these limitations. For example, Wide Ultra2 SCSI supports up to 16 devices and a cable length of 12 meters (about 39 feet), which in many situations is long enough. Data transfer rates also improve, to 80MBps. The coming Ultra3 will push rates to 160MBps, but will not lift the cable-length and device limitations. Traditional high-voltage differential signaling allows cable runs up to 25 meters, but it is very expensive and incompatible with LVDS and other SCSI formats.
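The limits quoted above are easy to tabulate. The sketch below encodes the article's figures in a small lookup table with a validity check; the variant names and the helper `bus_ok` are illustrative, not part of any SCSI tooling.

```python
# (max devices, max cable length in meters, bandwidth in MBps),
# per the figures quoted in the text.
SCSI_LIMITS = {
    "Ultra, up to 4 devices": (4, 3.0, 40),
    "Ultra, up to 8 devices": (8, 1.5, 40),
    "Wide Ultra2 (LVD)":      (16, 12.0, 80),
}

def bus_ok(variant, devices, cable_m):
    """Check a proposed bus configuration against the variant's limits."""
    max_dev, max_len, _bandwidth = SCSI_LIMITS[variant]
    return devices <= max_dev and cable_m <= max_len

# A 5-meter Ultra SCSI run with a controller and three drives is out of spec,
# which is why SCSI clusters cannot span rooms.
assert bus_ok("Ultra, up to 4 devices", 4, 5.0) is False
assert bus_ok("Wide Ultra2 (LVD)", 10, 12.0) is True
```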

For even higher data transfer rates (100MBps), longer cable lengths, and more attached devices per cable, Fibre Channel-Arbitrated Loop (FC-AL) technology is a solution. When most people hear the word "Fibre," they think of optical fiber and light pulses. However, Fibre Channel is a serial, high-speed data transfer technology that can be applied to networks and mass storage, and it is not limited to optical signals transmitted through fiber: less-expensive copper (twisted-pair or coaxial) cables can also be used.

With FC-AL, up to 126 devices can be attached on a single channel at distances up to 25 meters using copper cabling. With multi-mode optical fiber transmission, up to 500 meters is possible, and up to 10 kilometers with single-mode optical fiber.

These distances are between devices, not the length of the entire cable as with SCSI. However, like SCSI, all devices on the cable share the same bandwidth.


Software is critical to any cluster. In the NT space, Microsoft Cluster Server software (MSCS) ships with Windows NT Server Enterprise Edition. Currently, an MSCS cluster consists of two NT servers, or nodes. Also, at least one drive on the shared storage bus must be configured as a Quorum Resource. This drive acts as a kind of information exchange area, or temporary information storage, which is used by both nodes and the applications running on them.

If one of the nodes fails, the remaining server takes over the job. Any application that was running on the failed node is restarted on the second node; therefore, the application has to be installed on both servers. Any temporary data or state is stored on the Quorum drive. The whole process (i.e., the takeover of the other node's tasks) is called "failover."
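The failover sequence can be modeled in a few lines. This is a toy model only; the class and function names are invented for illustration and bear no relation to the actual Cluster Server API. The dictionary stands in for the Quorum drive's role as a shared scratch area.

```python
quorum = {"app_state": "last checkpoint"}   # stands in for the Quorum drive

class Node:
    def __init__(self, name, apps):
        self.name = name
        self.apps = list(apps)       # apps currently running here
        self.restored_state = None   # state recovered after a failover

def fail_over(failed, survivor, quorum):
    """Restart the failed node's applications on the survivor, recovering
    any in-flight state from the shared quorum area."""
    survivor.restored_state = quorum["app_state"]  # read the quorum drive
    survivor.apps += failed.apps     # restart here; the apps must already
                                     # be installed on both nodes
    failed.apps = []

a = Node("NodeA", ["SQL"])
b = Node("NodeB", ["Web"])
fail_over(a, b, quorum)              # NodeA has gone down

assert b.apps == ["Web", "SQL"]      # survivor carries both workloads
assert b.restored_state == "last checkpoint"
```

Note that after failover the survivor carries both workloads, which is why each node in a two-node cluster should be sized to run the full load alone.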

Applications can also be distributed across the nodes, ensuring that all available resources are used to the maximum. However, a single application cannot run on both nodes at the same time, so true load balancing is not possible.

Microsoft therefore defines this as static load balancing, or static distribution of resources; dynamic load balancing is not possible at this time. A speed advantage is only realized if, for example, two different applications run in parallel on different nodes of the cluster (e.g., an SQL database on one node and a Web server on another).

Configuration options

There are several different ways to set up a cluster. For example, you can use two cluster servers, each with a three-channel SCSI RAID controller configured with a RAID array (see Figure 1). By placing each set of disks on a separate controller, there is also redundancy at the SCSI connections. Therefore, failure of a cable does not lead to a crash of the array.

Using Fibre Channel technology, two dual-channel FC-AL controllers can be used with redundant Fibre Channel hubs in a dual-loop (redundant cable) configuration (see Figure 2). In this setup, the Fibre Channel hubs allow you to disconnect one node from the cluster and still maintain a level of redundancy. For maximum security, the same configuration could use redundant RAID enclosures (see Figure 3).

Clustering may not be required in your environment. For example, it may not be necessary for average networks that do not run mission-critical databases 24 hours a day, seven days a week.

However, in some situations, clustering is a requisite. Examples include financial institutions, or online banking, auctions, stores or other e-commerce sites that depend on uninterrupted access. Any cut in the link or downtime can practically shut down an organization.

These are just a few examples of the driving force behind zero downtime and the need for RAID and clusters as components in more and more enterprise networks.

Fig. 1: A relatively simple cluster configuration consists of two servers, each with a three-channel SCSI RAID controller with a RAID array.

Fig. 2: Two dual-channel FC-AL controllers can be used with redundant Fibre Channel hubs in a dual-loop configuration.

Fig. 3: For maximum data protection, clusters can be configured with redundant RAID arrays.

Frank Poole is senior technical support engineer at ICP vortex Corp., (www.icp-vortex.com) in Phoenix, AZ.

This article was originally published on July 01, 1999