Building and managing SANs
By Robin Deshayes
As the number of internal and external users increases within enterprises, more servers and capacity are needed. Multiple servers are commonly connected to many gigabytes or terabytes of disk storage, which cannot be adequately moved through current network pipes, such as Gigabit Ethernet and ATM.
Managing access to this "superdata" is also expensive and time-consuming. Consider a server that runs out of storage capacity. Prior to storage area networks (SANs), IS managers had two choices: 1) purchase new storage and bring down the server to install it or 2) take capacity from another server and bring down both servers for the service change. With a SAN, IS managers simply open the SAN administration window, view available LUNs, and reassign or add storage where it is needed--without losing data availability.
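To make the reassignment step concrete, here is a minimal sketch of a LUN table in which capacity is added to, or moved between, servers without taking either server offline. The names `SanPool`, `assign`, and `reassign`, the server names, and the LUN sizes are all hypothetical; this does not represent the interface of any actual SAN administration product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Lun:
    lun_id: int
    size_gb: int
    owner: Optional[str] = None  # server currently holding the LUN, if any


@dataclass
class SanPool:
    """Hypothetical model of the LUN table an administration window might show."""
    luns: Dict[int, Lun] = field(default_factory=dict)

    def add_lun(self, lun_id: int, size_gb: int) -> None:
        self.luns[lun_id] = Lun(lun_id, size_gb)

    def available(self) -> List[Lun]:
        # LUNs that no server owns yet
        return [lun for lun in self.luns.values() if lun.owner is None]

    def capacity_of(self, server: str) -> int:
        return sum(lun.size_gb for lun in self.luns.values() if lun.owner == server)

    def assign(self, lun_id: int, server: str) -> None:
        lun = self.luns[lun_id]
        if lun.owner is not None:
            raise ValueError(f"LUN {lun_id} already belongs to {lun.owner}")
        lun.owner = server

    def reassign(self, lun_id: int, new_server: str) -> None:
        # Only the ownership record changes; no server is shut down.
        self.luns[lun_id].owner = new_server


pool = SanPool()
pool.add_lun(0, 100)
pool.add_lun(1, 200)
pool.add_lun(2, 200)
pool.assign(0, "db1")
pool.assign(1, "web1")

# db1 runs out of space: first take a free LUN, then borrow one from web1.
pool.assign(2, "db1")
pool.reassign(1, "db1")
print(pool.capacity_of("db1"), pool.capacity_of("web1"))
```

In this toy model a reassignment only updates an ownership record, which is the point of the comparison with the pre-SAN options: neither choice requires bringing a server down.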
A SAN is a network architecture that alleviates the problems associated with the proliferation of servers and storage. SANs are inter-server networks with integrated hardware--usually Fibre Channel--and software, which form a high-speed backbone and enable clusters of servers to share storage arrays with exclusive data access or data on common storage devices. A SAN puts the network between the file system and the secondary data storage, providing a storage/server network between a cluster of servers. In doing so, a SAN improves server performance and data availability through such features as fail-over, load balancing, and distributed applications.
Network-attached storage, on the other hand, operates through a set of network protocols such as IP or NFS and attaches directly to the LAN, between the application server and the file system.
In short, wherever there are multiple servers and mass storage, SANs may be the answer for higher performance and a more efficient way of configuring and managing that storage. Emerging applications include data warehousing, large databases, on-line transaction processing (OLTP), data backup and restoration, and web serving.
So, what is driving demand for SAN functionality? Demand for less system duplication is one impetus. System duplication can be reduced by consolidating and centralizing server/storage assets and by clustering servers. Another factor is the increasing number of users, which fuels scalability requirements and boosts data availability requirements.
The importance of clustering
A SAN is a subset of a group of technologies known as clustering. Clustering is critical for users who need high availability, scalability, and ease of management. It is expected to become a dominant force in the client/server market in the next two to five years, as evidenced by recent development efforts by such companies as Microsoft and leading Unix vendors. This growth is being driven by:
- An increasing number of applications and servers that must be managed between business units.
- Demand for larger "single-view-of-data" systems to support virtualization of organizations and the externalization of business functions.
- Availability of serial technologies such as Fibre Channel, which enable access to disks by all nodes in a cluster over a high-speed interconnect.
- Demand for lower system costs and better control and use of computing resources, reducing the total cost of ownership.
Clustering takes several forms:
- High-availability clustering--system-managed fail-over to another node within a cluster.
- Administrative cluster--management and allocation of resources from across the cluster, but each application still runs on one node. Includes redundant resources for restarting applications.
- Application clustering--management of a specific application across a cluster through tight integration with the application's API (e.g., SAP R/3). An application cluster does not include parallel databases.
- Scalability clustering--a specific workload is spread across multiple nodes with the use of system functions, usually using a parallel database.
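As an illustration of the high-availability case in the list above, the following sketch restarts a failed node's applications on the surviving nodes. The placement policy (send each application to the survivor running the fewest applications) and all node and application names are assumptions made for the example, not a description of any vendor's cluster software.

```python
from typing import Dict, List


def fail_over(placement: Dict[str, List[str]], failed: str) -> Dict[str, List[str]]:
    """Return a new placement with the failed node's applications moved to survivors.

    Each orphaned application goes to the surviving node that currently runs
    the fewest applications; ties are broken by node name.
    """
    survivors = {node: list(apps) for node, apps in placement.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving node to fail over to")
    for app in placement.get(failed, []):
        target = min(survivors, key=lambda node: (len(survivors[node]), node))
        survivors[target].append(app)
    return survivors


cluster = {"node-a": ["mail", "web"], "node-b": ["db"], "node-c": []}
after = fail_over(cluster, "node-a")
print(after)
```

Because every node in the cluster can reach the shared disks, the restarted applications can find their data on whichever node they land; the sketch models only the placement decision.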
Industry consultants are advising users to co-locate servers and disks at each business site, to switch to serial-based disk technologies such as Fibre Channel, to organize clusters by workload type, to start with administrative clustering, and to implement automated procedures that can move workloads from one node to another.
SANs vs. LANs
Traditional networks use servers running network protocols to transfer data to clients. In a traditional LAN, each client and server transmits data via a "network stack"--a complex set of communications instructions that must be followed for data packets to be properly sent and received. Executing these instructions consumes substantial network bandwidth, CPU cycles, and memory.
This type of system is adequate for standard business settings where the data requests are small (averaging less than 200KB per document). However, if the data transfers are large, a client/server network is often inadequate and prone to bottlenecks.
In these instances, SANs are helpful. The design of the SAN eliminates traditional bottlenecks, including the server and network stack, and replaces them with software and hardware that allow direct access via Fibre Channel. Clustered servers in a SAN environment scale past the "client-side" bottleneck by distributing applications among multiple servers, and they reduce the total cost of disk administration.
Framingham, MA-based International Data Corporation reports that clustering allows disks to be centralized via SAN technology, reducing the cost of managing storage from 55% to 15% of the total cost of the hardware.
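To put those percentages in dollar terms, here is a small worked example. The $100,000 hardware figure is purely hypothetical and is not taken from the IDC report; only the 55% and 15% shares come from the text.

```python
def management_cost(hardware_cost: float, percent_of_hardware: float) -> float:
    """Storage management cost, expressed as a percentage of hardware cost."""
    return hardware_cost * percent_of_hardware / 100


hardware = 100_000.0  # hypothetical hardware spend, in dollars
before = management_cost(hardware, 55)  # without centralized storage
after_san = management_cost(hardware, 15)  # with disks centralized on a SAN
savings = before - after_san
print(before, after_san, savings)
```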
Fibre Channel is a serial I/O and networking protocol. Today, data can be transmitted at 1.062Gbps over Fibre Channel, and that number is expected to quadruple over the next few years.
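A rough calculation shows what these rates mean for bulk data movement. The sketch below estimates how long one terabyte would take at the quoted line rate and at four times that rate. The 1TB size is an arbitrary example, and the 0.8 efficiency factor is an assumption reflecting the 8b/10b encoding used by 1Gbps Fibre Channel; other protocol overhead is ignored, so real transfers would take longer.

```python
LINE_RATE_GBPS = 1.062      # line rate quoted in the text
ENCODING_EFFICIENCY = 0.8   # assumption: 8b/10b encoding, 8 data bits per 10 line bits


def transfer_seconds(size_bytes: int, line_rate_gbps: float, efficiency: float = 1.0) -> float:
    """Seconds needed to move size_bytes at the given line rate and efficiency."""
    bits = size_bytes * 8
    return bits / (line_rate_gbps * 1e9 * efficiency)


one_tb = 10**12  # one terabyte, decimal
raw = transfer_seconds(one_tb, LINE_RATE_GBPS)
encoded = transfer_seconds(one_tb, LINE_RATE_GBPS, ENCODING_EFFICIENCY)
quadrupled = transfer_seconds(one_tb, LINE_RATE_GBPS * 4)

print(f"raw line rate:   {raw / 3600:.2f} hours")
print(f"after 8b/10b:    {encoded / 3600:.2f} hours")
print(f"quadrupled rate: {quadrupled / 60:.1f} minutes")
```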
Fibre Channel offers very high bandwidth and very low overhead and latency, using very large block sizes to transfer data from disk with delivery guarantee at the hardware level. With Fibre Channel, connectivity between massed storage and "multiple initiators" (multiple hosts directly accessing the storage) is easy.
The emerging availability of Fibre Channel hardware is driving the development of the cluster/SAN environment.
A SAN design includes Fibre Channel hardware, SAN management software, and configuration/integration processes (such as data paths and backup procedures). SAN management requirements fit into three general areas:
- Systems control refers to operations that enable IS managers to modify the state of the SAN resources. Examples include the assignment of LUNs or security privileges and the configuration of storage resources. Systems control must be able to address heterogeneous server nodes on the network, and it must be secure. It must also provide a single system image, allowing administrators to view and manage all storage and nodes from a single point. Lastly, it must be possible to add and remove nodes without disrupting the network.
- Systems monitoring refers to the ability to monitor the state of the SAN (e.g., software and hardware inventories, capacity, use, and performance) and to receive notifications and alerts. Examples include the ability to understand the relationships between storage subsystems, server nodes, fabrics, and the allocation of assets within the network. Like systems control, systems monitoring requires the SAN to be seen as a single system image.
- Systems servicing refers to activities for fixing problems or for performing preventative maintenance (e.g., diagnosing hardware problems on components within the SAN).
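The monitoring requirement above can be sketched as a small inventory that produces both a single aggregated view and capacity alerts. The class name, the 90% threshold, and the array names and sizes are hypothetical choices for illustration; real SAN management software would also track fabrics, server nodes, and performance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StorageSubsystem:
    name: str
    capacity_gb: int
    used_gb: int

    @property
    def utilization(self) -> float:
        return self.used_gb / self.capacity_gb


def single_system_view(subsystems: List[StorageSubsystem]) -> Dict[str, int]:
    """Aggregate every subsystem into one set of totals (the 'single system image')."""
    total = sum(s.capacity_gb for s in subsystems)
    used = sum(s.used_gb for s in subsystems)
    return {"capacity_gb": total, "used_gb": used, "free_gb": total - used}


def capacity_alerts(subsystems: List[StorageSubsystem], threshold: float = 0.9) -> List[str]:
    """Return one alert string per subsystem at or above the utilization threshold."""
    return [
        f"{s.name}: {s.utilization:.0%} full"
        for s in subsystems
        if s.utilization >= threshold
    ]


inventory = [
    StorageSubsystem("array-1", capacity_gb=500, used_gb=480),
    StorageSubsystem("array-2", capacity_gb=1000, used_gb=300),
]
view = single_system_view(inventory)
alerts = capacity_alerts(inventory)
print(view)
print(alerts)
```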
Figure: In a traditional client/server network, data storage is isolated "behind" the servers.
Figure: In a storage area network, data storage is allocated among servers.
Robin Deshayes is vice president of marketing at Transoft Networks, in Santa Barbara, CA.