With an in-band enclosure management protocol, the Intelligent Platform Management Interface can augment IT management in heterogeneous environments.
By Tom Brokaw and Glen Koziuk
The demand for mission-critical server/network/storage systems is growing dramatically as more companies require higher reliability, accessibility, and serviceability (RAS). At the same time, businesses also demand higher performance from their servers, extra bandwidth from their networks, and dramatically increased volumes of reliable storage. To make things worse, IT managers suffer from reduced budgets, demands for 99.999% uptime, and shortages of qualified personnel. As a result, IT managers are demanding robust management applications capable of operating in heterogeneous environments, while looking to reduce overall total cost of ownership.
Figure 1: A typical JBOD configuration. In this example, a Fibre Channel JBOD is implemented with two environmental service monitor cards, one for each arbitrated loop.
Enclosure management is used to monitor and control devices such as disk drives, fans, power supplies, and mechanical assemblies within servers and storage systems. Status and control information about these devices is communicated to the management application of the host server through a variety of in-band protocols (SCSI and Fibre Channel) or out-of-band protocols (RS-232, Ethernet, and IPMI).
An example of a typical JBOD subsystem is shown in Figure 1. The environmental service monitor card contains the hardware and firmware for supporting enclosure management. In this example, a Fibre Channel JBOD is implemented with two environmental service monitor cards, one for each arbitrated loop. These hot-swappable cards contain the circuitry and firmware required for in-band or out-of-band communication of mission-critical diagnostic information.
Usually some form of card-to-card communication (heartbeat monitor) is provided so that each management node can monitor the activity of the node on the other loop. The environmental service monitor card performs all control functions associated with the JBOD hardware, such as driving LEDs, bypassing drives, reporting power supply status, controlling locking mechanisms, checking temperature, and setting and monitoring fan speed.
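The heartbeat idea can be illustrated with a minimal sketch. The class name, the 3-second timeout, and the injectable clock are assumptions made for illustration; real environmental service monitor firmware would use its own timing and transport.

```python
import time


class HeartbeatMonitor:
    """Tracks heartbeats from the peer card and flags it as failed after a timeout."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed value, not taken from any specification
        self._clock = clock          # injectable so the logic can be exercised in tests
        self._last_seen = clock()

    def record_heartbeat(self):
        """Call whenever a heartbeat arrives from the card on the other loop."""
        self._last_seen = self._clock()

    def peer_alive(self):
        """True if the peer has been heard from within the timeout window."""
        return (self._clock() - self._last_seen) <= self.timeout_s
```

Each card would run one such monitor for its peer and raise an alarm, or take over the peer's duties, once `peer_alive()` returns false.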
Fibre Channel and SCSI do not have built-in support for enclosure management, though this is being addressed by a variety of standards and associations. So far, the result is a chaotic mix of approaches that leaves some system administrators confused. One thing is certain: IT managers are demanding enclosure management in all of the devices within their storage area network (SAN) environments, including disk arrays, servers, switches, routers, and hubs.
System administrators have the difficult choice of selecting either an in-band or out-of-band protocol, or both. In a high-availability Fibre Channel arbitrated loop environment, redundant enclosure management controllers are required, one on each loop. IT managers must balance the RAS advantages of a combined in-band/out-of-band solution with the added cost, complexity, and maintenance of supporting both schemes.
The main advantage of using in-band management is that most of the infrastructure already exists. In a SAN, host bus adapters and cabling are already available and can be used to pass management information to the host. A managed JBOD contains a microcontroller that controls internal functions and monitors status. Enclosure management software, often bundled with host bus adapters, passes this status to the host to support in-band enclosure reporting. Reusing the existing infrastructure reduces interconnect complexity and time-to-market.
There are several issues to consider with in-band enclosure management.
- In-band management systems require an additional controller with a feature set capable of connecting and operating at the in-band bus speed. The result is somewhat more complex and expensive than out-of-band approaches.
- The management controller occupies an "address" in the protocol. In a SCSI system, 1 of 16 possible addresses is allocated to the SCSI management controller. This is less of an issue with Fibre Channel because the bus supports 126 devices.
- Perhaps the most important consideration is that management communication and data transfer occur over the same interface (i.e., cable). This is not a bandwidth issue because the amount of management traffic is negligible. However, assuming a one-loop design, if the data interface fails, the system management software cannot access the system to identify the failure. In the worst case, the software might not be able to identify which system failed. IT managers need to isolate the problem to the field replaceable unit (FRU), not just to the box.
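The address cost mentioned in the second point is simple arithmetic, sketched below. The assumption that one host adapter also occupies an address on each bus is ours; the 16 and 126 figures come from the text.

```python
def data_device_slots(total_addresses, host_adapters=1, management_controllers=1):
    """Addresses left for drives after hosts and management controllers claim theirs."""
    remaining = total_addresses - host_adapters - management_controllers
    if remaining < 0:
        raise ValueError("more reserved addresses than the bus provides")
    return remaining


SCSI_ADDRESSES = 16    # from the text
FC_AL_DEVICES = 126    # from the text

scsi_slots = data_device_slots(SCSI_ADDRESSES)   # 14 addresses remain for drives
fc_slots = data_device_slots(FC_AL_DEVICES)      # 124 addresses remain for drives

# Share of the address space consumed by one management controller.
scsi_share = 1 / SCSI_ADDRESSES   # 6.25 percent
fc_share = 1 / FC_AL_DEVICES      # under 1 percent
```

Under these assumptions, a single management controller takes 6.25 percent of the SCSI address space but less than 1 percent of a Fibre Channel arbitrated loop.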
The purpose of out-of-band management is to build a redundant infrastructure that is dedicated to enclosure management communication and independent of the "in-band" data path. This allows for a high level of access to management information, regardless of the state of the individual subsystem data path. Protocol selection must balance cost, complexity, and product availability. In heterogeneous environments, an industry-standard protocol is required to integrate management communications.
Figure 2: Interaction between IPMB and ICMB. The IPMB provides inside-the-box communication between management devices and attached peripherals, while the ICMB provides the box-to-box connection between chassis.
The key advantage of out-of-band management is the high availability of management information. The primary disadvantage is the cost of building and maintaining the additional infrastructure. Although individual node cost is usually low, additional cabling and maintenance are required. This is exacerbated when subsystems are located long distances from each other. If a Fibre Channel tape backup device is located across campus, it is not practical to wire an additional cable for RS-232 in parallel with the optical data link.
In complex networks, no single management approach will satisfy all requirements. In-band management is the primary mechanism for communication across the entire network, but smaller areas of out-of-band management may provide additional RAS capabilities in specific areas.
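A combined approach can be pictured as a management application that prefers the in-band path and falls back to the out-of-band path when the data interface is down. The sketch below is illustrative only: both reader callables are hypothetical stand-ins for whatever in-band and out-of-band drivers a real system provides.

```python
def read_enclosure_status(in_band_read, out_of_band_read):
    """Try the in-band path first; fall back to the out-of-band path on failure.

    Both arguments are hypothetical callables that return a status dict
    or raise ConnectionError when their path is down.
    """
    try:
        return "in-band", in_band_read()
    except ConnectionError:
        pass  # data path is down; try the dedicated management path
    try:
        return "out-of-band", out_of_band_read()
    except ConnectionError:
        return "unreachable", None
```

With only the in-band reader available, a data-path failure ends in the "unreachable" case, which is the situation the out-of-band infrastructure is meant to avoid.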
How does IPMI help?
Although vendors have implemented sophisticated management protocols within their own systems, there is no industry-standard protocol that operates over these interfaces to manage heterogeneous systems from multiple vendors. This is the problem solved by the Intelligent Platform Management Interface (IPMI).
IPMI was designed to be an out-of-band management solution aimed primarily at servers. It is the only industry-standard out-of-band interface to include a standardized protocol for communicating management information. As such, it provides a simple hardware interface with a predefined protocol that results in inexpensive out-of-band management in cluster, SAN, LAN, and WAN environments. Although initially defined and driven by Intel, Dell, Hewlett-Packard, and NEC, many companies currently endorse IPMI.
IPMI defines three physical interfaces: IPMB, ICMB, and system. The Intelligent Platform Management Bus (IPMB) is electrically similar to the I2C bus and provides inside-the-box communication between management devices and attached peripherals. The Intelligent Chassis Management Bus (ICMB) provides the box-to-box connection between chassis at distances up to 600 feet. As many as 256 devices can share a common bus, electrically similar to RS-485, running at 19,200 baud in half-duplex mode (see Figure 2).
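To make the "predefined protocol" concrete, the sketch below assembles an IPMB request frame. The layout (responder address, network function/LUN, header checksum, requester address, sequence/LUN, command, data, final checksum) and the two's-complement checksums follow the published IPMB message format rather than anything stated in this article. The helper names and the requester address 0x82 are illustrative assumptions.

```python
def checksum(data):
    """8-bit two's-complement checksum: the bytes plus the checksum sum to 0 mod 256."""
    return (-sum(data)) & 0xFF


def build_ipmb_request(rs_addr, net_fn, rs_lun, rq_addr, rq_seq, rq_lun, cmd, payload=b""):
    """Assemble an IPMB request frame as a bytes object."""
    if not (0 <= net_fn < 64 and 0 <= rq_seq < 64):
        raise ValueError("net_fn and rq_seq are 6-bit fields")
    if not (0 <= rs_lun < 4 and 0 <= rq_lun < 4):
        raise ValueError("LUNs are 2-bit fields")
    header = bytes([rs_addr, (net_fn << 2) | rs_lun])
    body = bytes([rq_addr, (rq_seq << 2) | rq_lun, cmd]) + bytes(payload)
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


# Get Device ID (network function 0x06, command 0x01) addressed to a BMC at 0x20.
# The requester address 0x82 is a hypothetical satellite controller.
frame = build_ipmb_request(0x20, 0x06, 0, 0x82, 1, 0, 0x01)
print(frame.hex(" "))  # 20 18 c8 82 04 01 79
```

Because each checksum covers a fixed part of the frame, a receiver can reject a corrupted header before it reads the rest of the message.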
The system interface uses an IPMI Baseboard Management Controller (BMC) located in the central management server to gather enclosure management information via IPMB and ICMB and to pass the information to the system management software (SMS).
One key goal of IPMI is to provide management information that is completely independent of the server in which it resides. In this way, management information about the server is available regardless of the operational state of the server. This independence provides the highest level of availability of management information, even if the server is not powered. The BMC is normally powered separately from the rest of the server for this purpose.
IPMI in servers
An IPMI BMC is located in each server and coordinates management functions within the server, as well as communication with externally connected IPMI devices. The BMC is separate from the rest of the server functions and is normally powered continuously and independently from the server so that management communication is enabled all the time. To reduce board space and system cost, the BMC should contain all of the IPMI interfaces in a single device.
The BMC connects to the server over one of three possible system interfaces: a keyboard controller style (KCS) interface (an Intel 8742 type that includes four register sets, each one byte wide), a server management interface chip (SMIC) interface (a three-port, byte-based interface), or a higher-performance block transfer (BT) interface (with three I/O ports and a buffered scheme for transferring data).
Some form of nonvolatile memory is attached to or included within the BMC for storing firmware, a system event log, sensor data records, and information about the BMC and server. These elements make up a basic BMC, which can be augmented by vendor-specific features. BMCs from several vendors are available with firmware to support managed server clusters.
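The role of the system event log can be pictured with a simplified in-memory model. This is a sketch, not the IPMI record format: the field names, the fixed capacity, and the choice to drop new events and set an overflow flag when the log is full are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventRecord:
    record_id: int
    timestamp: int       # seconds; the time source is left to the caller
    sensor: str          # hypothetical sensor name, e.g. "fan1"
    description: str


@dataclass
class EventLog:
    capacity: int                        # stands in for the size of nonvolatile storage
    records: list = field(default_factory=list)
    overflow: bool = False
    _next_id: int = 1

    def add(self, timestamp, sensor, description):
        """Append an event; when the log is full, drop it and set the overflow flag."""
        if len(self.records) >= self.capacity:
            self.overflow = True
            return None
        record = EventRecord(self._next_id, timestamp, sensor, description)
        self._next_id += 1
        self.records.append(record)
        return record
```

Because the log lives with the BMC rather than the host, management software can read back events such as a fan failure even after the server itself has gone down.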
IPMI in SANs
Many enclosure management controllers are provided for SAN environments with direct attachment to interfaces such as Fibre Channel and SCSI. However, some controllers also include on-chip support for IPMI for out-of-band communication, fault tolerance, and redundancy.
Figure 3 depicts a demonstration in which a LAN server contains both a Fibre Channel and a SCSI host bus adapter. These cards communicate to a storage system emulated with three boards, which allows management information to be sent across Fibre Channel or SCSI. Each board supports IPMI through an ICMB port linked to a remote server.
IPMI in telecom systems
CompactPCI has become a popular interface in telecom equipment. IPMI is the management interface selected by the PICMG consortium for CompactPCI designs.
IPMI-based Baseboard Management Controllers can be located on CompactPCI CPU cards, while IPMI Enclosure Management Controllers, used as satellite devices, can be located on CompactPCI peripheral cards (Eurocards). This allows for internal monitoring and communication between cards. External access via ICMB allows telecom equipment to be seamlessly linked with a server cluster and SAN to provide one universal management network.
IPMI is an industry-standard interface and protocol for out-of-band enclosure services, aimed at high-availability server clusters, SANs, LANs, and WANs. IPMI is meant to complement, not replace, existing in-band management standards. IT managers should consider requiring IPMI as a standard interface on systems purchased for these applications. As the desire for management crosses boundaries between vendors and networks, IPMI is positioned to meet the requirements of server, storage, LAN, and WAN environments.
Tom Brokaw is a senior product marketing manager and Glen Koziuk is a senior applications engineer in the SAN products group at Vitesse Semiconductor Corp. (www.vitesse.com), headquartered in Camarillo, CA.
For more information about IPMI
As the primary promoter of IPMI, Intel maintains a Web site containing background information and all specifications relating to IPMI: http://developer.intel.com/design/servers/ipmi.