The exponential increase in drive density, the incorporation of performance acceleration technologies, and the adoption of data reduction tools such as compression and deduplication are changing the perceived and actual cost of storage systems. Traditional purchasing decisions made on a dollar-per-raw-gigabyte basis may lead buyers to spend more than they expect, because they are buying capacity alone without considering other necessary attributes: reliability, security, efficiency, performance, and manageability. When buying a storage system, consider the following:
–The cost per raw gigabyte is a poor metric when purchasing a system to support performance-driven applications. It is imperative that the organization understands which applications will be hosted on the storage array and what their performance and capacity requirements will be over time. Understanding the environment ahead of time makes soliciting and evaluating options simpler.
–Performance acceleration technologies may have a direct impact on the ability of the storage system to deliver higher levels of IOPS or throughput. Technologies such as compression or deduplication may have a similar impact on capacity and its utilization. These benefits must be weighed against any additional costs of deploying them and the results that can realistically be expected from the actual applications and data in use.
–Advantages of acceleration and capacity optimization technologies may come at a cost in complexity that reduces a storage system's manageability. Added complexity or reduced manageability over time may lead to unexpected costs in people, time, productivity, and competitiveness.
The storage industry has been trending along Moore's predicted line of growth, with HDD capacity increasing and the cost per gigabyte decreasing. Since 2000, average HDD capacity growth has been 36% while the cost per gigabyte has declined at an average of 34.6%. During the same period, typical seek time has improved only 5% to 10%, and HDDs still come in 5.4K, 7.2K, 10K, or 15K RPM versions. The result is drives that have increased in density from 18GB, 36GB, and 73GB at a maximum of 15K RPM in 2000 to 600GB 15K RPM drives in 2009. Managers are often forced into array layouts that waste capacity in order to spread I/O over multiple spindles.
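A minimal sketch of how those rates compound over the decade, treating the quoted figures as annualized averages (an interpretation, since the text does not state the period explicitly) and starting from an illustrative 36GB drive:

```python
# Sketch of the compounding described above. The 36% and 34.6% figures are
# treated as annual averages; starting values are illustrative only.
capacity_gb = 36.0          # a mid-range enterprise drive circa 2000
relative_cost_per_gb = 1.0  # normalized to 1.0 in 2000

for year in range(2000, 2010):
    print(f"{year}: {capacity_gb:6.0f} GB, relative $/GB {relative_cost_per_gb:.3f}")
    capacity_gb *= 1.36                   # ~36% capacity growth per year
    relative_cost_per_gb *= (1 - 0.346)   # ~34.6% cost decline per year
```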
The need for speed
Not all workloads or applications require high performance as measured in IOPS and throughput. Those that do, however, often represent the most mission-critical applications within an organization: applications tied to revenue-generating activities, productivity, or cost containment. Examples include applications responsible for web transactions, management of manufacturing operations, or analytics that drive timely business decisions. It is essential for an organization to know which applications fall into this category and which data requires timely access as well as security, availability, and integrity.
The best practices of the past decade have held that in order to increase the I/O and throughput performance of storage, particularly for a write-intensive application, it is necessary to increase the number of spindles being used. This methodology stems from the generally accepted per-drive performance characteristics of HDDs, summarized below and illustrated in the sizing sketch that follows the list. Note: The overall performance of a storage array depends on the sum of all its components, including the storage controller, cache, and HDDs.
–7,200 RPM HDD: 80 IOPS
–10,000 RPM HDD: 120 IOPS
–15,000 RPM HDD: 170 IOPS
–SSD: more than 10,000 IOPS
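The spindle-count arithmetic behind this practice is straightforward. The following is a minimal sizing sketch based on the per-drive figures above; it ignores the storage controller, cache, RAID write penalties, and every other real-world factor, so it illustrates the logic rather than an actual array design.

```python
# Estimate how many drives are needed to reach a target IOPS level using
# the per-drive figures listed above (raw drive IOPS only).
import math

PER_DRIVE_IOPS = {
    "7,200 RPM HDD": 80,
    "10,000 RPM HDD": 120,
    "15,000 RPM HDD": 170,
    "SSD": 10_000,  # treated as a floor; the text says "more than 10,000"
}

def spindles_needed(target_iops: int, drive_type: str) -> int:
    """Number of drives required to reach target_iops on raw drive IOPS alone."""
    return math.ceil(target_iops / PER_DRIVE_IOPS[drive_type])

# Example: a write-intensive workload that needs 10,000 IOPS
for drive in PER_DRIVE_IOPS:
    print(f"{drive}: {spindles_needed(10_000, drive)} drive(s)")
```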
The need for speed has no direct link to the need for storage capacity. In recent years, the density of drives has been increasing exponentially while drive rotation speeds haven’t changed. The result: An application that needs performance will have more capacity provisioned for it than it may require.
The reality today is that companies buying performance end up paying a capacity penalty. If the only way to evaluate the cost of a storage system is on a per-gigabyte basis, the buyer may be falling victim to an apples-and-oranges comparison. Additionally, paying for capacity while buying performance may create unexpected inefficiencies that result in unaccounted costs such as power consumption, data center floor space, data management licenses priced on raw system capacity, maintenance and support, and management overhead.
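A short sketch of that penalty, using the 15K RPM figure above and an illustrative workload that needs 10,000 IOPS but only 5TB of capacity:

```python
# Size the array for IOPS, then compare the capacity that comes along for
# the ride with what the application actually needs. Workload figures are
# illustrative, not from the text.
import math

target_iops = 10_000     # performance the application requires
required_tb = 5          # capacity the application requires
iops_per_drive = 170     # 15K RPM HDD, per the figures above
tb_per_drive = 0.6       # 600GB drive

drives_for_iops = math.ceil(target_iops / iops_per_drive)   # 59 drives
provisioned_tb = drives_for_iops * tb_per_drive             # ~35 TB

print(f"Drives needed for IOPS: {drives_for_iops}")
print(f"Capacity provisioned: {provisioned_tb:.1f} TB for a {required_tb} TB requirement")
```

On those assumptions, roughly 35TB of capacity comes along with the 59 drives needed for performance, a seven-fold surplus that still draws power, occupies floor space, and may be counted by capacity-priced software.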
Buy what you need, pay for what you get
It is easy and straightforward to compare systems based on price per capacity. System A has 20TB and costs $60K, or $3,000 per TB. System B has 30TB and costs $60K, or $2,000 per TB. By that measure, System B is less expensive. Unfortunately, the diverse workloads in today's dynamic data center demand not only capacity but IOPS, throughput, reliability, efficient power consumption, manageability, and scalability. If what the application requires is IOPS and throughput, it only makes sense to compare systems based on price per unit of performance. System A, for example, may include a performance acceleration technology such as an SSD drive with dynamic storage tiering, while System B is a straight Fibre Channel drive configuration. Since even one SSD drive can deliver more than 10K IOPS, the price per IOPS of System A would be significantly lower than that of System B.
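The comparison can be sketched on both metrics at once. The prices and capacities below come from the example above; the drive counts and per-drive IOPS are illustrative assumptions (System A as one SSD plus 15K RPM drives, System B as 15K RPM Fibre Channel drives only):

```python
# Compare the two example systems on price per TB and price per IOPS.
# IOPS totals are illustrative assumptions, not vendor figures.
systems = {
    "A": {"price": 60_000, "capacity_tb": 20, "iops": 10_000 + 30 * 170},
    "B": {"price": 60_000, "capacity_tb": 30, "iops": 40 * 170},
}

for name, s in systems.items():
    per_tb = s["price"] / s["capacity_tb"]
    per_iops = s["price"] / s["iops"]
    print(f"System {name}: ${per_tb:,.0f}/TB, ${per_iops:,.2f}/IOPS")
```

Under these assumptions System B still wins on price per TB, but System A costs less than half as much per IOPS, which is the metric that matters for a performance-driven application.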
Evaluating systems based on performance is not as simple as evaluating them based on capacity, and it is an imperfect science. A number of acceleration and capacity optimization technologies may affect the price of the overall system and the capacity and performance it can deliver. Software offerings priced on the overall capacity of the system can contribute unnecessary costs, as can support and maintenance services. Over the life of the storage system (three to five years on average), the cumulative effect of over-provisioning capacity can be significant. It is thus important for end users comparing systems to understand the absolute performance and capacity of the system at a given price point, and then consider how flexible, scalable, and resilient the system can be.
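As a rough illustration of that cumulative effect, the sketch below applies placeholder per-TB annual costs (not figures from the text) to 30TB of excess capacity over three- and five-year system lives:

```python
# Every per-TB annual figure below is a placeholder assumption used only
# to show how excess capacity compounds into ongoing costs.
excess_tb = 30  # capacity purchased beyond what the application needs
annual_cost_per_tb = {
    "power and cooling": 150,
    "floor space": 100,
    "capacity-priced software licenses": 400,
    "maintenance and support": 250,
}

yearly_cost = excess_tb * sum(annual_cost_per_tb.values())
for years in (3, 5):
    print(f"{years}-year cost of the excess capacity: ${yearly_cost * years:,.0f}")
```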