Welcome to the inaugural article of our latest feature. “Metrics and Measurements That Matter” will look at and comment on various benchmark, workload and other performance, availability, capacity, energy and economic claims, with a focus on those that involve storage and I/O. In the course of these conversations (feel free to comment and get involved so it truly is a discussion), I will also look at various metrics, measurements, tools and techniques with a focus on “So what?” In other words: “So what” do the benchmarks, workload simulations or claims mean for different audiences? “So what” was done with the benchmark or simulation to get to the results or claims? And “so what,” do the results matter? For those who want to take a step back, “Moving Beyond the Benchmark Brouhaha” is an article I wrote a while ago about benchmarks, metrics and measurements that still matter today.
There are many server, storage and I/O benchmarks and workload simulations along with associated tools that can be used for different purposes. For example, empirical data, whether it be actual usage metrics or measurements, can be collected to establish a baseline for future analysis, comparison, planning and troubleshooting purposes. Another example is workload simulation based on industry standards, de facto standards or vendor-specific benchmarks using different test suites, packages, tools or utilities.
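As a rough illustration of the baseline idea, the sketch below reduces a set of collected response-time samples to a small baseline and checks a later measurement window against it. The sample values, the function names and the 20 percent tolerance are all assumptions made for illustration; they do not come from any particular tool or benchmark.

```python
import statistics

def summarize(samples_ms):
    """Reduce raw response-time samples (milliseconds) to a small summary."""
    ordered = sorted(samples_ms)
    # Nearest-rank 95th percentile, using integer math for the ceiling of 0.95 * n.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {
        "mean": statistics.mean(ordered),
        "p95": ordered[rank - 1],
    }

def exceeds_baseline(baseline, current, tolerance=0.20):
    """List the metrics in `current` that are worse than baseline by more than `tolerance`."""
    return [name for name, value in current.items()
            if value > baseline[name] * (1 + tolerance)]

# Hypothetical samples: one window collected earlier, one collected later.
baseline_samples = [4.0, 5.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0, 5.0, 5.0]
current_samples = [5.0, 6.0, 9.0, 7.0, 6.0, 8.0, 7.0, 6.0, 7.0, 9.0]

baseline = summarize(baseline_samples)
current = summarize(current_samples)
flagged = exceeds_baseline(baseline, current)

print("baseline:", baseline)
print("current: ", current)
print("metrics beyond tolerance:", flagged)
```

In practice the samples would come from whatever collection tool your environment already uses; the point is only that a stored baseline gives later measurements something to be compared against.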
Workload simulations or benchmarks are often used for characterizing and comparing how different products, systems, solutions or services support various applications or functionality. At the risk of stating the obvious: Benchmarks, workload simulations and other measurements are also often used for marketing and specmanship purposes to compare and contrast among competitors. Some of these comparisons are practical and relevant to what your environment is doing or interested in, others make for good copy or fodder for the industry at large, and still others are simply out-of-this-world claims.
Which Storage And I/O Benchmarks Matter?
When it comes to storage and I/O benchmarks and metrics that matter, there is no right or wrong tool, simulation package, test suite or measurement. Rather, what matters is what is applicable to your specific needs and requirements. For example, if you are focused on disk storage system comparisons looking at block-based access (e.g., iSCSI, SAS, FC and FCoE) doing random IOPS, then the Storage Performance Council's SPC-1 comes to mind; for larger sequential data transfers, SPC-2 can be a fit for some environments.
The following are among the most popular benchmark options:
- DVD Store simulation suite
- Storage Performance Council (SPC)
- Transaction Processing Performance Council (TPC)
- SPEC test suite site
- Microsoft SQLIO test site
- Microsoft Exchange Solution Reviewed Program (ESRP)
Interested in file-based or NAS access (e.g., NFS and Windows CIFS)? Then SPEC SFS might be a good comparison. Other options are more application oriented, such as those for Oracle, MySQL or Microsoft SQL Server, not to mention DVD Store, which exercises an entire I/O path from server to storage. We cannot forget about the TPC suite of various simulation tools or, for Microsoft Exchange, the Microsoft Exchange Solution Reviewed Program (ESRP) site.
Tip: If you are looking for performance results for a particular storage vendor or product and cannot find them elsewhere, check out the ESRP site. You will have to do some analysis work; however, the info is there to work with, if you know how and what to do with it.
Additional Ways To Obtain Storage Benchmarks
In addition to public benchmarks like those mentioned above, results are also produced by testing services affiliated with different publication (web or print) venues. Some vendors also do extensive in-house testing using a combination of their own and public tools, such as Iometer, SPC and TPC; however, they keep the results internal. If you encounter a vendor whose results or system characteristics you would like to know more about, simply ask. The vendor might require that you go under NDA to see the results; however, if they are important to you for sizing, planning, analysis or other comparisons, then why not do what you have to do to view the information?
Note that I routinely agree to and respect NDA material involving benchmarking, simulation, test and related content for analysis and other purposes. Thus, anything you read here will be based on publicly available information.
Something that comes up in different benchmarks is tricks and techniques for optimizing to get better results than what might be seen in the real world or your environment. Two of those that can be used, in particular with SPC results, are discounted pricing (which is within SPC guidelines) and sparsely allocated storage capacity, which some call short stroking. Some vendors will use a discounted price for the configuration tested, which can give them a lower cost per IOPS, transaction or unit of bandwidth compared to competitors who use list price. This is actually good news for decision makers. A published discounted price sets the point from which you should start your negotiations, as opposed to a price you negotiate down to. The flip side is that vendors, sales staffs, and channel or VAR partners may not be happy about those discounts being published and talked about, so they should think twice before using them.
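To make the pricing effect concrete, here is a minimal sketch of the cost-per-IOPS arithmetic. The list price, the 40 percent discount and the IOPS figure are hypothetical numbers chosen only to show the effect; they are not taken from any published SPC result.

```python
def cost_per_iops(price_usd, iops):
    """Price of the tested configuration divided by the IOPS it delivered."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return price_usd / iops

# Hypothetical figures for one tested configuration.
list_price = 500_000.0
discount = 0.40            # assumed discount used in the published result
measured_iops = 100_000

at_list = cost_per_iops(list_price, measured_iops)
at_discount = cost_per_iops(list_price * (1 - discount), measured_iops)

print(f"cost per IOPS at list price:       ${at_list:.2f}")
print(f"cost per IOPS at discounted price: ${at_discount:.2f}")
```

The same system with the same measured performance looks noticeably cheaper per IOPS once the discount is applied, which is why it is worth checking which price basis each competing result uses before comparing them.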
The other common technique for optimizing benchmarks such as SPC is what the industry generally calls short stroking, or an underutilized storage capacity configuration. For random activity, the less distance the hard disk drive (HDD) heads have to travel, that is, the narrower the band of tracks they have to cover, the better the performance. Likewise, if you can stay on the outer tracks (which hold more data per revolution) for sequential operations, in theory you could see better performance. Thus, with benchmarking the trick can be to not use all of the available storage capacity, simply to gain the benefit of having multiple HDD heads doing I/O.
To be fair, classic short stroking was implemented in firmware or microcode, either in the drive itself or in the storage controller. This meant that, as an example, a 500GB HDD would appear as only a 125GB HDD to the storage controller, system or appliance. Today, the trick is to use larger HDDs and then allocate only a percentage of the available space. Another trick with storage systems that can do placement is to force data to inner or outer tracks. This can be fine if you as a customer have that need and can do so in a cost-effective manner for your own applications.
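The capacity side of that trade-off is simple arithmetic, sketched below. The 500GB drive and the 25 percent allocation mirror the example above; the 10TB capacity requirement and the function name are assumptions for illustration, and RAID or sparing overhead is ignored.

```python
import math

def drives_needed(required_gb, drive_gb, allocated_fraction):
    """Number of drives needed when only part of each drive is allocated."""
    usable_per_drive = drive_gb * allocated_fraction
    return math.ceil(required_gb / usable_per_drive)

drive_gb = 500
required_gb = 10_000       # hypothetical capacity requirement (10TB)

fully_used = drives_needed(required_gb, drive_gb, 1.0)
short_stroked = drives_needed(required_gb, drive_gb, 0.25)   # 125GB used per drive

print("drives when fully allocated:", fully_used)
print("drives when 25% allocated:  ", short_stroked)
```

Four times as many drives means four times as many heads servicing I/O for the same amount of data, which helps a benchmark number but also means paying for capacity that is never used.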
Current-generation short stroking simply means having a large-capacity HDD that is part of a RAID group or stripe set for performance, while keeping the amount of space used small to boost what is known as locality of reference. Note that locality of reference has another benefit: making better use of cache to boost effectiveness vs. utilization. What this means is that for a benchmark, it is not about utilization. Rather, it is about the effectiveness of the cache, HDDs, controllers and other components in meeting specific test objectives.
The ironic thing about boosting performance by using more HDDs that are only partially utilized is that it actually justifies the need for SSD-based storage capacity in those types of storage systems. This blog posting that I wrote explains why SSD in storage arrays, systems and appliances can be a good idea.
Needless to say, when it comes to metrics and measurements that matter along with benchmarks, workload simulations and test tools, there will be plenty to discuss.