What’s The Best Storage Benchmark?

Posted on October 12, 2015 By Greg Schulz


So what is the best storage benchmark?

Is it Aerospike for NoSQL, AS3AP, ATTO disk bench, Benchmark Factory, Blackmagic, blktrace, Bonnie or perhaps Intel COSBench for object storage?

How about Ceph scbench, Crystal Disk Mark, dd (yes, basic dd for sequential reads and writes), Dedisbench (for testing dedupe), DVDstore, EMC All Flash Array (AFA) workload script (using Vdbench), Filebench, FIO or Hadoop/HDFS based dfsio, teragen and terasort?

What about HammerDB, HCIbench (which leverages Vdbench for VMware converged CI/HCI/CiB environments), Iometer, IOR, Iorate, Iozone and Iperf, along with Login VSI, MDTEST, or Microsoft ESRP, Diskspd, Jet or PCS?

Then there are Netmist, Oracle (Dtrace, Orion, SLOB, Vdbench), PCmark, SNIA Emerald and hotband workload scripts for Vdbench, SQLIO, SPC, SPEC, swift-bench, Sysbench, various TPC (B, C, D, E, H etc) and VMware VMmark among other tools.

Some tools are free or at least have a free community version. Others charge a fee for access to the full-function version, while some are available only to partners and others are internal or private.

The best storage, or server, or network, or hardware, or software or cloud benchmark tool is the one that matters and is relevant to your environment and application, or area of focus. If you can’t run your actual applications at scale, then what can you do to resemble that as closely as possible? Likewise, the best and most important metrics are those that are relevant to what you are doing.

In short, what is your focus (what are you testing)?

Is your focus holistic, a converged and composite view of an entire solution? Or is it centered on a component or particular layer in the server storage I/O stack? Perhaps your focus is how storage interacts and performs with I/O networks, servers, drivers and various layers of software, including operating systems, hypervisors, file systems, databases and related workloads?

Other focus areas include:

·   Is your focus on just storage, such as a storage system, component or device?

·   Or is your focus on how a storage system or device behaves with different applications?

·   Your focus might be on the entire solution, whether converged or non-converged.

·   Are you approaching this as a storage (hardware or software), server, network or database person?

What’s your objective (why are you testing)?

·   Design and verification testing (DVT)

·   Devices or solution stress testing

·   Application integration and verification

·   Validate vendor or marketing claims

·   Competitive bakeoffs or comparisons

·   Interoperability and compatibility

·   Resiliency and error handling

·   Gain insight, awareness, learn how things work

Getting back to what’s the best benchmark, the technology also ties to the technique: how the tool will be used (or abused). Some tools do a good job at some things, while others are more extensible depending on how you use or configure them. For example, using Vdbench (among others) I can simulate various workloads doing raw or cooked block and file system I/Os from one or multiple servers. Likewise, I can use Vdbench to do NAS file type I/O activity from large to small files using different scripts or configuration options.

There are many different scripts floating around for Vdbench, among other tools. For example, there are the SNIA Emerald (aka hot-band) scripts, EMC has some for testing all-flash arrays (AFA), and VMware recently came out with the HCIbench kit. HCIbench is an OVA that enables you to quickly download, install and run a Vdbench-powered workload against converged infrastructure (CI) and hyper-converged infrastructure (HCI).

For those who want to experiment with tools such as Vdbench, here are some simple scripts. The first is a file that gets called to create several 5GB files in a file system, then runs for 10 hours doing 90% reads (128KB sized) with the results logged to a directory.

VdbenchFSBigTest.txt

# Sample script for big files testing

fsd=fsd1,anchor=H:,depth=1,width=5,files=20,size=5G

fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=128k,fileselect=random,fileio=random,threads=64

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSBigTest.txt -m 16 -o Results_FSbig_H_060615

The second script creates many small files and then does 90% reads for 10 hours.

VdbenchFSSmallTest.txt

# Sample script for small files testing

fsd=fsd1,anchor=H:,depth=1,width=64,files=25600,size=16k

fwd=fwd1,fsd=fsd1,rdpct=90,xfersize=1k,fileselect=random,fileio=random,threads=64

rd=rd1,fwd=fwd1,fwdrate=max,format=yes,elapsed=10h,interval=30

vdbench -f VdbenchFSSmallTest.txt -m 16 -o Results_FSsmall_H_060615
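Before kicking off a 10-hour run, it helps to know how much capacity the format phase will consume. Here is a minimal Python sketch that estimates the footprint of the two fsd lines above. It rests on two assumptions not stated in the scripts: that files are created only in the lowest-level directories (so there are width to the power of depth of them), and that the k and G size suffixes are binary multiples. The function name fsd_footprint is made up for illustration.

```python
KIB = 1024
GIB = 1024 ** 3


def fsd_footprint(depth, width, files, size_bytes):
    """Return (leaf_directories, total_files, total_bytes) for one fsd line.

    Assumption: files live only in the lowest-level directories,
    and each directory level fans out by `width`.
    """
    leaf_dirs = width ** depth
    total_files = leaf_dirs * files
    return leaf_dirs, total_files, total_files * size_bytes


# Values copied from the big-file and small-file fsd lines.
big = fsd_footprint(depth=1, width=5, files=20, size_bytes=5 * GIB)
small = fsd_footprint(depth=1, width=64, files=25600, size_bytes=16 * KIB)

for name, (dirs, count, total) in (("big", big), ("small", small)):
    print(f"{name}: {dirs} directories, {count} files, {total / GIB:.1f} GiB")
```

Under those assumptions, the big-file test needs roughly 500 GiB of free space on H: and the small-file test creates about 1.6 million files totaling around 25 GiB, so size the target accordingly.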

You can also put the pieces together yourself. However, what’s your focus: are you testing the drives or storage attached to a CI or HCI, or the entire solution? If the entire solution, then why not go a step further and install MySQL, SQL Server or your favorite database, assuming you are doing database work? Then run your own application and collect results, or run something like sysbench, HammerDB or, better yet, one of my favorites, Benchmark Factory, to generate the workload. With Benchmark Factory you can run pre-defined TPC tests (different tools support different workloads) or your own workloads, including recorded traces, while getting extensive reporting (there are free and premium versions).

What about monitoring performance or activity?

There are many different tools available for collecting data and metrics that matter while a test or workload is running, as well as for gaining insight to set up a test configuration. These include those found in operating systems or hypervisors, as well as third-party and open source tools such as Colasoft Capsa, Dell (Foglight and Spotlight on Windows, SQL and others), Esxtop, Htop, Iotop, Microsoft Perfmon, SAR, Solarwinds and Visualesxtop, among many others.
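Whatever tool you pick, many of the numbers it shows are rates derived from counters sampled at an interval (like the interval=30 in the scripts above). As a rough illustration of that arithmetic, here is a small Python sketch that turns two cumulative counter readings into IOPS, bandwidth and average I/O size. The CounterSample structure, the interval_metrics function and the sample numbers are all hypothetical and not tied to any particular tool.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One reading of cumulative I/O counters (hypothetical structure)."""
    timestamp_s: float  # when the sample was taken, in seconds
    io_count: int       # total I/Os completed so far
    byte_count: int     # total bytes transferred so far


def interval_metrics(start, end):
    """Derive per-interval rates from two cumulative samples."""
    elapsed = end.timestamp_s - start.timestamp_s
    if elapsed <= 0:
        raise ValueError("end sample must be later than start sample")
    ios = end.io_count - start.io_count
    moved = end.byte_count - start.byte_count
    return {
        "iops": ios / elapsed,
        "mib_per_s": moved / elapsed / (1024 * 1024),
        "avg_io_kib": (moved / ios / 1024) if ios else 0.0,
    }


# Made-up numbers: 30,000 I/Os of 128 KiB each over a 30-second interval.
start = CounterSample(timestamp_s=0.0, io_count=0, byte_count=0)
end = CounterSample(timestamp_s=30.0, io_count=30_000,
                    byte_count=30_000 * 128 * 1024)
metrics = interval_metrics(start, end)
print(metrics)
```

Looking at IOPS, bandwidth and I/O size together gives context that any one of them lacks: a high IOPS figure means something quite different at 1KB than at 128KB.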

Look for:

·   Test results that are applicable to and repeatable for your environment.

·   Context for tests including configuration settings of hardware, software and workload.

·   Tools that you can get access to for free, full or community editions.

·   Metrics that matter providing relevant measurements and reporting.

·   Benchmarks that are relevant to your environment and applications.

·   Have different benchmark and monitoring tools in your toolbox.

·   Know what tool and technique to use when, and in conjunction with other tools.
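On the repeatability point, one simple sanity check is to run the same test several times and see how much the results vary. The Python sketch below computes the mean and coefficient of variation (standard deviation divided by mean) across runs. The 5% threshold, the function name and the sample results are assumptions for illustration only; pick a tolerance that makes sense for your environment.

```python
import statistics


def repeatability(results, max_cv=0.05):
    """Return (mean, coefficient_of_variation, is_repeatable).

    `results` holds one headline metric (IOPS, for example) per run and
    needs at least two entries. `max_cv` is an assumed tolerance.
    """
    mean = statistics.mean(results)
    cv = statistics.stdev(results) / mean
    return mean, cv, cv <= max_cv


# Made-up IOPS results from three runs of the same test.
steady_runs = [1000, 1010, 990]
noisy_runs = [1000, 1400, 600]

for label, runs in (("steady", steady_runs), ("noisy", noisy_runs)):
    mean, cv, ok = repeatability(runs)
    print(f"{label}: mean={mean:.0f} cv={cv:.1%} repeatable={ok}")
```

If the variation is large, look at the test context (configuration, cache state, background activity) before trusting or comparing the numbers.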

So, what is the best storage benchmark? The one that is relevant and matters for your environment.

Ok, nuff said, for now…

Photo courtesy of Shutterstock.

