Storage Performance Benchmarking with FIO

Flexible IO Tester (FIO) is an open-source synthetic benchmark tool originally developed by Jens Axboe.  FIO can generate a wide variety of IO workloads: sequential reads or random writes, synchronous or asynchronous, all based on the options the user provides.  Its extensive set of global options makes it possible to define many different types of workloads.  FIO is one of the easiest and most versatile tools for quickly running IO performance tests against a storage system, and it allows you to simulate different types of IO load and tweak many parameters, including the read/write mix and the number of processes.  I’ll likely make a few additional posts on some of the other storage benchmarking tools I’ve used, but I’m focusing on FIO for this post.  Why FIO?  It’s a great tool, and its pros outweigh its cons for me.

Pros

  • It has a batch mode and a very extensive set of parameters.
  • Unlike IOMeter, it is still being actively developed.
  • It has multi-OS support.
  • It’s free.

Cons

  • It is CLI only; there is no GUI or graphing capability.
  • It has a rather complex syntax and it takes some time to get the hang of it.

Download and Installation

FIO can be run from either Linux or Windows, although Windows first requires an installation of Cygwin.  FIO works on Linux, Solaris, AIX, HP-UX, OSX, NetBSD, OpenBSD, Windows, FreeBSD, and DragonFly.  Some features and options are only available on certain platforms, typically because they only apply to that platform (like the solarisaio engine, or the splice engine on Linux).  Note that you can check GitHub for the latest version before you get started.

You can run the following commands from a Linux server to download and install the FIO package:

cd /root

yum install -y make gcc libaio-devel || ( apt-get update && apt-get install -y make gcc libaio-dev  </dev/null )

wget https://github.com/Crowd9/Benchmark/raw/master/fio-2.0.9.tar.gz ; tar xf fio*

cd fio*

make
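Assuming the compile completes cleanly, a quick sanity check from the build directory confirms the binary runs and shows which version you built:

./fio --version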

How to compile FIO on 64-bit Windows:

Install Cygwin (http://www.cygwin.com/). Install make and all packages starting with mingw64-i686 and mingw64-x86_64.

Open the Cygwin Terminal.

Go to the fio directory (source files).

Run make clean && make -j.

To build fio on 32-bit Windows, run ./configure --build-32bit-win before make.

FIO Cheat sheet

With FIO compiled, we can now run some tests.  For reference, I’ll start off with some basic commands for simulating different types of workloads.

Sequential Reads – Async mode – 8K block size – Direct IO – 100% Reads

fio --name=seqread --rw=read --direct=1 --ioengine=libaio --bs=8k --numjobs=8 --size=1G --runtime=600  --group_reporting

Sequential Writes – Async mode – 32K block size – Direct IO – 100% Writes

fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=32k --numjobs=4 --size=2G --runtime=600 --group_reporting

Random Reads – Async mode – 8K block size – Direct IO – 100% Reads

fio --name=randread --rw=randread --direct=1 --ioengine=libaio --bs=8k --numjobs=16 --size=1G --runtime=600 --group_reporting

Random Writes – Async mode – 64K block size – Direct IO – 100% Writes

fio --name=randwrite --rw=randwrite --direct=1 --ioengine=libaio --bs=64k --numjobs=8 --size=512m --runtime=600 --group_reporting

Random Read/Writes – Async mode – 16K block size – Direct IO – 90% Reads/10% Writes

fio --name=randrw --rw=randrw --direct=1 --ioengine=libaio --bs=16k --numjobs=8 --rwmixread=90 --size=1G --runtime=600 --group_reporting
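If you prefer keeping test definitions in a file rather than on the command line, the same kind of workload can be expressed as an fio job file.  Here’s a minimal sketch mirroring the 90/10 mixed workload above (the file name randrw-16k.fio is just an example):

[global]
ioengine=libaio
direct=1
runtime=600
group_reporting

[randrw-16k]
rw=randrw
rwmixread=90
bs=16k
numjobs=8
size=1G

Run it by passing the file to fio:

fio randrw-16k.fio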

Host Considerations

To avoid IOs being served out of the host system cache, use the --direct option, which reads and writes directly to the disk.  Use Linux native asynchronous IO by setting the --ioengine directive to libaio.  When FIO is launched, it creates a file with the name provided in --name, sized according to --size, and issues IO at the block size given in --bs.  If --numjobs is provided, it creates the files in the format name.n.0, where n ranges from 0 to --numjobs minus 1.

--numjobs = The more jobs, the higher the potential performance, subject to resource availability.  If your server is limited on resources (TCP or FC bandwidth), I’d recommend running FIO across multiple servers to push a higher workload to the storage array.
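Also note that FIO lays its test files out in the current working directory unless you point it elsewhere, so either cd to the filesystem you want to test or target it explicitly with --directory.  A quick sketch (the /mnt/testlun mount point is just an example path):

# target a specific mount point; with numjobs=4, fio lays out randread.0.0 through randread.3.0 under /mnt/testlun
fio --name=randread --directory=/mnt/testlun --rw=randread --direct=1 --ioengine=libaio --bs=8k --numjobs=4 --size=1G --runtime=300 --group_reporting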

Block Size Matters

Many storage vendors will advertise performance benchmarks based on 4K block sizes, which can artificially inflate the total IO number the array appears capable of handling.  In my professional experience with the workloads I’ve supported, the most popular read size is between 32KB and 64KB and the most popular write size is between 8KB and 32KB.  VMware-heavy environments may skew a bit lower in read block size.  Read IO is typically more common than write IO, at a rate of around 3:1.  It’s important to know the characteristics of your workload before you begin testing, as we need to look at IO size as a weight attached to the IO.  An IO of size 64KB has a weight 8 times higher than an IO of size 8KB, since it moves 8 times as many bytes, and a 256K block carries 64 times the payload of a 4K block.  Both examples take substantially more effort from every component of the storage stack to satisfy the IO request.

Applications and the operating systems they run on generate a wide, ever-changing mix of block sizes based on the characteristics of the application and the workloads being serviced.  Reads and writes are often delivered using different block sizes as well.  Block size has a significant impact on the latency your applications see.

Try to understand the IO size distributions of your workload and use those IO size modalities when you develop your FIO test commands. If a single IO size is a requirement for a quick rule-of-thumb comparison, then 32KB has been a pretty reasonable number for me to use, as it is a logical convergence of the weighted IO size distribution of most of the shared workload arrays I’ve supported. Your mileage may vary, of course.
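If you want a single run to reflect a multimodal IO size distribution rather than one fixed block size, fio’s bssplit option lets you weight several block sizes within one job.  A sketch using an assumed 25/50/25 split across 8K, 32K, and 64K blocks:

fio --name=mixedbs --rw=randrw --rwmixread=75 --direct=1 --ioengine=libaio --bssplit=8k/25:32k/50:64k/25 --numjobs=8 --size=1G --runtime=600 --group_reporting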

Because block sizes have different effects on different storage systems, visibility into this metric is critical. The storage fabric, the protocol, the processing overhead on the HBAs, the switches, the storage controllers, and the storage media are all affected by it.

General Tips on Testing

Work on large datasets.  Your dataset should be at least double the amount of RAM in the OS.  For example, if the OS has 16GB of RAM, test with 32GB datasets, multiplied by the number of CPU cores.
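As a quick sketch of the sizing math: on a host with 16GB of RAM, spreading --size=4G across 8 jobs yields a 32GB aggregate dataset, comfortably larger than what the OS can cache:

fio --name=bigset --rw=randread --direct=1 --ioengine=libaio --bs=8k --numjobs=8 --size=4G --runtime=600 --group_reporting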

The Rule of Thumb:  75/25.  Although it really depends on your workloads, typically the rule of thumb is that there are 25% writes and 75% reads on the dataset.

Test from small to large blocks of I/O.  Consider testing from small blocks of I/O up to large blocks of I/O in the following order: 512 bytes, 4K, 16K, 64K, 1MB, to get measurements that can be visualized as a histogram.  This makes the results easier to interpret.
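A simple way to run that sweep is a shell loop over the block sizes (a sketch; adjust the job parameters and target directory to your environment):

for bs in 512 4k 16k 64k 1m; do
    fio --name=bs-sweep-${bs} --rw=randread --direct=1 --ioengine=libaio --bs=${bs} --numjobs=4 --size=1G --runtime=120 --group_reporting
done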

Test multiple workload patterns.  Not everything is sequential read/write. Test all scenarios: read / write, write only, read only, random read / random write, random read only, and random write only.
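Those patterns can be swept the same way by looping over fio’s --rw modes (a sketch; rw here is fio’s mixed sequential read/write mode):

for pattern in read write rw randread randwrite randrw; do
    fio --name=${pattern}-test --rw=${pattern} --direct=1 --ioengine=libaio --bs=16k --numjobs=4 --size=1G --runtime=120 --group_reporting
done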

Sample Output

Here’s a sample command string for FIO that includes many of the switches you’ll want to use.  Each parameter can be tweaked for your specific environment.  It creates 8 files (numjobs=8), each 512MB in size (size=512m), at a 64K block size (bs=64k), and performs random reads and writes (rw=randrw) with a mixed workload of 70% reads and 30% writes (rwmixread=70).  The job runs for a full 5 minutes (runtime=300 and time_based), even if the files have already been created and fully read/written.

[root@server1 fio]# fio --name=randrw --ioengine=libaio --iodepth=1 --rw=randrw --bs=64k --direct=1 --size=512m --numjobs=8 --runtime=300 --group_reporting --time_based --rwmixread=70

Output:

 randrw: (g=0): rw=randrw, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=1
 Starting 8 processes

 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 randrw: Laying out IO file(s) (1 file(s) / 512MB)
 Jobs: 8 (f=8): [mmmmmmmm] [2.0% done] [252.0MB/121.3MB/0KB /s] [4032/1940/0 iops] [eta 04m:55s]
randrw: (groupid=0, jobs=8): err= 0: pid=31900: Mon Jun 13 01:01:08 2016
 read : io=78815MB, bw=269020KB/s, iops=4203, runt=300002msec
 slat (usec): min=6, max=173, avg= 9.99, stdev= 3.63
 clat (usec): min=430, max=23909, avg=1023.31, stdev=273.66
 lat (usec): min=447, max=23917, avg=1033.46, stdev=273.78
 clat percentiles (usec):
 | 1.00th=[ 684], 5.00th=[ 796], 10.00th=[ 836], 20.00th=[ 892],
 | 30.00th=[ 932], 40.00th=[ 964], 50.00th=[ 996], 60.00th=[ 1032],
 | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1208], 95.00th=[ 1288],
 | 99.00th=[ 1560], 99.50th=[ 2256], 99.90th=[ 3184], 99.95th=[ 3408],
 | 99.99th=[13888]
 bw (KB /s): min=28288, max=39217, per=12.49%, avg=33596.69, stdev=1709.09
 write: io=33899MB, bw=115709KB/s, iops=1807, runt=300002msec
 slat (usec): min=7, max=140, avg=11.42, stdev= 3.96
 clat (usec): min=1246, max=24744, avg=2004.11, stdev=333.23
 lat (usec): min=1256, max=24753, avg=2015.69, stdev=333.36
 clat percentiles (usec):
 | 1.00th=[ 1576], 5.00th=[ 1688], 10.00th=[ 1752], 20.00th=[ 1816],
 | 30.00th=[ 1880], 40.00th=[ 1928], 50.00th=[ 1976], 60.00th=[ 2040],
 | 70.00th=[ 2096], 80.00th=[ 2160], 90.00th=[ 2256], 95.00th=[ 2352],
 | 99.00th=[ 2576], 99.50th=[ 2736], 99.90th=[ 4256], 99.95th=[ 4832],
 | 99.99th=[16768]
 bw (KB /s): min=11776, max=16896, per=12.53%, avg=14499.30, stdev=907.78
 lat (usec) : 500=0.01%, 750=1.61%, 1000=33.71%
 lat (msec) : 2=50.35%, 4=14.27%, 10=0.04%, 20=0.02%, 50=0.01%
 cpu : usr=0.46%, sys=1.60%, ctx=1804510, majf=0, minf=196
 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued : total=r=1261042/w=542389/d=0, short=r=0/w=0/d=0
 latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
 READ: io=78815MB, aggrb=269020KB/s, minb=269020KB/s, maxb=269020KB/s, mint=300002msec, maxt=300002msec
 WRITE: io=33899MB, aggrb=115708KB/s, minb=115708KB/s, maxb=115708KB/s, mint=300002msec, maxt=300002msec

Additional Samples

I’ll run through an additional set of simple examples of using FIO as well using different workload patterns.

Random read/write performance

If you want to compare disk performance with a simple 3:1 4K read/write test, use the following command:

./fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

This command string creates a 4 GB file and performs 4KB reads and writes using a 75%/25% split within the file, with 64 operations in flight at a time. The 3:1 read/write ratio approximates a typical database workload.

The output is below; the key IOPS numbers are on the read and write lines.

Jobs: 1 (f=1): [m] [100.0% done] [43496K/14671K /s] [10.9K/3667 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=31214: Fri May 9 16:01:53 2014
read : io=3071.1MB, bw=39492KB/s, iops=8993 , runt= 79653msec
write: io=1024.7MB, bw=13165KB/s, iops=2394 , runt= 79653msec
cpu : usr=16.26%, sys=71.94%, ctx=25916, majf=0, minf=25
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=786416/w=262160/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=3071.1MB, aggrb=39492KB/s, minb=39492KB/s, maxb=39492KB/s, mint=79653msec, maxt=79653msec
WRITE: io=1024.7MB, aggrb=13165KB/s, minb=13165KB/s, maxb=13165KB/s, mint=79653msec, maxt=79653msec
Disk stats (read/write):
vda: ios=786003/262081, merge=0/22, ticks=3883392/667236, in_queue=4550412, util=99.97%

This test shows the array performed 8,993 read operations per second and 2,394 write operations per second.

Random read performance

To measure random reads, we’ll change the FIO command a bit:

./fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randread

Output:

Jobs: 1 (f=1): [r] [100.0% done] [62135K/0K /s] [15.6K/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=31181: Fri May 9 15:38:57 2014
read : io=1024.0MB, bw=62748KB/s, iops=19932 , runt= 16711msec
cpu : usr=5.94%, sys=90.13%, ctx=1885, majf=0, minf=89
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=262144/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=1024.0MB, aggrb=62747KB/s, minb=62747KB/s, maxb=62747KB/s, mint=16711msec, maxt=16711msec
Disk stats (read/write):
vda: ios=259063/2, merge=0/1, ticks=951356/20, in_queue=951308, util=96.83%

This test shows the storage array performing 19,932 read operations per second.

Random write performance

Modify the FIO command slightly to use randwrite instead of randread for the random write test.

./fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randwrite

Output:

Jobs: 1 (f=1): [w] [100.0% done] [0K/26326K /s] [0 /6581 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=31235: Fri May 9 16:16:21 2014
write: io=1024.0MB, bw=29195KB/s, iops=5434, runt= 35916msec
cpu : usr=77.42%, sys=13.74%, ctx=2306, majf=0, minf=24
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=29195KB/s, minb=29195KB/s, maxb=29195KB/s, mint=35916msec, maxt=35916msec
Disk stats (read/write):
vda: ios=0/260938, merge=0/11, ticks=0/2315104, in_queue=2316372, util=98.87%

This test shows the storage scoring 5,434 write operations per second.


XtremIO Manual Log File Collection Procedure

If you need to gather XtremIO logs for EMC to analyze and they are unable to connect via ESRS, you can collect them manually.  Below are the steps on how to do it.

1. Log in to the XtremIO Management System (XMS) GUI interface with the ‘admin‘ user account.

2. Click on the ‘Administration‘ tab, which is on the top of the XtremIO Management System (XMS) GUI banner bar.

3. On the left side of the Administration window, choose the ‘CLI Terminal‘ option.

4. Once you have the CLI terminal up, enter the following CLI command at the ‘xmcli (admin)>‘ prompt to generate a set of XtremIO dossier log files: create-debug-info.  Note that it may take a little while to complete.  Once the command completes and returns you to the ‘xmcli (admin)>’ prompt, a complete package of XtremIO dossier log files will be available for you to download.

Example:

xmcli (admin)> create-debug-info
The process may take a while. Please do not interrupt.
Debug info collected and could be accessed via http://<XMS IP Address>/XtremApp/DebugInfo/104dd1a0b9f56adf7f0921d2f154329a.tar.xz

Important Note: If you have more than one cluster managed by the XMS server, you will need to select the specific cluster.

xmcli (e012345)> show-clusters

Cluster-Name Index State  Gates-Open Conn-State Num-of-Vols Num-of-Internal-Volumes Vol-Size UD-SSD-Space Logical-Space-In-Use UD-SSD-Space-In-Use Total-Writes Total-Reads Stop-Reason Size-and-Capacity
XIO-0881     1     active True       connected  253         0                       60.550T  90.959T      19.990T              9.944T              442.703T     150.288T    none        4X20TB
XIO-0782     2     active True       connected  225         0                       63.115T  90.959T      20.993T              9.944T              207.608T     763.359T    none        4X20TB
XIO-0355     3     active True       connected  6           0                       2.395T   41.111T      1.175T               253.995G            6.251T       1.744T      none        2X40TB

xmcli (e012345)> create-debug-info cluster-id=3

5. Once the ‘create-debug-info‘ command completes, you can use a web browser to navigate to the HTTP address link that’s provided in the terminal session window.  After navigating to the link, you’ll be presented with a pop-up window asking you to save the log file package to your local machine.  Save the log file package to your local machine for later upload.

6. Attach the XtremIO dossier log file package you downloaded to the EMC Service Request (SR) you currently have open or are in the process of opening.  Use the ‘Attachments’ (the paperclip button) area located on the Service Request page for upload.

7. You also have the ability to view a historical listing of all XtremIO dossier log file packages that are currently available on your system. To view them, issue the following XtremIO CLI command: show-debug-info. A series of log file packages will be listed.  It’s possible EMC may request a historical log file package for baseline information when troubleshooting.  To download, simply reference the HTTP links listed under the ‘Output-Path‘ header and input the address into your web browser’s address bar to start the download.

Example:

xmcli (tech)> show-debug-info
 Name  Index  System-Name   Index   Debug-Level   Start-Time                 Create-Time                Output-Path
       1      XtremIO-SVT   1       medium        Mon Aug 14 15:55:10 2017   Mon Aug 14 16:09:40 2017   http://<XMS IP Address>/XtremApp/DebugInfo/1aaf4b1acd88433e9aca5b022b5bc43f.tar.xz
       2      XtremIO-SVT   1       medium        Mon Aug 14 15:55:10 2017   Mon Aug 14 16:09:40 2017   http://<XMS IP Address>/XtremApp/DebugInfo/af5001f0f9e75fdd9c0784c3d742531f.tar.xz
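If you’d rather not use a browser, any HTTP client can pull the package down instead; for example, with wget (substituting your XMS address and the Output-Path from your own listing):

wget "http://<XMS IP Address>/XtremApp/DebugInfo/1aaf4b1acd88433e9aca5b022b5bc43f.tar.xz"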

That’s it! It’s a fairly straightforward process.