Hard disk drives have grown larger and larger over the years, but their rotational speeds have stayed at nearly the same level for decades. This has led to an odd trait: sequential transfer rates have improved greatly, while random input/output performance has barely moved.
The reason lies in the physical construction of hard disk drives: to read (or write) a random position on the magnetic platters, the drive has to move its heads to the right track and then wait until the requested sector rotates underneath them. Typical mean access times are in the range of 5 ms to 15 ms, which limits a single drive to roughly 50-150 random input/output operations per second (IOPS).
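Ignoring command overhead and transfer time, the reciprocal of the mean access time gives a rough upper bound on the random IOPS a single drive can deliver. A quick back-of-the-envelope calculation in Python:

# Rough ceiling on random IOPS for a single drive: one request cannot
# complete faster than the mean access time (seek + rotational delay).
# Command overhead and data transfer push real numbers below this ceiling.
for access_time_ms in (5.0, 10.0, 15.0):
    max_iops = 1000.0 / access_time_ms
    print("%4.1f ms mean access time -> at most ~%.0f IOPS" % (access_time_ms, max_iops))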
In practice, several measures help to deal with this constraint: modern hard drives use native command queuing (NCQ) to optimize seek order, disk arrays (RAID) spread the I/O load over multiple spindles, various caching strategies reduce the number of I/O operations that actually reach the drive, and read-ahead/prefetching tries to load data before it is requested.
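On Linux, some of these knobs can be inspected through sysfs. The snippet below is just a convenience sketch: sda is a placeholder for a single member disk, and not every path exists for every device type (a software RAID device like md1, for instance, has no NCQ queue depth of its own):

import pathlib

# Print a few of the tuning knobs mentioned above. "sda" is a placeholder
# device name; adjust it for your system.
dev = "sda"
knobs = {
    "NCQ queue depth":  "/sys/block/%s/device/queue_depth" % dev,
    "read-ahead (KiB)": "/sys/block/%s/queue/read_ahead_kb" % dev,
    "I/O scheduler":    "/sys/block/%s/queue/scheduler" % dev,
}
for name, path in knobs.items():
    p = pathlib.Path(path)
    print("%-17s %s" % (name, p.read_text().strip() if p.exists() else "n/a"))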
Still, the question arises: how many input/output operations can we actually perform with all these optimizations in place? Let’s benchmark it. One tool we can use for this is iops(1), a benchmark utility that runs on Linux, FreeBSD and Mac OS X. It issues random read requests with increasing block sizes:
$ sudo ./iops --num_threads 1 --time 2 /dev/md1
/dev/md1, 6.00 TB, 1 threads:
 512   B blocks:   43.9 IO/s,  21.9 KiB/s (179.8 kbit/s)
   1 KiB blocks:   46.7 IO/s,  46.7 KiB/s (382.9 kbit/s)
   2 KiB blocks:   46.4 IO/s,  92.7 KiB/s (759.6 kbit/s)
   4 KiB blocks:   37.5 IO/s, 150.0 KiB/s (  1.2 Mbit/s)
   8 KiB blocks:   33.6 IO/s, 268.5 KiB/s (  2.2 Mbit/s)
  16 KiB blocks:   29.5 IO/s, 471.4 KiB/s (  3.9 Mbit/s)
  32 KiB blocks:   26.0 IO/s, 833.3 KiB/s (  6.8 Mbit/s)
  64 KiB blocks:   24.0 IO/s,   1.5 MiB/s ( 12.6 Mbit/s)
 128 KiB blocks:   24.1 IO/s,   3.0 MiB/s ( 25.3 Mbit/s)
 256 KiB blocks:   20.1 IO/s,   5.0 MiB/s ( 42.1 Mbit/s)
 512 KiB blocks:   18.5 IO/s,   9.3 MiB/s ( 77.6 Mbit/s)
   1 MiB blocks:   16.9 IO/s,  16.9 MiB/s (142.0 Mbit/s)
   2 MiB blocks:   11.7 IO/s,  23.3 MiB/s (195.7 Mbit/s)
   4 MiB blocks:    9.2 IO/s,  36.6 MiB/s (307.3 Mbit/s)
   8 MiB blocks:    5.1 IO/s,  41.0 MiB/s (343.6 Mbit/s)
  16 MiB blocks:    3.8 IO/s,  60.8 MiB/s (510.2 Mbit/s)
  32 MiB blocks:    3.1 IO/s, 100.6 MiB/s (843.7 Mbit/s)
  64 MiB blocks:    2.0 IO/s, 127.2 MiB/s (  1.1 Gbit/s)
 128 MiB blocks:    1.1 IO/s, 141.7 MiB/s (  1.2 Gbit/s)
 256 MiB blocks:    0.5 IO/s, 136.1 MiB/s (  1.1 Gbit/s)
In this example, the device under test is a Linux software RAID5 built from four 2 TB, 5,400 rpm disks. We started iops(1) with a single thread and a sampling time of two seconds per block size. The results show that we reach about 45 IOPS for very small block sizes, i.e. roughly 22 ms per I/O request.
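To make it a bit more tangible what such a measurement does, here is a minimal sketch of a random-read benchmark in Python. It is not the actual iops(1) implementation, the function name is made up, and a serious benchmark would also take care to defeat the page cache (for example by opening the device with O_DIRECT):

import os
import random
import time

def random_read_iops(path, block_size=4096, duration=2.0):
    """Issue block-aligned random reads against `path` for roughly
    `duration` seconds and return the achieved IO/s."""
    fd = os.open(path, os.O_RDONLY)
    try:
        dev_size = os.lseek(fd, 0, os.SEEK_END)   # works for block devices and files
        blocks = dev_size // block_size
        deadline = time.monotonic() + duration
        ios = 0
        while time.monotonic() < deadline:
            offset = random.randrange(blocks) * block_size  # random aligned offset
            os.pread(fd, block_size, offset)                # positional read, no seek call needed
            ios += 1
        return ios / duration
    finally:
        os.close(fd)

if __name__ == "__main__":
    # Reading a raw device needs root, e.g.: sudo python3 randread.py
    print("%.1f IO/s" % random_read_iops("/dev/md1", block_size=4096, duration=2.0))

With a 4 KiB block size and a two second run, this should land in roughly the same ballpark as the 4 KiB line above; on a 6 TB device the random offsets make page-cache hits negligible.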
Now, let’s increase the number of threads and see how this affects overall performance:
$ sudo ./iops --num_threads 16 --time 2 /dev/md1
/dev/md1, 6.00 TB, 16 threads:
 512   B blocks:  151.4 IO/s,  75.7 KiB/s (620.3 kbit/s)
   1 KiB blocks:  123.7 IO/s, 123.7 KiB/s (  1.0 Mbit/s)
   2 KiB blocks:  117.0 IO/s, 234.1 KiB/s (  1.9 Mbit/s)
   4 KiB blocks:   97.7 IO/s, 390.6 KiB/s (  3.2 Mbit/s)
   8 KiB blocks:   78.6 IO/s, 629.1 KiB/s (  5.2 Mbit/s)
  16 KiB blocks:   60.7 IO/s, 970.7 KiB/s (  8.0 Mbit/s)
caught ctrl-c, bye.
We see that concurrent requests raise the limit to about 150 IOPS. This indicates that the requests are indeed spread across multiple spindles or reordered by native command queuing. My guess is the spindles, but we could investigate further by benchmarking a single member disk instead of the whole array. That, however, is beyond the scope of this blog post.
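As a closing illustration, the single-threaded sketch from above can be extended to fire requests from several threads at once, which is roughly what --num_threads does in spirit (again, not the tool’s actual code):

import threading

def random_read_iops_threaded(path, block_size=4096, duration=2.0, num_threads=16):
    """Run random_read_iops() from the sketch above in several threads at
    once and return the summed IO/s. The pread() system call releases the
    GIL, so the requests really are outstanding concurrently and the RAID
    layer / NCQ can work on several of them at a time."""
    results = []                      # list.append is thread-safe in CPython
    def worker():
        results.append(random_read_iops(path, block_size, duration))
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

# usage, as root: print("%.1f IO/s" % random_read_iops_threaded("/dev/md1", num_threads=16))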