Hello,

I'm a test engineer at Seagate and we're using FIO to gather some performance data. This email has two parts: a bug report and a feature request.

The bug I'm seeing is that when using the random_distribution=zoned argument, the zone order is not honored. Using zoned:18/90:7/5:75/5 does not weight IO towards the end of the disk, but rather towards the beginning. Using zoned:75/5:7/5:18/90 apparently gives the same (still incorrect) distribution. I've attached a python3.6 script that shows this behaviour. Here is the histogram information from running with the two zoned arguments described above:

Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall --thread --filename=/dev/sdc --runtime=30 --readwrite=randread --iodepth=1 --random_distribution=zoned:18/90:7/5:75/5 --norandommap --output-format=terse

histogram bins = [2302, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21, 32, 27, 49, 34, 24, 36, 26, 184]
histogram percents = [75.99867943215582, 0.8253549026081215, 0.8253549026081215, 0.9904258831297458, 1.0894684714427203, 1.188511059755695, 0.8583690987124464, 1.0564542753383954, 0.9574116870254209, 1.2215252558600198, 0.6932981181908221, 0.6932981181908221, 1.0564542753383954, 0.8913832948167713, 1.6176956091119181, 1.1224826675470452, 0.7923407065037966, 1.188511059755695, 0.8583690987124464, 6.074612083195774]

Using fio --name=rand_reads --ioengine=libaio --direct=1 --exitall --thread --filename=/dev/sdc --runtime=30 --readwrite=randread --iodepth=1 --random_distribution=zoned:75/5:7/5:18/90 --norandommap --output-format=terse

histogram bins = [2306, 25, 25, 30, 33, 36, 26, 32, 29, 37, 21, 21, 32, 27, 49, 34, 24, 36, 26, 184]
histogram percents = [76.03033300362677, 0.8242664029014177, 0.8242664029014177, 0.9891196834817013, 1.0880316518298714, 1.1869436201780414, 0.8572370590174745, 1.0550609957138146, 0.9561490273656446, 1.2199142762940982, 0.6923837784371909, 0.6923837784371909, 1.0550609957138146, 0.8902077151335311, 1.6155621496867787, 1.1210023079459281, 0.7912957467853611, 1.1869436201780414, 0.8572370590174745, 6.0666007253544345]

To run the script, use the -h flag to see usage; at a minimum you'll need to give the device handle to run on as the first argument (the workload only does reads). The random_distribution argument is set at the top of the file.

Here is my environment information:

# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)
# uname -r
3.10.0-514.21.1.el7.x86_64

I used fio-3.2-13-g40e5f, which was the newest version I could see as of today.

As for the feature request: I am trying to adapt our current FIO job files for FLEX testing, a new protocol we announced recently (http://blog.seagate.com/intelligent/new-flex-dynamic-recording-method-redefines-data-center-hard-drive/) that has some requirements on where writes/reads are allowed. I would like better control over where random reads and writes go, ideally by specifying the zoned random_distribution zones with sector numbers rather than capacity percentages. Would that be a possible feature to add? Or is there an existing way to randomly read/write to non-contiguous zones of varying sizes on the disk?

Thank you,
Phillip Chen
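P.S. For comparison with the measured histograms above, here is a minimal sketch (separate from the attached script) of the per-bin percentages I would expect from zoned:18/90:7/5:75/5, assuming fio's documented access%/size% reading of each zone and the same 20 equal-width bins:

# Expected per-bin access percentages, assuming each zone is "access%/size%"
# counted from the start of the device and 20 equal-width bins (5% each).
zones = [(18, 90), (7, 5), (75, 5)]   # (access %, size %) in on-disk order

num_bins = 20
bin_size = 100.0 / num_bins
expected = [0.0] * num_bins

zone_start = 0.0
for access, size in zones:
    for b in range(num_bins):
        bin_lo = b * bin_size
        bin_hi = bin_lo + bin_size
        # Portion of this zone (by capacity) that falls inside bin b
        overlap = max(0.0, min(bin_hi, zone_start + size) - max(bin_lo, zone_start))
        expected[b] += access * overlap / size
    zone_start += size

print("expected percents = " + str(expected))
# Roughly [1.0] * 18 + [7.0, 75.0] -- heavily weighted towards the end of the
# disk, whereas the measured histograms put ~76% of the IO in the first bin.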
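And to make the feature request more concrete, here is a rough sketch of the kind of sector-based zone specification I have in mind, mapped onto the existing percentage syntax. The helper name, sector numbers, and capacity below are made-up illustration values, not from a real drive:

# Hypothetical helper: build a random_distribution=zoned value from absolute
# sector ranges. All numbers below are illustrative only.
def zones_from_sectors(ranges, total_sectors):
    """ranges: list of (start_sector, end_sector, access_percent) tuples that
    cover the device contiguously from sector 0."""
    parts = []
    for start, end, access in ranges:
        size_pct = 100.0 * (end - start) / total_sectors
        parts.append("{:g}/{:g}".format(access, size_pct))
    return "zoned:" + ":".join(parts)

total_sectors = 7814037168   # e.g. a 4 TB drive with 512-byte sectors
ranges = [
    (0,          7032633451, 18),   # first ~90% of the capacity, 18% of accesses
    (7032633451, 7423335309,  7),   # next ~5%, 7% of accesses
    (7423335309, 7814037168, 75),   # last ~5%, 75% of accesses
]
print(zones_from_sectors(ranges, total_sectors))
# Prints something close to "zoned:18/90:7/5:75/5"

This only handles contiguous ranges starting at sector 0; expressing non-contiguous zones of varying sizes is exactly where a native sector-based option would help. The attached script follows.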
import re
import subprocess
import time
import sys
import math
import argparse

# Weight heavily towards the last 5% of the drive
dist_str = "zoned:18/90:7/5:75/5"
# Weights are in descending order, as in the example -- this seems to be the only way that works
# dist_str = "zoned:75/5:7/5:18/90"
# Weighted heavily towards the middle of the drive
# dist_str = "zoned:5/45:90/10:5/45"

arg_parser = argparse.ArgumentParser()
arg_parser.add_argument("drive_handle", help = "Drive handle to test")
arg_parser.add_argument("-rt", "--runtime", default = 30, help = "Time to run workload")
arg_parser.add_argument("-sbp", "--save_block_parse", action = "store_true",
                        help = "Save blockparse output to blkparse_output.txt if flag is set")
arg_parser.add_argument("-fp", "--fio_path", default = "fio",
                        help = "The path to the FIO executable to run")
args = arg_parser.parse_args()
dev_handle = args.drive_handle

blktrace = subprocess.Popen(["blktrace", dev_handle, "-o", "-"],
                            stdout = subprocess.PIPE, stderr = subprocess.PIPE)
# blktrace needs a little time to get set up
time.sleep(1)

# Start FIO job
fio_string = (args.fio_path + " --name=rand_reads --ioengine=libaio --direct=1 --exitall "
              "--thread --filename=" + dev_handle + " --runtime=" + str(args.runtime) +
              " --readwrite=randread --iodepth=1 --random_distribution=" + dist_str +
              " --norandommap --output-format=terse")
print("Running " + fio_string)
cmd_ret = subprocess.run(fio_string.split(' '), stdout = subprocess.PIPE, stderr = subprocess.PIPE)
if cmd_ret.stderr != b"":
    print("FIO errors:")
    print(cmd_ret.stderr.decode(sys.stderr.encoding))
print("FIO stdout:")
print(cmd_ret.stdout.decode(sys.stdout.encoding))

# Terminate is how blktrace expects to end, don't use kill or you'll lose commands near the end
blktrace.terminate()
try:
    stdout, stderr = blktrace.communicate(timeout = 20)
except subprocess.TimeoutExpired:
    blktrace.kill()
    stdout, stderr = blktrace.communicate()
print("blktrace errors:")
print(stderr)
# This will give you the raw blktrace output
# print(stdout)

blkparse_format_str = '%D %2c %8s %5T.%9t %5p %2a %3d command = %C sectors = %S\n'
blkparse_ret = subprocess.run(["blkparse", "-i", "-", "-f", blkparse_format_str, "-a", "issue"],
                              input = stdout, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
print("blkparse errors:")
print(blkparse_ret.stderr)
# print(blkparse_ret.stdout)
blkparse_str = blkparse_ret.stdout.decode(sys.stdout.encoding)
if args.save_block_parse:
    with open("blkparse_output.txt", 'w') as output_file:
        output_file.write(blkparse_str)

# Parse blktrace result into bins
blkline_re = re.compile(r"(\d+,\d+)\s+(\d+)\s+(\d+)\s+(?P<timestamp>\d+\.\d+)\s+(\d+)\s+D\s+(R|W)"
                        r"\s+command = fio\s+sectors = (?P<sector>\d+)")
total_ios = 0
avg_lba = 0
max_lba = 0
min_lba = None

# Parse out the sectors from the blocktrace output to get some preliminary statistics
match_iter = blkline_re.finditer(blkparse_str)
for match_obj in match_iter:
    sector_num = int(match_obj.groupdict()["sector"])
    total_ios += 1
    avg_lba += sector_num
    if min_lba is None or sector_num < min_lba:
        min_lba = sector_num
    if sector_num > max_lba:
        max_lba = sector_num
print("total IOs = " + str(total_ios))
print("avg: {:.2f}, min: {}, max: {}".format(avg_lba / total_ios, min_lba, max_lba))

hist_num = 20
hist_bins = [0] * hist_num
hist_div = max_lba / hist_num
hist_edges = []
for ind in range(hist_num):
    hist_edges.append(hist_div * (ind + 1))

# Sort the data into a histogram
match_iter = blkline_re.finditer(blkparse_str)
for match_obj in match_iter:
    sector_num = int(match_obj.groupdict()["sector"])
    hist_ind = math.floor(sector_num / hist_div)
    if hist_ind == hist_num:
        hist_ind -= 1
    hist_bins[hist_ind] += 1
    # print("{}: bin {}".format(sector_num, hist_ind))

hist_perc = []
for hist_bin in hist_bins:
    hist_perc.append(100 * hist_bin / total_ios)
print("histogram bins = " + str(hist_bins))
print("histogram percents = " + str(hist_perc))
print("histogram edges = " + str(hist_edges))

# print FIO version and distribution string
print(dist_str)
cmd_ret = subprocess.run([args.fio_path, "-v"])