I'd like to have a discussion at LSF/MM about making it easier and more accessible for file system developers to run benchmarks as part of their development processes.

My interest in this was sparked a few weeks ago, when a click-bait article was published on Phoronix, "The Disappointing Direction Of Linux Performance From 4.16 To 5.4 Kernels"[1], in which the author published results that seemed to indicate a radical performance regression in a pre-5.4 kernel, with the 5.4(-ish) kernel performing four times worse on a SQLite test.

[1] https://www.phoronix.com/scan.php?page=article&item=linux-416-54&num=1

I tried to reproduce this, and in order to replicate the exact benchmark, I decided to use the Phoronix Test Suite (PTS). Somewhat to my surprise, it was well documented[2], straightforward to set up, and a lot of care had been put into getting repeatable results from running a large set of benchmarks. And so I added support[3] for running PTS to my gce-xfstests test automation framework.

[2] https://www.phoronix-test-suite.com/documentation/phoronix-test-suite.html
[3] https://github.com/tytso/xfstests-bld/commit/b8236c94caf0686b1cfacb1348b5a46fa1f52f48

Fortunately, using a controlled set of kernel configs, I could find no evidence of a massive performance regression in a kernel from a few days before Linus released 5.4. These results were reproduced by Jan Kara using mmtests.

Josef Bacik added a fio benchmark to xfstests in late 2017[4], and this was discussed at the 2018 LSF/MM. Unfortunately, there doesn't seem to have been any additional work to add benchmarking functionality to xfstests since then.

[4] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e0d95552fdb2948c63b29af4a8169a2027f84a1d

In addition to using xfstests, I have started using PTS as a way to sanity-check patch submissions to ext4. I've also started investigating mmtests; mmtests isn't quite as polished and well documented, but it has better support for running monitoring scripts (e.g., iostat, perf, systemtap, etc.) in parallel with the benchmark workloads.

I'd like to share what I've learned, and also hopefully learn what other file system developers have been using to automate measuring file system performance as part of their development workflow, especially if it has been packaged up so other people can more easily replicate their findings.

Cheers,

						- Ted