On Fri, Feb 08, 2019 at 02:17:26PM -0800, Luis Chamberlain wrote:
On Fri, Feb 08, 2019 at 01:06:20AM -0500, Sasha Levin wrote:
Sure! Below are the various configs this was run against. There were
multiple runs over 48+ hours and no regressions from a 4.14.17 baseline
were observed.
In an effort to consolidate our sections:
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
This matches my "xfs" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
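(As a reference point, and assuming the usual fstests behaviour of passing
MKFS_OPTIONS straight through to mkfs of the configured FSTYP when the scratch
device is recreated, this section effectively exercises filesystems made with
roughly:

  mkfs.xfs -f -m crc=1,reflink=0,rmapbt=0 -i sparse=0 /dev/nvme0n1p2

The sections below differ only in these mkfs knobs.)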
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1,'
This matches my "xfs_reflink"
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=1024,'
This matches my "xfs_reflink_1024" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0,'
This matches my "xfs_nocrc" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default]
TEST_DEV=/dev/nvme0n1p1
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/nvme0n1p2"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)
MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0, -i sparse=0, -b size=512,'
This matches my "xfs_nocrc_512" section.
USE_EXTERNAL=no
LOGWRITES_DEV=/dev/nvme0n1p3
FSTYP=xfs
[default_pmem]
TEST_DEV=/dev/pmem0
I'll have to add this to my framework. Have you found pmem
issues that are not present in the other sections?
Originally I added this because the xfs folks suggested that pmem and
block devices exercise very different code paths and that we should be
testing both of them.
Looking at the baseline I have, it seems that there are differences
between the failing tests. For example, with "MKFS_OPTIONS='-f -m
crc=1,reflink=0,rmapbt=0, -i sparse=0'", generic/524 seems to fail on
pmem but not on block.
TEST_DIR=/media/test
SCRATCH_DEV_POOL="/dev/pmem1"
SCRATCH_MNT=/media/scratch
RESULT_BASE=$PWD/results/$HOST/$(uname -r)-pmem
MKFS_OPTIONS='-f -m crc=1,reflink=0,rmapbt=0, -i sparse=0'
OK, so you just repeat the above options verbatim but for pmem.
Correct?
Right.
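(Side note on the pmem devices themselves: if the guests have no real
NVDIMMs, emulated pmem is enough for this. A minimal sketch is reserving
RAM via the memmap= kernel parameter, e.g.

  memmap=4G!8G memmap=4G!12G

on the kernel command line, which carves out two regions that show up as
/dev/pmem0 and /dev/pmem1; the sizes and offsets here are only
illustrative and depend on the guest's memory layout.)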
Any reason you don't name the sections with finer granularity?
It would help ensure that when we revise both of our test setups we can
more easily tell whether we're talking about apples, pears, or bananas.
Nope, I'll happily rename them if there are "official" names for it :)
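As a sketch of what finer-grained naming could look like in the configs
file, the values stay exactly as above and only the [default] headers
change, e.g.:

  [xfs_nocrc]
  MKFS_OPTIONS='-f -m crc=0,reflink=0,rmapbt=0 -i sparse=0'
  # TEST_DEV, SCRATCH_DEV_POOL, etc. unchanged

  [xfs_reflink]
  MKFS_OPTIONS='-f -m reflink=1,rmapbt=1 -i sparse=1'
  # TEST_DEV, SCRATCH_DEV_POOL, etc. unchanged

A single section can then be picked with something like ./check -s
xfs_nocrc, which makes it unambiguous which config a result came from.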
FWIW, I run two different bare-metal hosts now, and each has a VM guest
per section above. One host I use for tracking stable, the other for my
own changes. That makes it harder to mix things up and lets me re-test
quickly at any time.
I dedicate a VM guest to test *one* section. I do this with oscheck
easily:
./oscheck.sh --test-section xfs_nocrc | tee log-xfs-4.19.18+
For instance, the above will test just the xfs_nocrc section. On average
each section takes about 1 hour to run.
We have a similar setup then. I just spawn the VM on azure for each
section and run them all in parallel that way.
I thought oscheck runs everything on a single VM; does it have a built-in
mechanism to spawn a VM for each config? If so, I can add some code to
support Azure and we can use the same codebase.
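Roughly, what I do amounts to something like the loop below, with
spawn_vm_and_run standing in for the Azure provisioning and kick-off (it
is not a real script, just a placeholder), and the section names borrowed
from yours:

  for cfg in xfs xfs_nocrc xfs_nocrc_512 xfs_reflink xfs_reflink_1024; do
      spawn_vm_and_run "$cfg" > "log-$cfg" 2>&1 &
  done
  wait

Each section gets its own guest and they all run in parallel, so the
wall-clock time is bounded by the slowest section.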
I could run the tests on raw NVMe and do away with the guests, but that
loses some of my ability to debug crashes easily and moves everything out
to bare metal. I'm curious, though: how long do your tests take? How about
per section? Say just the default "xfs" section?
I think the longest config takes about 5 hours; everything else tends to
take about 2 hours.
I basically run these on "repeat" until I issue a stop order, so in a
timespan of 48 hours some configs run ~20 times and some only ~10.
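The "repeat until a stop order" part is nothing fancy, conceptually just a
loop like the one below inside each guest (the stop file and the -g auto
group are only illustrative):

  while [ ! -e /tmp/stop-fstests ]; do
      ./check -g auto 2>&1 | tee -a "log-$(uname -r)"
  done

which is why a fast config piles up ~20 iterations in 48 hours while a
slow one only manages ~10.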
IIRC you also had your system on Hyper-V :) so maybe you can still debug
crashes easily.
Luis