[RFC][PATCH 0/8] Crash consistency xfstest using dm-log-writes

Hi all,

I've collected these patches that have been sitting in Josef Bacik's
tree for a few years and kicked them a bit into shape.
The dm-log-writes target was merged in kernel v4.1, see:
https://github.com/torvalds/linux/blob/master/Documentation/device-mapper/log-writes.txt
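
For anyone who has not used the target yet, the workflow described in
that document can be sketched roughly as below. This is only an
illustration and not the code in this series: the helper function names,
the device paths (/dev/sdb, /dev/sdc), the mount point and the mark names
are placeholders chosen for the example, and the commented commands need
root and would destroy data on both devices.

```shell
#!/bin/bash
# Illustrative sketch only; helper names and device paths are hypothetical.

# Print a device-mapper table line for the log-writes target.
#   $1 = device size in 512-byte sectors (e.g. from `blockdev --getsz`)
#   $2 = device to be written to, $3 = device that receives the write log
log_writes_table() {
    echo "0 $1 log-writes $2 $3"
}

# Print the replay-log command that replays the log onto a device,
# using a named mark as the end point.
#   $1 = log device, $2 = replay target device, $3 = mark name
replay_to_mark_cmd() {
    echo "replay-log --log $1 --replay $2 --end-mark $3"
}

# Typical sequence (as root; destroys data on /dev/sdb and /dev/sdc):
#   dmsetup create log --table "$(log_writes_table "$(blockdev --getsz /dev/sdb)" /dev/sdb /dev/sdc)"
#   mkfs.ext4 /dev/mapper/log
#   dmsetup message log 0 mark mkfs
#   mount /dev/mapper/log /mnt/test
#   ... run the workload, e.g. fsx ...
#   dmsetup message log 0 mark fsync
#   umount /mnt/test
#   dmsetup remove log
#   $(replay_to_mark_cmd /dev/sdc /dev/sdb fsync)
#   e2fsck -fn /dev/sdb
```

The two helpers only print strings, so they can be tried without root;
the commented sequence is what would actually touch the devices.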

I have been getting frequent test failures, both fsck errors and file
checksum errors, while testing xfs, ext4 and btrfs.  The failure
patterns differ considerably between the file systems.
I tested on two systems, one with an SSD and one with a spinning disk.
I believe these errors point to either a wrong assumption about the
I/O model in the test tools or a bug in the test implementation.

I decided to post the patches anyway, because it may take me a while
to debug the failures.  This gives other developers a chance to produce
more test results on their systems and perhaps help debug the test
failures.

Some data points from my tests:
- ext4 test results seem more consistent than xfs test results:
  with some random seed values I could not get ext4 to fail,
  and with others, like the ones provided in the patch, the ext4 test
  failed with exactly the same fsck error, at the same log mark,
  on both the SSD and the spinning disk systems.
- With the random seed values in this patch set, ext4 test always
  failed with the same fsck error (end of extent exceeds allowed value).
- btrfs test also failed with the provided random seed values, but with
  slightly different fsck errors each run.
- Unlike ext4 and btrfs, xfs tests seemed to fail arbitrarily for any value
  of random seed I tried.
- xfs tests sometimes fail with a file checksum error, on a different
  file each run; I have never seen xfs fail with an fsck error.
- Tests were much more likely to fail with xfs on spinning disk (9 out of 10)
  compared to xfs on SSD (1 out of 10).
- Removing the -o discard mount option, adding fsx AIO (-A) and disabling
  mapped read/write (-W -R) did not reduce xfs test failures, as far as
  I can tell.

Any tips and pointers to other things I could test before diving
into tracing would be much appreciated.

If anyone can run the test to get additional data points that would
be much appreciated as well.

Thanks,
Amir.

P.S.: Josef,

Because I split the patches and made some changes, I did not keep
your S-O-B. After you review my changes, if you like, I can restore
your S-O-B.

Amir Goldstein (8):
  common/rc: convert some egrep to grep
  common/rc: fix _require_xfs_io_command params check
  fsx: fixes to random seed
  fsx: fix path of .fsx* files
  fsx: add support for integrity check with dm-log-writes target
  log-writes: add replay-log program to replay dm-log-writes target
  fstests: add support for working with dm-log-writes target
  fstests: add crash consistency fsx test using dm-log-writes

 .gitignore                   |   1 +
 README                       |   2 +
 common/dmlogwrites           |  86 ++++++++++
 common/rc                    |  15 +-
 doc/auxiliary-programs.txt   |   8 +
 doc/requirement-checking.txt |  20 +++
 ltp/fsx.c                    | 152 ++++++++++++++---
 src/Makefile                 |   2 +-
 src/log-writes/Makefile      |  23 +++
 src/log-writes/SOURCE        |   6 +
 src/log-writes/log-writes.c  | 379 +++++++++++++++++++++++++++++++++++++++++++
 src/log-writes/log-writes.h  |  70 ++++++++
 src/log-writes/replay-log.c  | 348 +++++++++++++++++++++++++++++++++++++++
 tests/generic/500            | 128 +++++++++++++++
 tests/generic/500.out        |   2 +
 tests/generic/group          |   1 +
 16 files changed, 1212 insertions(+), 31 deletions(-)
 create mode 100644 common/dmlogwrites
 create mode 100644 src/log-writes/Makefile
 create mode 100644 src/log-writes/SOURCE
 create mode 100644 src/log-writes/log-writes.c
 create mode 100644 src/log-writes/log-writes.h
 create mode 100644 src/log-writes/replay-log.c
 create mode 100755 tests/generic/500
 create mode 100644 tests/generic/500.out

-- 
2.7.4



