Re: [LSF/MM/BPF TOPIC] Long Duration Stress Testing Filesystems

> I attempted to study the prior art on this and so far have found:
> - fsstress/fsx and the attendant tests in fstests/. There are ~150-200
>    tests using fsstress and fsx in fstests/. Most of them are xfs and
>    btrfs tests following the aforementioned pattern of racing fsstress
>    with some scary operations. Most of them tend to run for 30s, though
>    some are longer (and of course subject to TIME_FACTOR configuration)
> - Similar duration error injection tests in fstests (e.g. generic/475)
> - The NFSv4 Test Project
>    https://www.kernel.org/doc/ols/2006/ols2006v2-pages-275-294.pdf
>    A choice quote regarding stress testing:
>    "One year after we started using FSSTRESS (in April 2005) Linux NFSv4
>    was able to sustain the concurrent load of 10 processes during 24
>    hours, without any problem. Three months later, NFSv4 reached 72 hours
>    of stress under FSSTRESS, without any bugs. From this date, NFSv4
>    filesystem tree manipulation is considered to be stable."
>
>
> I would like to discuss:
> - Am I missing other strategies people are employing? Apologies if there
>    are obvious ones, but I tried to hunt around for a few days :)
> - What is the universe of interesting stressors (e.g., reflink, scrub,
>    online repair, balance, etc.)

dm-vdo is not a filesystem, but it has some similarities: it does deduplication, compression, and thin provisioning. As such, the project has a fairly extensive set of tests, and in particular the VDO folks do a fair bit of stress testing.

For them, the universe is reboots, crashes, complete rebuilds, read-only entry and exit, compression enable/disable, and 512-byte sector mode enable/disable. They've been running about fifty hours a week of these tests inside Red Hat. For instance, https://github.com/dm-vdo/vdo-devel/blob/main/src/perl/vdotest/VDOTest/RebuildStress03.pm is one of the tests, and it shows the random selection of operations.
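
To make the "random selection of operations" idea concrete, here is a minimal sketch in Python. It is not taken from the vdo-devel tests (those are Perl); the operation names and the plan_operations and run_plan helpers are hypothetical, and the handlers are stubs that a real harness would replace with code driving an actual device. The only point is that a seeded plan lets a long run be replayed after a failure.

```python
import random

# Hypothetical operation names, loosely following the list above.
# Nothing here touches a real device.
OPERATIONS = [
    "reboot",
    "crash",
    "full_rebuild",
    "enter_read_only",
    "exit_read_only",
    "toggle_compression",
    "toggle_512_byte_sectors",
]


def plan_operations(seed, count):
    """Return a reproducible list of randomly chosen operations."""
    rng = random.Random(seed)  # private RNG: the seed fully determines the plan
    return [rng.choice(OPERATIONS) for _ in range(count)]


def run_plan(plan, handlers, verify):
    """Run each operation in order, checking data integrity after every step.

    Returns None if every check passes, otherwise (step, operation) for the
    first failure so the run can be replayed with the same seed.
    """
    for step, op in enumerate(plan):
        handlers[op]()
        if not verify():
            return step, op
    return None
```

Logging the seed at the start of a run is what makes a failure fifty hours in reproducible: the same seed yields the same sequence of operations.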

When these tests were first introduced eight years ago, they caught some crash or data corruption bugs that the existing fstests-like tests for dm-vdo did not cover. They also uncovered a filesystem inconsistency at the time: https://lore.kernel.org/all/CALoZfD4-uqhRSfEh0Y+v8jjSDY2KkAh-hhwdLnRgZopHEETUXA@xxxxxxxxxxxxxx/

I would suggest Matt Sakai, cc'd, or another of the VDO folks as a valuable contributor to this discussion, given the VDO folks' long experience with stress testing.

Sweet Tea



