On Thu, May 19, 2022 at 05:06:07PM +0100, Matthew Wilcox wrote:
> Right, but that's the personal perspective of an expert tester.  I don't
> particularly want to build that expertise myself; I want to write patches
> which touch dozens of filesystems, and I want to be able to smoke-test
> those patches.  Maybe xfstests or kdevops doesn't want to solve that
> problem, but that would seem like a waste of other peoples time.

Willy,

For your use case I'm guessing that you have two major concerns:

* bugs that you may have introduced in patches "which touch dozens of
  filesystems"

* bugs in the core mm and fs-writeback code, which may be much more
  substantive/complex changes

Would you say that is correct?

At least for ext4 and xfs, it's probably quite sufficient just to run
the -g auto group for the ext4/4k and xfs/4k test configs --- that
is, the standard default file system configs using the 4k block size.
Both of these currently don't require any test exclusions for
kvm-xfstests or gce-xfstests when running the auto group.  So for the
purposes of catching bugs in the core MM/VFS layer, and any changes
that the folio patches are likely to make to ext4 and xfs, the auto
group for the ext4/4k and xfs/4k configs is probably quite
sufficient.

Testing the more exotic test configs, such as bigalloc for ext4,
realtime for xfs, or the external log configs, is not likely to be
relevant for the folio patches.

Note: I recommend that you skip the loop device xfstests strategy
which Luis likes to advocate.  From the perspective of *likely*
regressions caused by the folio patches, I claim it is going to cause
you more pain than it is worth.  If there are some strange folio/loop
device interactions, they aren't likely to be obvious/reproducible
failures that will cause pain to linux-next testers.  While it would
be nice to find **all** possible bugs before patches go upstream to
Linus, if it slows down your development velocity to a near
standstill, it's not worth it.  We have to be realistic about things.

What about other file systems?  Well, first of all, xfstests only has
support for the following file systems:

    9p btrfs ceph cifs exfat ext2 ext4 f2fs gfs glusterfs jfs msdos
    nfs ocfs2 overlay pvfs2 reiserfs tmpfs ubifs udf vfat virtiofs
    xfs

{kvm,gce}-xfstests supports these 16 file systems:

    9p btrfs exfat ext2 ext4 f2fs jfs msdos nfs overlay reiserfs
    tmpfs ubifs udf vfat xfs

kdevops has support for these file systems:

    btrfs ext4 xfs

So realistically, you're not going to have *full* test coverage for
all of the file systems you might want to touch, no matter what you
do.  And even for the file systems that are technically supported by
xfstests and kvm-xfstests, if they aren't being regularly run (for
example, exfat, 9p, ubifs, udf, etc.) there may be bitrot, and very
likely there is no one actively maintaining exclude files.  For that
matter, there might not be anyone you could turn to for help
interpreting the test results.

So....  I believe the most realistic thing to do is to run xfstests
on a simple set of configs --- using no special mkfs or mount options
--- first against the baseline, and then after you've applied your
folio patches.  If there are any new test failures, do something
like:

    kvm-xfstests -c f2fs/default -C 10 generic/013

to check whether it's a hard failure or not.  If it's a hard failure,
then it's a problem with your patches.
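(To make the mechanics concrete, the whole baseline-vs-patched loop
might look something like the sketch below.  This assumes you're
using xfstests-bld, with its kbuild and kvm-xfstests scripts on your
PATH, and "folio-dev" is just a stand-in name for whatever branch
carries your patches.)

    # baseline: standard 4k configs, full auto group
    git checkout origin ; kbuild
    kvm-xfstests -c ext4/4k,xfs/4k -g auto

    # same tests with the folio patches applied
    git checkout folio-dev ; kbuild
    kvm-xfstests -c ext4/4k,xfs/4k -g auto

    # re-run any new failure to see whether it's a hard failure
    kvm-xfstests -c ext4/4k -C 10 generic/013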
If it's a flaky failure, it's possible you'll need to repeat the test
against the baseline:

    git checkout origin ; kbuild ; kvm-xfstests -c f2fs/default -C 10 generic/013

If it's also flaky on the baseline, you can ignore the test failure
for the purposes of folio development.

There are more complex things you could do, such as running a
baseline set of tests 500 times (as Luis suggests), but I believe
that for your use case, it's not a good use of your time.  You'd need
to spend several weeks finding *all* the flaky tests up front,
especially if you want to do this for a large set of file systems.
It's much more efficient to check whether a suspected test regression
is really a flaky test result when you come across it.

I'd also suggest using the -g quick tests for file systems other than
ext4 and xfs.  That's probably going to be quite sufficient for
finding obvious problems that might be introduced when you're making
changes to f2fs, btrfs, etc., and it will reduce the number of
potential flaky tests that you might have to handle.

It should be possible to automate this, and Leah and I have talked
about designs for automating this process.  Leah has some rough
scripts that do a semantic-style diff of the test results from the
baseline and from after applying the proposed xfs backports.  It
operates on summaries that look something like this:

    f2fs/default: 868 tests, 10 failures, 217 skipped, 6899 seconds
      Failures: generic/050 generic/064 generic/252 generic/342 generic/383 generic/502 generic/506 generic/526 generic/527 generic/563

In theory, we could also have automated tools that look for the
suspected test regressions, and then try running those tests 20 or 25
times on the baseline and after applying the patch series.  Those
don't exist yet, but it's just a Mere Matter of Programming.  :-)

I can't promise anything, especially with dates, but developing
better automation tools to support the xfs stable backports is on our
near-term roadmap --- and that would probably be applicable to the
folio development use case as well.

Cheers,

						- Ted
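P.S.  As an illustration of the sort of semantic-style diff described
above, a script along the following lines would do the core
comparison.  (This is a sketch, not Leah's actual code; fail-diff.sh
is a hypothetical name, and it assumes each config's failure list is
kept on a single "Failures:" line, as in the f2fs example above.)

    #!/bin/bash
    # fail-diff.sh -- print tests that fail in the patched run but
    # not in the baseline run.
    # Usage: fail-diff.sh baseline-report patched-report
    fails() { sed -n 's/^ *Failures: //p' "$1" | tr -s ' ' '\n' | sort -u; }
    comm -13 <(fails "$1") <(fails "$2")

Anything it prints is a suspected regression, which you'd then re-run
20 or 25 times on both trees to see whether it's hard or flaky.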