On Thu, May 19, 2022 at 07:24:50PM +0800, Zorro Lang wrote:
> Yes, we talked about this, but if I don't remember wrong, I recommended
> each downstream tester maintain their own "testing data/config", like
> exclude lists, failure ratios, known failures, etc.  I think they're
> not suitable to be fixed in the mainline fstests.

Failure ratios are the sort of thing that are only applicable for:

* A specific filesystem
* A specific configuration
* A specific storage device / storage device class
* A specific CPU architecture / CPU speed
* A specific amount of memory available

Put another way, there are problems that fail so rarely as to be
effectively "never" on, say, an x86_64-class server with gobs and gobs
of memory, but which can fail much more reliably on, say, a Raspberry
Pi using eMMC flash.

I don't think that Luis was suggesting that this kind of failure
annotation would go in upstream fstests.  I suspect he just wants to
use it in kdevops, and hopes that other people would use it as well in
other contexts.  But even in the context of test runners like kdevops
and {kvm,gce,android}-xfstests, it's going to be very specific to a
particular test environment, even for the global list of excludes for
a particular file system.  So in the gce-xfstests context, this is the
difference between the excludes in the files:

    fs/ext4/excludes

vs

    fs/ext4/cfg/bigalloc.exclude

Even if I only cared about, say, how things ran on GCE using SSD-backed
Persistent Disk (never mind that I can also run gce-xfstests on Local
SSD, PD Extreme, etc.), failure percentages would never make sense for
fs/ext4/excludes, since that covers multiple file system configs.  And
my infrastructure supports kvm, gce, and Android, as well as some
people (such as at $WORK for our data center kernels) who run the test
appliance directly on bare metal, so I wouldn't use the failure
percentages in these files, etc.

Now, what I *do* do is track this sort of thing in my own notes, e.g.:

    generic/051    ext4/adv   Failure percentage: 16% (4/25)
        "Basic log recovery stress test - do lots of stuff, shut down
        in the middle of it and check that recovery runs to completion
        and everything can be successfully removed afterwards."

    generic/410    nojournal  Couldn't reproduce after running 25 times
        "Test mount shared subtrees, verify the state transitions..."

    generic/68[12] encrypt    Failure percentage: 100%
        The directory does grow, but blocks aren't charged to either
        root or the non-privileged user's quota.  So this appears to
        be a real bug.

There is one thing that I'd like to add to upstream fstests, and that
is some kind of option so that "check --retry-failures NN" would cause
fstests, upon finding a test failure, to automatically rerun that
failing test NN additional times.

Another related feature, which we currently have in our daily spinner
infrastructure at $WORK, would be, on a test failure, to rerun the
test up to M times (typically a small number, such as 3).  If the test
passes on a retry attempt, declare the test result "flaky" and stop
running the retries; if the test repeatedly fails after M attempts,
then the test result is "fail".  These results would be reported in
the junit XML file, and would allow the test runners to annotate their
test summaries appropriately.  (A rough sketch of the retry logic I
have in mind is appended below.)

I'm thinking about trying to implement something like this in my
copious spare time; but before I do, does the general idea seem
acceptable?

Thanks,

						- Ted
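
(Appendix: a purely illustrative sketch of the retry/"flaky"
classification described above.  This is not actual fstests code; the
real implementation would live in the "check" shell script, and
run_test() below is a made-up stand-in for whatever actually runs a
single test.)

    import random

    def run_test(test):
        # Made-up stand-in: "run" an fstests test, failing randomly
        # so the retry logic below can be exercised.
        return random.random() < 0.7

    def classify(test, max_retries=3):
        # First attempt; if it passes, no retries are needed.
        if run_test(test):
            return "pass"
        # On failure, retry up to max_retries times.  A pass on any
        # retry means the result is "flaky", and we stop retrying;
        # failing every attempt means the result is a hard "fail".
        for _ in range(max_retries):
            if run_test(test):
                return "flaky"
        return "fail"

    # Example: classify a test and report the result, which could
    # then be recorded in the junit XML summary.
    print("generic/051:", classify("generic/051"))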