Re: [RFC: kdevops] Standardizing on failure rate nomenclature for expunges

Amir Goldstein <amir73il@xxxxxxxxx> · Sun, 3 Jul 2022 17:22:17 +0300

On Sun, Jul 3, 2022 at 4:15 PM Theodore Ts'o <tytso@xxxxxxx> wrote:
>
> On Sun, Jul 03, 2022 at 08:56:54AM +0300, Amir Goldstein wrote:
> >
> > That is true for some use cases, but unfortunately, the flaky
> > fstests are way too valuable and too hard to replace or improve,
> > so practically, fs developers have to run them, but not everyone does.
> >
> > Zorro has already proposed to properly tag the non deterministic tests
> > with a specific group and I think there is really no other solution.
>
> The non-deterministic tests are not the sole, or even the most likely
> cause of flaky tests.  Or put another way, even if we used a
> deterministic pseudo-random number generator seed for some of the currently
> "non-deterministic tests" (and I believe we are for many of them
> already anyway), it's not going to make the flaky tests go away.
>
> That's because with many of these tests, we are running multiple
> threads either in fsstress or fsx, or in the antagonist workload
> that is, say, running the space utilization to full to generate ENOSPC
> errors, and then deleting a bunch of files to trigger as many ENOSPC
> hitter events as possible.
>
> > The only question is whether we remove them from the 'auto' group
> > (I think we should).
>
> I wouldn't; if someone wants to exclude the non-deterministic tests,
> once they are tagged as belonging to a group, they can just exclude
> that group.  So there's no point removing them from the auto group
> IMHO.

The reason I suggested that *we* change our habits is because
we want to give passing-by fs testers an easier experience.

Another argument in favor of splitting -g soak out of -g auto:
You only need to run -g soak in a loop, for as long as you like, to be
confident about the results.
By definition, you need to run -g auto only once.
If a test ends up failing the Nth time you run -g auto, then it belongs
in -g soak and not in -g auto.
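
A minimal sketch of that rule, assuming we already have per-test pass/fail
outcomes from repeated runs. The test names, the data and the split_groups()
helper are made up for illustration; nothing here is part of fstests or
kdevops.

```python
from typing import Dict, List, Tuple


def split_groups(runs: Dict[str, List[bool]]) -> Tuple[List[str], List[str]]:
    """Split tests into (auto, soak) from repeated-run outcomes.

    runs maps a test name to its outcomes, one bool per run (True = pass).
    A test stays in 'auto' only if it passed every recorded run; a single
    failure on any run moves it to 'soak'.
    """
    auto, soak = [], []
    for test, outcomes in sorted(runs.items()):
        if all(outcomes):
            auto.append(test)
        else:
            soak.append(test)
    return auto, soak


# Hypothetical outcomes from four runs of the same test set.
example_runs = {
    "example/001": [True, True, True, True],
    "example/002": [True, True, False, True],   # failed on the 3rd run
    "example/003": [False, False, False, False],
}

auto, soak = split_groups(example_runs)
print("auto:", auto)
print("soak:", soak)
```

Note that this treats any failure the same way regardless of how often it
happens, which matches the "a test should always pass" meaning of
deterministic used in this thread.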

>
> > filesystem developers that will run ./check -g auto -g soak
> > will get the exact same test coverage as today's -g auto
> > and the "commoners" that run ./check -g auto will enjoy blissful
> > deterministic test results, at least for the default config of regularly
> > tested filesystems (a.k.a. the ones tested by the kernel test bot).
>
> First of all, there are a number of tests today which are in soak or
> long_rw which are not in auto, so "-g auto -g soak" will *not* result
> in the "exact same test coverage".

I addressed this in my proposal.
I proposed to remove these two tests from soak and asked for
Darrick's opinion.
Who is using -g soak anyway?

>
> Secondly, as I've stated above, deterministic tests do not
> necessarily mean deterministic test results --- unless by
> "deterministic tests" you mean "completely single-threaded tests",
> which would eliminate a large amount of useful test coverage.
>

To be clear, when I wrote deterministic, what I meant was deterministic
results empirically, in the same sense that Bart meant - a test should
always pass.

Because Luis was using the expunge lists to blocklist any test failure,
no matter the failure rate, the kdevops expunge lists could be used as
a first draft for the -g soak group, at least for tests that are blocklisted by
kdevops for all of the ext4, xfs and btrfs default configs on the upstream kernel.
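
A rough sketch of how such a first draft could be derived, assuming each
expunge list is plain text with one test name per line and an optional
'#' comment. That format, the sample contents and both helper names are
assumptions for illustration, not taken from kdevops.

```python
from typing import Dict, List, Set


def parse_expunge(text: str) -> Set[str]:
    """Return the test names in one expunge list (assumed format:
    one test per line, anything after '#' is a comment)."""
    tests = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            tests.add(name)
    return tests


def soak_draft(expunges: Dict[str, str]) -> List[str]:
    """Tests expunged for every filesystem in the input."""
    sets = [parse_expunge(text) for text in expunges.values()]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical expunge list contents, keyed by filesystem.
example_expunges = {
    "ext4": "example/002 # fails rarely\nexample/007\n",
    "xfs": "example/002\nexample/009 # xfs only\n",
    "btrfs": "# default config\nexample/002\nexample/007\n",
}

print(soak_draft(example_expunges))
```

Only tests expunged on all three filesystems land in the draft; a test
that is flaky on a single filesystem is left out and would need a
separate decision.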

Thanks,
Amir.