Re: sharing fstests results (Was: [PATCH 5.15 CANDIDATE v2 0/8] xfs stable candidate patches for 5.15.y (part 1))

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Sat, 25 Jun 2022 12:35:50 -0700

On Sat, Jun 25, 2022 at 10:28:32AM +0300, Amir Goldstein wrote:
> On Sat, Jun 25, 2022 at 1:54 AM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> > Determinism is important for tests though so snapshotting a reflection
> > interpretation of expunges at a specific point in time is also important.
> > So the database would need to be versioned per updates, so a test is
> > checkpointed against a specific version of the expunge db.
> 
> Using the terminology "expunge db" is wrong here because it suggests
> that flakey tests (which are obviously part of that db) should be in
> expunge list as is done in kdevops and that is not how Josef/Ted/Darrick
> treat the flakey tests.

There are flaky tests which can cause a crash, and that is why I started
to expunge these. Not all flaky tests cause a crash though. And so, this
is why in the format I suggested you can specify metadata such as whether
a test caused a crash.

At this point I agree that the way kdevops simply skips flaky tests which
do not cause a crash should be changed. If a test is known to fail
non-deterministically but without a crash, it would be good to run it and
simply not treat that failure as fatal at the end. If however the failure
rate does change it would be useful to update that information. Without
metadata one cannot process that sort of stuff.

> The discussion should be around sharing fstests "results" not expunge
> lists. Sharing expunge lists for tests that should not be run at all
> with certain kernel/distro/xfsprogs has great value on its own and I
> think the kdevops hierarchical expunge lists are a very good place to
> share *deterministic* information, but only as long as those lists
> absolutely do not contain non-deterministic test expunges.

The way the expunge list is processed could simply be modified in kdevops
so that non-deterministic tests are not expunged but also not treated as
fatal at the end. But think about it, that exception only applies if the
non-deterministic failure does not lead to a crash, no?
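
To make that concrete, here is a minimal sketch in Python of the policy
described above. It is not how kdevops works today. The field names
(deterministic, crashes, failure_rate), the Action values and the test
names fs/001..fs/004 are all made up for illustration, and treating
deterministic failures as still expunged is an assumption taken from the
quoted text above.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SKIP = "skip"                  # expunged: do not run the test at all
    RUN_NONFATAL = "run-nonfatal"  # run it, but a failure is tolerated
    RUN_FATAL = "run-fatal"        # run it, a failure fails the whole run


@dataclass(frozen=True)
class KnownFailure:
    # Hypothetical metadata attached to one known-failure entry.
    test: str                  # placeholder name such as "fs/001"
    deterministic: bool        # fails on every run?
    crashes: bool              # known to crash or hang the system?
    failure_rate: float = 1.0  # observed fraction of failing runs


def plan(test, known):
    """Decide what to do with a test given the known-failure metadata."""
    entry = known.get(test)
    if entry is None:
        return Action.RUN_FATAL     # nothing known: any failure is news
    if entry.crashes or entry.deterministic:
        return Action.SKIP          # crashers and sure failures stay expunged
    return Action.RUN_NONFATAL      # flaky without a crash: run, tolerate


def run_is_fatal(failed_tests, known):
    """A run is fatal only if a test with no known failure has failed."""
    return any(plan(t, known) is Action.RUN_FATAL for t in failed_tests)


# Made-up entries, keyed by test name.
known = {
    e.test: e
    for e in (
        KnownFailure("fs/001", deterministic=True, crashes=False),
        KnownFailure("fs/002", deterministic=False, crashes=True,
                     failure_rate=0.1),
        KnownFailure("fs/003", deterministic=False, crashes=False,
                     failure_rate=0.05),
    )
}
```

With this, fs/003 is run and its failure is tolerated, fs/002 stays
expunged because it can crash, and a failure of the unlisted fs/004
still fails the run.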

> > > It might perhaps be useful to get a bit more clarity about how we
> > > expect the shared results would be used, because that might drive some
> > > of the design decisions about the best way to store these "results".
> >
> 
> As a requirement, what I am looking for is a way to search for anything
> known to the community about failures in test FS/NNN.

Here's the thing though. Not all developers have incentives to share.
For a while SLE didn't have public expunges. That changed after OpenSUSE
Leap 15.3, as it has binary compatibility with SLE15.3 and so the same
failures on workflows/fstests/expunges/opensuse-leap/15.3/ are applicable.
It is up to each distro if they wish to share, and without a public
vehicle to do so why would they, or how would they?

For upstream and stable I would hope there are more incentives to share.
But again, no shared home has ever existed before. And I don't think
there was ever any dialog before about sharing a home for these.

> Because when I get an alert on a possible regression, that's the fastest
> way for me to triage and understand how much effort I should put into
> the investigation of that failure and which directions I should look into.
> 
> Right now, I look at the test header comment and git log, I grep the
> kdevops expunge lists to look for juicy details and I search lore for
> mentions of that test.
> 
> In fact, I already have an auto generated index of lore fstests
> mentions in xfs patch discussions [1] that I just grep for failures found
> when testing xfs. For LTS testing, I found it to be the best way to
> find candidate fix patches that I may have missed.

This effort is valuable and thanks for doing all this.

> Going forward, we can try to standardize the search and results
> format, but for getting better requirements you first need users!

As you have witnessed, running fstests against any fs takes a lot of
time and patience, and as I have noted, not many have incentives to
share. So the best I could do is provide the solution to enable folks to
reproduce testing as quickly and as easily as possible and let folks who
are interested in sharing do so. And obviously at least I did get a major
enterprise distro to share some results. Hope others could follow.

So I expect the format for sharing then to be led by those who have a
clear incentive to do so. Folks working on upstream or stable stakeholders
seem like obvious candidates. And then it is just volunteer work.

  Luis