Re: [PATCH 5.15 00/15] xfs stable candidate patches for 5.15.y

Luis Chamberlain <mcgrof@xxxxxxxxxx> · Mon, 6 Jun 2022 08:55:24 -0700

On Sat, Jun 04, 2022 at 11:38:35AM +0300, Amir Goldstein wrote:
> On Sat, Jun 4, 2022 at 6:53 AM Leah Rumancik <leah.rumancik@xxxxxxxxx> wrote:
> >
> > From: Leah Rumancik <lrumancik@xxxxxxxxxx>
> >
> > This first round of patches aims to take care of the easy cases - patches
> > with the Fixes tag that apply cleanly. I have ~30 more patches identified
> > which will be tested next, thanks everyone for the various suggestions
> > for tracking down more bug fixes. No regressions were seen during
> > testing when running fstests 3 times per config with the following configs:

Leah,

It is great to see this work move forward.

How many times was fstests run *without* the patches to establish the
baseline? Do you have a baseline for known failures published somewhere?

For the v5.10.y effort we aimed for 100 runs to ensure we have high
confidence in the baseline. That baseline is here:

https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.10.105/xfs/unassigned

For XFS the latest baseline we are tracking on kdevops is v5.17 and you can
see the current results here:

https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.17.0-rc7/xfs/unassigned

This passed 100 loops of fstests already. The target "test steady state"
of 100 is set in kdevops using CONFIG_KERNEL_CI_STEADY_STATE_GOAL=100.

As discussed at LSFMM, is there a chance we can collaborate on a baseline
together? One way I had suggested we could do this across different test
runners is to have a git subtree with the expunges, which we can all
share.
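
As a rough sketch of how shared expunges from two test runners could be
compared, the following assumes each expunge file lists one test name per
line with an optional "#" comment. That file format, the function names
and the test names in the usage example are assumptions made for
illustration, not something taken from kdevops itself.

```python
def parse_expunges(text: str) -> set[str]:
    """Return the set of test names found in an expunge list.

    Assumed format: one test per line, anything after '#' is a comment,
    blank lines are ignored.
    """
    tests = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            tests.add(name)
    return tests


def compare_baselines(mine: set[str], theirs: set[str]) -> dict[str, list[str]]:
    """Split two baselines into shared failures and failures unique to each."""
    return {
        "common": sorted(mine & theirs),
        "only_mine": sorted(mine - theirs),
        "only_theirs": sorted(theirs - mine),
    }


if __name__ == "__main__":
    # Hypothetical expunge lists, for illustration only.
    runner_a = "generic/001 # fails about 1 in 30 runs\nxfs/002\n"
    runner_b = "xfs/002\n\n# a comment-only line\ngeneric/003\n"
    result = compare_baselines(parse_expunges(runner_a), parse_expunges(runner_b))
    for key, names in result.items():
        print(f"{key}: {', '.join(names) or '(none)'}")
```

Tests that show up in only one list are the interesting ones: they are
either a difference in configuration or a failure one side has not run
enough loops to see yet.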

The configuration used is dynamically generated for the target
test dev and pool, but the rest is pretty standard:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

Hearing that fstests was only run for 3 loops gives me a bit of
concern about introducing a regression with a low failure rate. I realize
that we may be limited in resources for running fstests in a loop,
but just 3 loops should take a bit over a day. I think we can do better.
At the very least you can give me your baseline and I can try to confirm
if it matches what I see. Then, 30 patches seems like a lot, so I think it
would be best to add patches to stable 10 at a time max.
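
To put the concern about low failure rates in numbers, here is a small
back-of-the-envelope sketch. It assumes each loop is an independent trial
that hits a given failure with a fixed probability, which is a
simplification, and the 1-in-30 failure rate is only an illustrative
figure, not a measured one.

```python
import math


def detection_probability(failure_rate: float, loops: int) -> float:
    """Chance that at least one of `loops` independent runs hits the failure."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if loops < 0:
        raise ValueError("loops must not be negative")
    return 1.0 - (1.0 - failure_rate) ** loops


def loops_needed(failure_rate: float, confidence: float) -> int:
    """Smallest number of loops that catches the failure with `confidence`."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 1 / 30  # illustrative: a test that fails roughly once in 30 runs
    for loops in (3, 10, 100):
        chance = detection_probability(rate, loops)
        print(f"{loops:3d} loops: {chance:.1%} chance of seeing the failure")
    print("loops for 95% confidence:", loops_needed(rate, 0.95))
```

Under these assumptions, 3 loops would see a 1-in-30 failure less than
10% of the time, while 100 loops would see it more than 95% of the time.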

  Luis