On Sat, Jun 04, 2022 at 11:38:35AM +0300, Amir Goldstein wrote:
> On Sat, Jun 4, 2022 at 6:53 AM Leah Rumancik <leah.rumancik@xxxxxxxxx> wrote:
> >
> > From: Leah Rumancik <lrumancik@xxxxxxxxxx>
> >
> > This first round of patches aims to take care of the easy cases - patches
> > with the Fixes tag that apply cleanly. I have ~30 more patches identified
> > which will be tested next, thanks everyone for the various suggestions
> > for tracking down more bug fixes. No regressions were seen during
> > testing when running fstests 3 times per config with the following configs:

Leah,

It is great to see this work move forward.

How many times was fstests run *without* the patches to establish the
baseline? Do you have a baseline for known failures published somewhere?

For the v5.10.y effort we aimed for 100 runs to ensure we have high
confidence in the baseline. That baseline is here:

https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.10.105/xfs/unassigned

For XFS the latest baseline we are tracking on kdevops is v5.17 and you
can see the current results here:

https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.17.0-rc7/xfs/unassigned

This passed 100 loops of fstests already. The target "test steady state"
of 100 is set in kdevops using CONFIG_KERNEL_CI_STEADY_STATE_GOAL=100.

As discussed at LSFMM, is there a chance we can collaborate on a baseline
together? One way I had suggested we could do this for different test
runners is to have a git subtree with the expunges which we can all share
across test runners.

The configuration used is dynamically generated for the target test dev
and pool, but the rest is pretty standard:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

Hearing that fstests was run for only 3 loops gives me a bit of concern
about introducing a regression with a low failure rate.
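
To put a rough number on that concern: if each loop is assumed to be
independent and a regression makes some test fail with probability p per
loop, the chance that n loops hit it at least once is 1 - (1 - p)^n. The
sketch below is only an illustration under that assumption; the function
name and the 1-in-20 failure rate are made up for the example and do not
come from kdevops or any other test runner.

```python
def detection_probability(failure_rate: float, loops: int) -> float:
    """Chance that at least one of `loops` independent runs hits the failure.

    Assumes every loop fails independently with the same probability,
    which is a simplification of how flaky fstests failures behave.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if loops < 0:
        raise ValueError("loops must be non-negative")
    return 1.0 - (1.0 - failure_rate) ** loops


if __name__ == "__main__":
    # Hypothetical regression that shows up in 1 out of 20 loops.
    rate = 0.05
    for loops in (3, 100):
        chance = detection_probability(rate, loops)
        print(f"{loops:3d} loops: {chance:.1%} chance of seeing the failure")
```

With these illustrative numbers, 3 loops would catch the regression only
about 14% of the time, while 100 loops would catch it more than 99% of
the time.
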
I realize that we may be limited in resources for running fstests in a
loop, but just 3 loops should take a bit over a day. I think we can do
better. At the very least you can give me your baseline and I can try to
confirm whether it matches what I see.

Also, 30 patches seems like a lot, so I think it would be best to add
patches to stable at most 10 at a time.

  Luis