On Mon, Jun 06, 2022 at 08:55:24AM -0700, Luis Chamberlain wrote:
> On Sat, Jun 04, 2022 at 11:38:35AM +0300, Amir Goldstein wrote:
> > On Sat, Jun 4, 2022 at 6:53 AM Leah Rumancik <leah.rumancik@xxxxxxxxx> wrote:
> > >
> > > From: Leah Rumancik <lrumancik@xxxxxxxxxx>
> > >
> > > This first round of patches aims to take care of the easy cases -
> > > patches with the Fixes tag that apply cleanly. I have ~30 more
> > > patches identified which will be tested next; thanks everyone for
> > > the various suggestions for tracking down more bug fixes. No
> > > regressions were seen during testing when running fstests 3 times
> > > per config with the following configs:
>
> Leah,
>
> It is great to see this work move forward.
>
> How many times was fstests run *without* the patches to establish the
> baseline? Do you have a baseline for known failures published somewhere?

Currently, the tests are being run 10x per config without the patches.
If a failure is seen with the patches, the tests are rerun on the
baseline several hundred times to see whether the failure is a
regression or to determine the baseline failure rate.

> For the v5.10.y effort we aimed for 100 runs to ensure we have high
> confidence in the baseline. That baseline is here:
>
> https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.10.105/xfs/unassigned
>
> For XFS the latest baseline we are tracking on kdevops is v5.17 and you
> can see the current results here:
>
> https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/5.17.0-rc7/xfs/unassigned
>
> This has already passed 100 loops of fstests. The target "test steady
> state" of 100 is set in kdevops using CONFIG_KERNEL_CI_STEADY_STATE_GOAL=100.
>
> As discussed at LSFMM, is there a chance we can collaborate on a
> baseline together? One way I had suggested we could do this for
> different test runners is to have a git subtree with the expunges
> which we can all share.

Could you elaborate on this a bit? Are you hoping to gain insight from
comparing the 5.10.y baseline with the 5.15.y baseline, or are you
hoping to allow people working on the same stable branch to have a
joint record of test run output? (My guess at what the subtree setup
might look like is at the end of this mail.)

> The configuration used is dynamically generated for the target
> test dev and pool, but the rest is pretty standard:
>
> https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config
>
> Hearing that fstests is run for only 3 loops gives me some concern
> about introducing a regression with a low failure rate. I realize
> that we may be limited in resources for running fstests in a loop,
> but just 3 loops should take only a bit over a day. I think we can do
> better. At the very least you can give me your baseline and I can try
> to confirm whether it matches what I see.

I can go ahead and bump up the number of test runs. It would be nice to
agree on the number of test runs and the specific configs to test. For
a fixed amount of resources there is a tradeoff between broader
coverage through more configs versus more reliable results with fewer
configs, and I am not sure where everyone's priorities lie. After the
new runs, I'll go ahead and post the baseline and send out a link so we
can compare (a sketch of the loop I'm running is below my signature).

> Then, 30 patches seems like a lot, so I think it
> would be best to add patches to stable 10 at a time max.

I am planning on batching into smaller groups; 10 at a time works for
me.

Best,
Leah
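
P.S. To make the looped runs above concrete, this is roughly the kind
of driver script I have in mind; the fstests checkout path, the expunge
file, and the run count are placeholders, and flag behavior may vary a
bit between fstests versions:

  #!/bin/sh
  # Loop fstests to estimate a per-test failure rate; a test that only
  # fails occasionally needs many runs before we can call it a regression.
  RUNS=10                              # placeholder run count
  EXPUNGES=/path/to/expunges.txt       # known baseline failures to skip
  RESULTS=$PWD/results-$(uname -r)     # keep logs outside the checkout
  mkdir -p "$RESULTS"

  cd /path/to/xfstests || exit 1
  for i in $(seq 1 "$RUNS"); do
      # -g auto runs the standard regression group; -E excludes the
      # tests listed in the expunge file.
      ./check -g auto -E "$EXPUNGES" > "$RESULTS/run-$i.log" 2>&1
      # Collect the "Failures:" summary from each run so a flaky test
      # shows up as a rate across runs rather than a single data point.
      grep "^Failures:" "$RESULTS/run-$i.log" >> "$RESULTS/failures.txt"
  done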
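
P.P.S. On the git subtree idea: if you mean something along these
lines, it could work for us (the repository URL and prefix below are
made up for illustration):

  # Graft a shared expunge repository into a test runner's tree:
  git subtree add --prefix=expunges \
      https://example.com/shared/fstests-expunges.git main --squash

  # Later, pull in expunge updates contributed by other test runners:
  git subtree pull --prefix=expunges \
      https://example.com/shared/fstests-expunges.git main --squash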