Re: [PATCH 5.10 CANDIDATE 0/9] xfs stable candidate patches for 5.10.y (from v5.13+)

On Wed, Jul 27, 2022 at 07:01:15PM -0700, Darrick J. Wong wrote:
> On Wed, Jul 27, 2022 at 09:17:47PM +0200, Amir Goldstein wrote:
> > On Tue, Jul 26, 2022 at 11:21 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > Darrick,
> > >
> > > This backport series contains mostly fixes from v5.14 release along
> > > with three deferred patches from the joint 5.10/5.15 series [1].
> > >
> > > I ran the auto group 10 times on baseline (v5.10.131) and this series
> > > with no observed regressions.
> > >
> > > I ran the recoveryloop group 100 times with no observed regressions.
> > > The soak group run is in progress (10+) with no observed regressions
> > > so far.
> > >
> > > I am somewhat disappointed not to see any improvement in the
> > > results of the recoveryloop tests compared to baseline.
> > >
> > > This is the summary of the recoveryloop test results on both baseline
> > > and backport branch:
> > >
> > > generic/455, generic/457, generic/646: pass
> > > generic/019, generic/475, generic/648: failing often in all configs
> 
> <nod> I posted a couple of patchsets to fstests@ yesterday that might
> help with these recoveryloop tests failing.
> 
> https://lore.kernel.org/fstests/165886493457.1585218.32410114728132213.stgit@magnolia/T/#t
> https://lore.kernel.org/fstests/165886492580.1585149.760428651537119015.stgit@magnolia/T/#t
> https://lore.kernel.org/fstests/165886491119.1585061.14285332087646848837.stgit@magnolia/T/#t
> 
> > > generic/388: failing often with reflink_1024
> > > generic/388: failing at ~1/50 rate for any config
> > > generic/482: failing often on V4 configs
> > > generic/482: failing at ~1/100 rate for V5 configs
> > > xfs/057: failing at ~1/200 rate for any config
> > >
> > > I have observed no failures in the soak group so far, on either baseline
> > > or the backport branch. I will update when I have more results.
> > >
> > 
> > Some more results after 1.5 days of spinning:
> > 1. soak group reached 100 runs (x5 configs) with no failures
> > 2. Also ran all the tests on debian/testing with xfsprogs 5.18 and
> >     observed a fail/pass pattern very similar to that with xfsprogs 5.10
> > 3. Started to run the 3 passing recoveryloop tests 1000 times and
> >     an interesting pattern emerged -
> > 
> > generic/455 failed 3 times on baseline (out of 250 runs x 5 configs),
> > but it has not failed on backport branch yet (700 runs x 5 configs).
> > 
> > And these are not just test failures but proper data corruptions, e.g.
> > "testfile2.mark1 md5sum mismatched" (and not always on mark1)
> 
> Oh good!
> 
> 
> > 
> > I will keep this loop spinning, but I am cautiously optimistic about
> > this being an actual proof of bug fix.
> > 
> > If these results don't change, I would be happy to get an ACK for the
> > series so I can post it after the long soaking.
> 
> Patches 4-9 are an easy
> Acked-by: Darrick J. Wong <djwong@xxxxxxxxxx>

I hit send too fast.

I think patches 1-3 look correct.  I still think it's sort of risky,
but your testing shows that things at least get better and don't
immediately explode or anything. :)

By my recollection of the log changes between 5.10 and 5.17 I think the
lsn/cil split didn't change all that much, so if you get to the end of
the week with no further problems then I say Acked-by for them too.

--D

> 
> 
> --D
> 
> > Thanks,
> > Amir.


