Re: [PATCH 0/2] fstest changes for LBS

On Tue, Jan 23, 2024 at 09:21:50PM +0100, Pankaj Raghav wrote:
> On 23/01/2024 20:42, Ritesh Harjani (IBM) wrote:
> > Pankaj Raghav <p.raghav@xxxxxxxxxxx> writes:
> > 
> >>>> CCing Ritesh as I saw him post a patch to fix a testcase for 64k block size.
> >>>
> >>> Hi Pankaj,
> >>>
> >>> So I tested this on Linux 6.6 on a Power8 qemu (which I had handy).
> >>> xfs/558 passed with both 64k blocksize & with 4k blocksize on a 64k
> >>> pagesize system.
> > 
> > Ok, so it looks like the testcase xfs/558 is failing on linux-next with
> > 64k blocksize but passing with 4k blocksize.
> > I thought it was passing on my previous Linux 6.6 release, but I guess
> > those too were just some lucky runs. Here is the report -
> > 
> > linux-next: xfs/558 aggregate results across 11 runs: pass=2 (18.2%), fail=9 (81.8%)
> > v6.6: xfs/558 aggregate results across 11 runs: pass=5 (45.5%), fail=6 (54.5%)
> > 
> 
> Oh, thanks for reporting back!
> 
> I can confirm that it happens 100% of the time with my LBS patch enabled for 64k bs.
> 
> Let's see what Zorro reports back on a real 64k hardware.
> 
> > So I guess I will spend some time analyzing why it fails.
> > 
> 
> Could you try the patch I sent for xfs/558 and see if it works all the time?
> 
> The issue is that 'xfs_wb*iomap_invalid' does not get triggered when we have a
> larger bs. I basically increased the blksz in the test based on the underlying bs.
> Maybe there is a better solution than what I proposed, but it fixes the test.
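
(For reference, a rough sketch of that kind of scaling in xfstests shell style;
the _get_block_size and _pwrite_byte helpers come from common/rc, while the
file name and the scale factors below are purely illustrative and not the
actual patch:)

	# Scale the write size with the fs block size so larger-bs configs
	# still dirty enough blocks to race buffered write against writeback.
	fs_blksz=$(_get_block_size $SCRATCH_MNT)	# xfstests helper from common/rc
	blksz=$((fs_blksz * 16))			# illustrative factor only
	_pwrite_byte 0x58 0 $((blksz * 64)) $SCRATCH_MNT/testfile >> $seqres.full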

The only improvement I can think of would be to force-disable large
folios on the file being tested.  Large folios mess with testing because
the race depends on write and writeback needing to walk multiple pages.
Right now the pagecache only creates large folios when the IO pattern consists
of large IOs, but in theory that could change some day.

I suspect that the iomap tracepoint data and possibly
trace_mm_filemap_add_to_page_cache might help figure out what size
folios are actually in use during the invalidation test.
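
Something like this rough sketch would capture both while the test runs
(assuming trace-cmd is installed; writing the equivalent settings to
/sys/kernel/tracing works just as well):

	# enable all iomap events plus the page cache insertion tracepoint
	trace-cmd record -e iomap -e filemap:mm_filemap_add_to_page_cache \
		./check xfs/558
	trace-cmd report > 558-trace.txt	# then inspect which events fired
						# and what folio sizes show up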

(Perhaps it's time for me to add a 64k bs VM to the test fleet.)

--D

> > Failure log
> > ================
> > xfs/558 36s ... - output mismatch (see /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad)
> >     --- tests/xfs/558.out       2023-06-29 12:06:13.824276289 +0000
> >     +++ /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad       2024-01-23 18:54:56.613116520 +0000
> >     @@ -1,2 +1,3 @@
> >      QA output created by 558
> >     +Expected to hear about writeback iomap invalidations?
> >      Silence is golden
> >     ...
> >     (Run 'diff -u /root/xfstests-dev/tests/xfs/558.out /root/xfstests-dev/results//xfs_64k_iomap/xfs/558.out.bad'  to see the entire diff)
> > 
> > HINT: You _MAY_ be missing kernel fix:
> >       5c665e5b5af6 xfs: remove xfs_map_cow
> > 
> > -ritesh
> > 
> >>
> >> Thanks for testing it out. I will investigate this further, and see why
> >> I have this failure in LBS for 64k and not for 32k and 16k block sizes.
> >>
> >> As this test also expects some invalidation during the page cache writeback,
> >> this might be an issue just with LBS and not with 64k page size machines.
> >>
> >> Probably I will also spend some time to set up a Power8 qemu to test these failures.
> >>
> >>> However, since the quota tools on this system were v4.05, which does not
> >>> support the bigtime feature, I could not run xfs/161.
> >>>
> >>> xfs/161       [not run] quota: bigtime support not detected
> >>> xfs/558 7s ...  21s
> >>>
> >>> I will collect this info on a different system with latest kernel and
> >>> will update for xfs/161 too.
> >>>
> >>
> >> Sounds good! Thanks!
> >>
> >>> -ritesh
