Re: [PATCH] generic/623: fix test for runing on overlayfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 1, 2022 at 9:26 AM Zorro Lang <zlang@xxxxxxxxxx> wrote:
>
> On Tue, May 31, 2022 at 12:42:01PM +0300, Amir Goldstein wrote:
> > On Tue, May 31, 2022 at 11:59 AM Zorro Lang <zlang@xxxxxxxxxx> wrote:
> > >
> > > On Tue, May 31, 2022 at 07:58:48AM +1000, Dave Chinner wrote:
> > > > On Mon, May 30, 2022 at 07:17:42PM +0300, Amir Goldstein wrote:
> > > > > On Mon, May 30, 2022 at 4:29 PM Zorro Lang <zlang@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Mon, May 30, 2022 at 02:29:05PM +0300, Amir Goldstein wrote:
> > > > > > > For this test to run on overlayfs we open a different file to perform
> > > > > > > shutdown+fsync while keeping the writeback target file open.
> > > > > > >
> > > > > > > We should probably perform fsync on the writeback target file, but
> > > > > > > the bug is reproduced on xfs and overlayfs+xfs also as is.
> > > > > > >
> > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > > > > > > ---
> > > > > > >
> > > > > > > Zorro,
> > > > > > >
> > > > > > > I tested that this test passes for both xfs and overlayfs+xfs on v5.18
> > > > > > > and tested that both configs fail with the same warning on v5.10.109.
> > > > > > >
> > > > > > > I tried changing fsync to syncfs for the test to be more correct in the
> > > > > > > overlayfs case, but then golden output of xfs and overlayfs+xfs differ
> > > > > > > and that would need some more output filtering (or disregarding output
> > > > > > > completely).
> > > > > > >
> > > > > > > Since this minimal change does the job and does not change test behavior
> > > > > > > on xfs on any of the tested kernels, I thought it might be good enough.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Amir.
> > > > > > >
> > > > > > >  tests/generic/623 | 5 ++++-
> > > > > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/tests/generic/623 b/tests/generic/623
> > > > > > > index ea016d91..bb36ad25 100755
> > > > > > > --- a/tests/generic/623
> > > > > > > +++ b/tests/generic/623
> > > > > > > @@ -24,10 +24,13 @@ _scratch_mount
> > > > > > >  # XFS had a regression where it failed to check shutdown status in the fault
> > > > > > >  # path. This produced an iomap warning because writeback failure clears Uptodate
> > > > > > >  # status on the page.
> > > > > > > +# For this test to run on overlayfs we open a different file to perform
> > > > > > > +# shutdown+fsync while keeping the writeback target file open.
> > > >
> > > > To trigger the original bug, the post-shutdown fsync needs to be run
> > > > on the original file. That triggers a writeback error writeback
> > > > which clears the uptodate state on the mapped page. The mwrite that
> > > > follows then trips over the state of the page and attempts IO
> > > > operations without first checking shutdown state.
> > > >
> > > > Hence moving the fsync to a different file will break the mechanism
> > > > the regression test uses to trigger the original bug.
> > > >
> > > > > > >  file=$SCRATCH_MNT/file
> > > > > > >  $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io
> > > > > > >  ulimit -c 0
> > > > > > > -$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" -c shutdown -c fsync \
> > > > > > > +$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \
> > > > > > > +     -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \
> > > > > >
> > > > > > Did you try to reproduce the original bug which this test case covers?
> > > > > >
> > > > >
> > > > > Yes. As I wrote:
> > > > > "tested that both configs fail with the same warning on v5.10.109"
> > > > > Meaning the same bug that the test triggered before my change
> > > > > in v5.10 is still triggered on xfs in v5.10 and it is triggered on both
> > > > > xfs and overlayfs+xfs in v5.10 with my change.
> > > >
> > > > It reproduced on 5.10, but not because of the reasons you are
> > > > suggesting.
> > > >
> > > > >
> > > > > > According to the "man xfs_io":
> > > > > >
> > > > > >        open [[ -acdfrstRTPL ] path ]
> > > > > >               Closes the current file, and opens the file specified by path instead.
> > > > >
> > > > > The documentation is incorrect.
> > > > > Current file is not closed.
> > > >
> > > > It is not closed, but it's also not the target of subsequent file
> > > > operations until you use "file 0" to switch back to it...
> > >
> > > I checked the xfsprogs/io/open.c, it truely doesn't "Close the current file",
> > > (need to update the man-page?) and it looks like make all opened files as a
> > > list, then operate them one by one(?):
> > >
> > >   execve("/usr/sbin/xfs_io", ["xfs_io", "-c", "mmap 0 4k", "-c", "mwrite 0 4k", "-c", "open testfile", "-c", "fsync", "-c", "mwrite 0 4k", "testfile"], 0x7ffdc2285b78 /* 42 vars */) = 0
> > >   ...
> > >   openat(AT_FDCWD, "testfile", O_RDWR)    = 3
> > >   ...
> > >   mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, 3, 0) = 0x7efc04548000
> > >   ...
> > >   openat(AT_FDCWD, "testfile", O_RDWR)    = 4
> > >   ...
> > >   fsync(3)                                = 0
> > >   fsync(4)                                = 0
> > >   exit_group(0)                           = ?
> > >   +++ exited with 0 +++
> > >
> > > Anyway, looks like we're stuck at here.
> >
> > We are not stuck at all.
> > It is very simple to fix.
> > As you can see from the strace above and from Dave explanation,
> > The test as I posted in v1 already does the right thing, because
> > it does fsync the original file.
> > Which explains why the test I posted DOES work as expected on
> > both xfs and overlayfs and DOES reproduce the bug.
> >
> > However, if we want to write the test in a way that does NOT assume
> > that xfs_io -c fsync operates on all the open files, I need to add some
> > more commands:
> >
> > $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \
> >         -c "open $(_scratch_shutdown_handle)" -c shutdown \
> >         -c "file 0" -c fsync \
> >         -c "mwrite 0 4k" $file | _filter_scratch | _filter_xfs_io
> >
> > The only problem is this complicates the golden output
> > which does not produce the same output for overlayfs and xfs
> > open files, so I need to write another filter for that.
> >
> > The question is, is it worth it?
> >
> > Are we ever going to change xfs_io -c fsync to behave
> > differently and fsync only the current file?
> >
> > If not, that I can just change the comment and keep the command
> > and golden output:
> >
> >  # path. This produced an iomap warning because writeback failure
> > clears Uptodate
> >  # status on the page.
> >  # For this test to run on overlayfs we open a different file to perform
> > -# shutdown+fsync while keeping the writeback target file open.
> > +# shutdown while keeping the writeback target file open.
> > +# xfs_io -c fsync post-shutdown performs fsync also on the writeback
> > target file,
> > +# which is critical for triggering the writeback failure.
> >  file=$SCRATCH_MNT/file
> >  $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io
> >  $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \
> >         -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \
> >         -c "mwrite 0 4k" $file | _filter_xfs_io
>
> OK, if you persist, my personal opinion is: at least don't change the
> original test logic of other fs especially when they're not wrong but
> the change is a little tricky.
>
> For example (not tested):
>
>   shutdown_cmd="shutdown"
>   # shutdown the underlying fs if there is
>   shutdown_handle="$(_scratch_shutdown_handle)"
>   if [ "$shutdown_handle" != "$SCRATCH_MNT" ];then
>           shutdown_cmd="\"open $shutdown_handle\" -c shutdown -c close"
>   fi
>   $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" -c ${shutdown_cmd}
>           -c fsync -c "mwrite 0 4k" $file | _filter_xfs_io
>

Ok, that works for me, but the extra close changes golder output
for overlayfs:

$ xfs_io -f x -c "open y" -c close
[000] x              (foreign,non-sync,non-direct,read-write)

So either we add more logic to filter that out or we just leave out the close
making the test in overlayfs depends on the fsync-all behavior of xfs_io
and leaving the test logic on xfs unchanged per your proposal.

If you are ok with that, I will test the version that you proposed and re-post.

Thanks,
Amir.



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux