On Tue, May 31, 2022 at 11:59 AM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > On Tue, May 31, 2022 at 07:58:48AM +1000, Dave Chinner wrote: > > On Mon, May 30, 2022 at 07:17:42PM +0300, Amir Goldstein wrote: > > > On Mon, May 30, 2022 at 4:29 PM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > > > > > > > On Mon, May 30, 2022 at 02:29:05PM +0300, Amir Goldstein wrote: > > > > > For this test to run on overlayfs we open a different file to perform > > > > > shutdown+fsync while keeping the writeback target file open. > > > > > > > > > > We should probably perform fsync on the writeback target file, but > > > > > the bug is reproduced on xfs and overlayfs+xfs also as is. > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > --- > > > > > > > > > > Zorro, > > > > > > > > > > I tested that this test passes for both xfs and overlayfs+xfs on v5.18 > > > > > and tested that both configs fail with the same warning on v5.10.109. > > > > > > > > > > I tried changing fsync to syncfs for the test to be more correct in the > > > > > overlayfs case, but then golden output of xfs and overlayfs+xfs differ > > > > > and that would need some more output filtering (or disregarding output > > > > > completely). > > > > > > > > > > Since this minimal change does the job and does not change test behavior > > > > > on xfs on any of the tested kernels, I thought it might be good enough. > > > > > > > > > > Thanks, > > > > > Amir. > > > > > > > > > > tests/generic/623 | 5 ++++- > > > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/tests/generic/623 b/tests/generic/623 > > > > > index ea016d91..bb36ad25 100755 > > > > > --- a/tests/generic/623 > > > > > +++ b/tests/generic/623 > > > > > @@ -24,10 +24,13 @@ _scratch_mount > > > > > # XFS had a regression where it failed to check shutdown status in the fault > > > > > # path. This produced an iomap warning because writeback failure clears Uptodate > > > > > # status on the page. > > > > > +# For this test to run on overlayfs we open a different file to perform > > > > > +# shutdown+fsync while keeping the writeback target file open. > > > > To trigger the original bug, the post-shutdown fsync needs to be run > > on the original file. That triggers a writeback error writeback > > which clears the uptodate state on the mapped page. The mwrite that > > follows then trips over the state of the page and attempts IO > > operations without first checking shutdown state. > > > > Hence moving the fsync to a different file will break the mechanism > > the regression test uses to trigger the original bug. > > > > > > > file=$SCRATCH_MNT/file > > > > > $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io > > > > > ulimit -c 0 > > > > > -$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" -c shutdown -c fsync \ > > > > > +$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ > > > > > + -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \ > > > > > > > > Did you try to reproduce the original bug which this test case covers? > > > > > > > > > > Yes. As I wrote: > > > "tested that both configs fail with the same warning on v5.10.109" > > > Meaning the same bug that the test triggered before my change > > > in v5.10 is still triggered on xfs in v5.10 and it is triggered on both > > > xfs and overlayfs+xfs in v5.10 with my change. > > > > It reproduced on 5.10, but not because of the reasons you are > > suggesting. > > > > > > > > > According to the "man xfs_io": > > > > > > > > open [[ -acdfrstRTPL ] path ] > > > > Closes the current file, and opens the file specified by path instead. > > > > > > The documentation is incorrect. > > > Current file is not closed. > > > > It is not closed, but it's also not the target of subsequent file > > operations until you use "file 0" to switch back to it... > > I checked the xfsprogs/io/open.c, it truely doesn't "Close the current file", > (need to update the man-page?) and it looks like make all opened files as a > list, then operate them one by one(?): > > execve("/usr/sbin/xfs_io", ["xfs_io", "-c", "mmap 0 4k", "-c", "mwrite 0 4k", "-c", "open testfile", "-c", "fsync", "-c", "mwrite 0 4k", "testfile"], 0x7ffdc2285b78 /* 42 vars */) = 0 > ... > openat(AT_FDCWD, "testfile", O_RDWR) = 3 > ... > mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, 3, 0) = 0x7efc04548000 > ... > openat(AT_FDCWD, "testfile", O_RDWR) = 4 > ... > fsync(3) = 0 > fsync(4) = 0 > exit_group(0) = ? > +++ exited with 0 +++ > > Anyway, looks like we're stuck at here. We are not stuck at all. It is very simple to fix. As you can see from the strace above and from Dave explanation, The test as I posted in v1 already does the right thing, because it does fsync the original file. Which explains why the test I posted DOES work as expected on both xfs and overlayfs and DOES reproduce the bug. However, if we want to write the test in a way that does NOT assume that xfs_io -c fsync operates on all the open files, I need to add some more commands: $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ -c "open $(_scratch_shutdown_handle)" -c shutdown \ -c "file 0" -c fsync \ -c "mwrite 0 4k" $file | _filter_scratch | _filter_xfs_io The only problem is this complicates the golden output which does not produce the same output for overlayfs and xfs open files, so I need to write another filter for that. The question is, is it worth it? Are we ever going to change xfs_io -c fsync to behave differently and fsync only the current file? If not, that I can just change the comment and keep the command and golden output: # path. This produced an iomap warning because writeback failure clears Uptodate # status on the page. # For this test to run on overlayfs we open a different file to perform -# shutdown+fsync while keeping the writeback target file open. +# shutdown while keeping the writeback target file open. +# xfs_io -c fsync post-shutdown performs fsync also on the writeback target file, +# which is critical for triggering the writeback failure. file=$SCRATCH_MNT/file $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \ -c "mwrite 0 4k" $file | _filter_xfs_io > So I have 3 crude methods which is just > out of my mind, hope to get review:) > > 1) Skip this test for overlay directly, as this patch does. > 2) Add a _require_real_scratch_shutdown() helper, require the $FSTYP supporting > GOINGDOWN ioctl, don't fallback to the underlying fs. Then let generic/623 > use it to skip overlay and other fs which really doesn't support shutdown. > 3) Change xfsprogs/io/shutdown.c, let shutdown_f() accept an extra mountpoint > path argument, to shutdown a specified mountpoint (?? not sure if it's > reasonable for xfs_io :). Then let generic/623 require and use this feature. > All of the above are overkills. The test fix already works. Thanks, Amir.