On Wed, Jun 1, 2022 at 9:26 AM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > On Tue, May 31, 2022 at 12:42:01PM +0300, Amir Goldstein wrote: > > On Tue, May 31, 2022 at 11:59 AM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > > > > > On Tue, May 31, 2022 at 07:58:48AM +1000, Dave Chinner wrote: > > > > On Mon, May 30, 2022 at 07:17:42PM +0300, Amir Goldstein wrote: > > > > > On Mon, May 30, 2022 at 4:29 PM Zorro Lang <zlang@xxxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, May 30, 2022 at 02:29:05PM +0300, Amir Goldstein wrote: > > > > > > > For this test to run on overlayfs we open a different file to perform > > > > > > > shutdown+fsync while keeping the writeback target file open. > > > > > > > > > > > > > > We should probably perform fsync on the writeback target file, but > > > > > > > the bug is reproduced on xfs and overlayfs+xfs also as is. > > > > > > > > > > > > > > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx> > > > > > > > --- > > > > > > > > > > > > > > Zorro, > > > > > > > > > > > > > > I tested that this test passes for both xfs and overlayfs+xfs on v5.18 > > > > > > > and tested that both configs fail with the same warning on v5.10.109. > > > > > > > > > > > > > > I tried changing fsync to syncfs for the test to be more correct in the > > > > > > > overlayfs case, but then golden output of xfs and overlayfs+xfs differ > > > > > > > and that would need some more output filtering (or disregarding output > > > > > > > completely). > > > > > > > > > > > > > > Since this minimal change does the job and does not change test behavior > > > > > > > on xfs on any of the tested kernels, I thought it might be good enough. > > > > > > > > > > > > > > Thanks, > > > > > > > Amir. > > > > > > > > > > > > > > tests/generic/623 | 5 ++++- > > > > > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/tests/generic/623 b/tests/generic/623 > > > > > > > index ea016d91..bb36ad25 100755 > > > > > > > --- a/tests/generic/623 > > > > > > > +++ b/tests/generic/623 > > > > > > > @@ -24,10 +24,13 @@ _scratch_mount > > > > > > > # XFS had a regression where it failed to check shutdown status in the fault > > > > > > > # path. This produced an iomap warning because writeback failure clears Uptodate > > > > > > > # status on the page. > > > > > > > +# For this test to run on overlayfs we open a different file to perform > > > > > > > +# shutdown+fsync while keeping the writeback target file open. > > > > > > > > To trigger the original bug, the post-shutdown fsync needs to be run > > > > on the original file. That triggers a writeback error writeback > > > > which clears the uptodate state on the mapped page. The mwrite that > > > > follows then trips over the state of the page and attempts IO > > > > operations without first checking shutdown state. > > > > > > > > Hence moving the fsync to a different file will break the mechanism > > > > the regression test uses to trigger the original bug. > > > > > > > > > > > file=$SCRATCH_MNT/file > > > > > > > $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io > > > > > > > ulimit -c 0 > > > > > > > -$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" -c shutdown -c fsync \ > > > > > > > +$XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ > > > > > > > + -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \ > > > > > > > > > > > > Did you try to reproduce the original bug which this test case covers? > > > > > > > > > > > > > > > > Yes. As I wrote: > > > > > "tested that both configs fail with the same warning on v5.10.109" > > > > > Meaning the same bug that the test triggered before my change > > > > > in v5.10 is still triggered on xfs in v5.10 and it is triggered on both > > > > > xfs and overlayfs+xfs in v5.10 with my change. > > > > > > > > It reproduced on 5.10, but not because of the reasons you are > > > > suggesting. > > > > > > > > > > > > > > > According to the "man xfs_io": > > > > > > > > > > > > open [[ -acdfrstRTPL ] path ] > > > > > > Closes the current file, and opens the file specified by path instead. > > > > > > > > > > The documentation is incorrect. > > > > > Current file is not closed. > > > > > > > > It is not closed, but it's also not the target of subsequent file > > > > operations until you use "file 0" to switch back to it... > > > > > > I checked the xfsprogs/io/open.c, it truely doesn't "Close the current file", > > > (need to update the man-page?) and it looks like make all opened files as a > > > list, then operate them one by one(?): > > > > > > execve("/usr/sbin/xfs_io", ["xfs_io", "-c", "mmap 0 4k", "-c", "mwrite 0 4k", "-c", "open testfile", "-c", "fsync", "-c", "mwrite 0 4k", "testfile"], 0x7ffdc2285b78 /* 42 vars */) = 0 > > > ... > > > openat(AT_FDCWD, "testfile", O_RDWR) = 3 > > > ... > > > mmap(NULL, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_SHARED, 3, 0) = 0x7efc04548000 > > > ... > > > openat(AT_FDCWD, "testfile", O_RDWR) = 4 > > > ... > > > fsync(3) = 0 > > > fsync(4) = 0 > > > exit_group(0) = ? > > > +++ exited with 0 +++ > > > > > > Anyway, looks like we're stuck at here. > > > > We are not stuck at all. > > It is very simple to fix. > > As you can see from the strace above and from Dave explanation, > > The test as I posted in v1 already does the right thing, because > > it does fsync the original file. > > Which explains why the test I posted DOES work as expected on > > both xfs and overlayfs and DOES reproduce the bug. > > > > However, if we want to write the test in a way that does NOT assume > > that xfs_io -c fsync operates on all the open files, I need to add some > > more commands: > > > > $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ > > -c "open $(_scratch_shutdown_handle)" -c shutdown \ > > -c "file 0" -c fsync \ > > -c "mwrite 0 4k" $file | _filter_scratch | _filter_xfs_io > > > > The only problem is this complicates the golden output > > which does not produce the same output for overlayfs and xfs > > open files, so I need to write another filter for that. > > > > The question is, is it worth it? > > > > Are we ever going to change xfs_io -c fsync to behave > > differently and fsync only the current file? > > > > If not, that I can just change the comment and keep the command > > and golden output: > > > > # path. This produced an iomap warning because writeback failure > > clears Uptodate > > # status on the page. > > # For this test to run on overlayfs we open a different file to perform > > -# shutdown+fsync while keeping the writeback target file open. > > +# shutdown while keeping the writeback target file open. > > +# xfs_io -c fsync post-shutdown performs fsync also on the writeback > > target file, > > +# which is critical for triggering the writeback failure. > > file=$SCRATCH_MNT/file > > $XFS_IO_PROG -fc "pwrite 0 4k" -c fsync $file | _filter_xfs_io > > $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" \ > > -c "open $(_scratch_shutdown_handle)" -c shutdown -c fsync \ > > -c "mwrite 0 4k" $file | _filter_xfs_io > > OK, if you persist, my personal opinion is: at least don't change the > original test logic of other fs especially when they're not wrong but > the change is a little tricky. > > For example (not tested): > > shutdown_cmd="shutdown" > # shutdown the underlying fs if there is > shutdown_handle="$(_scratch_shutdown_handle)" > if [ "$shutdown_handle" != "$SCRATCH_MNT" ];then > shutdown_cmd="\"open $shutdown_handle\" -c shutdown -c close" > fi > $XFS_IO_PROG -x -c "mmap 0 4k" -c "mwrite 0 4k" -c ${shutdown_cmd} > -c fsync -c "mwrite 0 4k" $file | _filter_xfs_io > Ok, that works for me, but the extra close changes golder output for overlayfs: $ xfs_io -f x -c "open y" -c close [000] x (foreign,non-sync,non-direct,read-write) So either we add more logic to filter that out or we just leave out the close making the test in overlayfs depends on the fsync-all behavior of xfs_io and leaving the test logic on xfs unchanged per your proposal. If you are ok with that, I will test the version that you proposed and re-post. Thanks, Amir.