Re: [PATCH RFC] generic/730: ensure EIO after device delete

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Sat, Mar 16, 2024 at 2:39 PM Boris Burkov <boris@xxxxxx> wrote:
>
> On Sat, Mar 16, 2024 at 01:08:57PM +0000, Filipe Manana wrote:
> > On Fri, Mar 15, 2024 at 11:30 PM Boris Burkov <boris@xxxxxx> wrote:
> > >
> > > On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote:
> > > > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote:
> > > > > This test removes a SCSI debug device out from under a mounted
> > > > > filesystem with a (probably) dirty file. This assumes that page cache
> > > > > cannot save us from EIO, for a reason that I can't quite explain. In
> > > > > fact, this test fails for exactly that reason, at least on btrfs.
> > > > >
> > > > > The original patches:
> > > > >
> > > > >       https://lore.kernel.org/fstests/20230807112100.GB15405@xxxxxx/
> > > > >
> > > > > refer to this passing on xfs and not btrfs, so I suspect I am missing
> > > > > something. With that said, on my machine this actually fails on xfs with
> > > > > and without my patch, so this is clearly not enough.
> > > > >
> > > > > High level, I am trying to understand what is really the expected
> > > > > behavior from a filesystem under this condition and what this test is
> > > > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs
> > > > > does pass with this additional syncing/cache dropping to nudge it to an
> > > > > error.
> > > >
> > > > Does btrfs prefetch pagecache data as soon as a file opens?  Or I guess
> > > > it could be that xfs trips an IO error and shuts down, and
> > > > xfs_file_read_iter will return EIO after that happens.
> > > >
> > > > --D
> > > >
> > >
> > > I might be wildly misunderstanding, since I'm very far from an expert on
> > > the page cache, but didn't we just do a buffered write to the file? So it
> > > would make sense for it to be in cache?
> > >
> > > At any rate, what I observed on my system was that xfs does not pass
> > > this test. Curious if that is only on my system or just the current
> > > state of the test.
> >
> > Instead of xfs, do you mean btrfs?
> >
> > For me xfs and ext4 always pass this test, i.e. the cat command fails
> > with -EIO, both with and without your patch.
> >
> > On the other hand, btrfs and f2fs always fail without your patch, and
> > pass with your patch applied.
>
> $ sudo ./check generic/730
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 v0 6.8.0-rc7+ #205 SMP PREEMPT_DYNAMIC Thu Mar 14 16:44:32 PDT 2024
> MKFS_OPTIONS  -- -f /dev/mapper/tst-scr0
> MOUNT_OPTIONS -- /dev/mapper/tst-scr0 /mnt/scratch
>
> generic/730 1s ... [failed, exit status 1]- output mismatch (see /home/vmuser/fstests/results//generic/730.out.bad)
>     --- tests/generic/730.out   2023-11-01 22:14:50.011000000 +0000
>     +++ /home/vmuser/fstests/results//generic/730.out.bad       2024-03-16 14:38:02.431000000 +0000
>     @@ -1,2 +1 @@
>      QA output created by 730
>     -cat: -: Input/output error
>     ...
>     (Run 'diff -u /home/vmuser/fstests/tests/generic/730.out /home/vmuser/fstests/results//generic/730.out.bad'  to see the entire diff)
> Ran: generic/730
> Failures: generic/730
> Failed 1 of 1 tests
>
> I suspect it has something to do with my Kconfig. Perhaps "xfs (non-debug)" is a clue?

Mine is also non-debug and yet it always succeeds for me (without your
patch or any other of course):

root 17:40:02 /home/fdmanana/git/hub/xfstests > ./check generic/730
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 debian0 6.8.0-rc7-btrfs-next-153+ #1 SMP
PREEMPT_DYNAMIC Mon Mar  4 17:19:19 WET 2024
MKFS_OPTIONS  -- -f /dev/sdb
MOUNT_OPTIONS -- /dev/sdb /home/fdmanana/btrfs-tests/scratch_1

generic/730 2s ...  2s
Ran: generic/730
Passed all 1 tests


>
> >
> > From a btrfs point of view, it makes sense for the cat command to
> > succeed, since all the file data is in the page cache.
>
> That was my intuition as well, as I mentioned above.
>
> >
> > I don't know why the cat command fails (-EIO) with xfs and ext4, and I
> > haven't looked at any of their source code (my knowledge of those
> > filesystem's internals is too little anyway).
> >
> > An alternative to your patch is just to make the write with direct IO
> > by passing -d to $XFS_IO_PROG. It makes the test succeed on these 4
> > filesystems.
> >
> >
>
> I like this idea a lot more than my patch!
>
> > >
> > > Summary of my test matrix:
> > > ext4: passes with and without this patch
> > > btrfs: fails without this patch, passes with it
> > > xfs: fails with and without this patch
> > >
> > > For that reason, I don't think this a good patch, I just mostly want to
> > > figure out where to go with this test!
> > >
> > > Thanks,
> > > Boris
> > >
> > > > > Signed-off-by: Boris Burkov <boris@xxxxxx>
> > > > > ---
> > > > >  tests/generic/730 | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/tests/generic/730 b/tests/generic/730
> > > > > index 11308cdaa..ca5037c57 100755
> > > > > --- a/tests/generic/730
> > > > > +++ b/tests/generic/730
> > > > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile
> > > > >  # delete the scsi debug device while it still has dirty data
> > > > >  echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete
> > > > >
> > > > > +sync
> > > > > +echo 3 > /proc/sys/vm/drop_caches
> > > > > +
> > > > >  # try to read from the file, which should give us -EIO
> > > > >  cat <&3 > /dev/null
> > > > >
> > > > > --
> > > > > 2.43.0
> > > > >
> > > > >
> > >





[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux