Re: [PATCH RFC] generic/730: ensure EIO after device delete

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Sat, Mar 16, 2024 at 01:08:57PM +0000, Filipe Manana wrote:
> On Fri, Mar 15, 2024 at 11:30 PM Boris Burkov <boris@xxxxxx> wrote:
> >
> > On Thu, Mar 14, 2024 at 07:56:28PM -0700, Darrick J. Wong wrote:
> > > On Wed, Mar 13, 2024 at 04:47:47PM -0700, Boris Burkov wrote:
> > > > This test removes a SCSI debug device out from under a mounted
> > > > filesystem with a (probably) dirty file. This assumes that page cache
> > > > cannot save us from EIO, for a reason that I can't quite explain. In
> > > > fact, this test fails for exactly that reason, at least on btrfs.
> > > >
> > > > The original patches:
> > > >
> > > >       https://lore.kernel.org/fstests/20230807112100.GB15405@xxxxxx/
> > > >
> > > > refer to this passing on xfs and not btrfs, so I suspect I am missing
> > > > something. With that said, on my machine this actually fails on xfs with
> > > > and without my patch, so this is clearly not enough.
> > > >
> > > > High level, I am trying to understand what is really the expected
> > > > behavior from a filesystem under this condition and what this test is
> > > > getting at. Of btrfs, ext4, and xfs, only ext4 passes it, while btrfs
> > > > does pass with this additional syncing/cache dropping to nudge it to an
> > > > error.
> > >
> > > Does btrfs prefetch pagecache data as soon as a file opens?  Or I guess
> > > it could be that xfs trips an IO error and shuts down, and
> > > xfs_file_read_iter will return EIO after that happens.
> > >
> > > --D
> > >
> >
> > I might be wildly misunderstanding, since I'm very far from an expert on
> > the page cache, but didn't we just do a buffered write to the file? So it
> > would make sense for it to be in cache?
> >
> > At any rate, what I observed on my system was that xfs does not pass
> > this test. Curious if that is only on my system or just the current
> > state of the test.
> 
> Instead of xfs, do you mean btrfs?
> 
> For me xfs and ext4 always pass this test, i.e. the cat command fails
> with -EIO, both with and without your patch.
> 
> On the other hand, btrfs and f2fs always fail without your patch, and
> pass with your patch applied.

$ sudo ./check generic/730
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 v0 6.8.0-rc7+ #205 SMP PREEMPT_DYNAMIC Thu Mar 14 16:44:32 PDT 2024
MKFS_OPTIONS  -- -f /dev/mapper/tst-scr0
MOUNT_OPTIONS -- /dev/mapper/tst-scr0 /mnt/scratch

generic/730 1s ... [failed, exit status 1]- output mismatch (see /home/vmuser/fstests/results//generic/730.out.bad)
    --- tests/generic/730.out   2023-11-01 22:14:50.011000000 +0000
    +++ /home/vmuser/fstests/results//generic/730.out.bad       2024-03-16 14:38:02.431000000 +0000
    @@ -1,2 +1 @@
     QA output created by 730
    -cat: -: Input/output error
    ...
    (Run 'diff -u /home/vmuser/fstests/tests/generic/730.out /home/vmuser/fstests/results//generic/730.out.bad'  to see the entire diff)
Ran: generic/730
Failures: generic/730
Failed 1 of 1 tests

I suspect it has something to do with my Kconfig. Perhaps "xfs (non-debug)" is a clue?

> 
> From a btrfs point of view, it makes sense for the cat command to
> succeed, since all the file data is in the page cache.

That was my intuition as well, as I mentioned above.

> 
> I don't know why the cat command fails (-EIO) with xfs and ext4, and I
> haven't looked at any of their source code (my knowledge of those
> filesystem's internals is too little anyway).
> 
> An alternative to your patch is just to make the write with direct IO
> by passing -d to $XFS_IO_PROG. It makes the test succeed on these 4
> filesystems.
> 
> 

I like this idea a lot more than my patch!

> >
> > Summary of my test matrix:
> > ext4: passes with and without this patch
> > btrfs: fails without this patch, passes with it
> > xfs: fails with and without this patch
> >
> > For that reason, I don't think this a good patch, I just mostly want to
> > figure out where to go with this test!
> >
> > Thanks,
> > Boris
> >
> > > > Signed-off-by: Boris Burkov <boris@xxxxxx>
> > > > ---
> > > >  tests/generic/730 | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/tests/generic/730 b/tests/generic/730
> > > > index 11308cdaa..ca5037c57 100755
> > > > --- a/tests/generic/730
> > > > +++ b/tests/generic/730
> > > > @@ -47,6 +47,9 @@ exec 3< $SCSI_DEBUG_MNT/testfile
> > > >  # delete the scsi debug device while it still has dirty data
> > > >  echo 1 > /sys/block/$(_short_dev $SCSI_DEBUG_DEV)/device/delete
> > > >
> > > > +sync
> > > > +echo 3 > /proc/sys/vm/drop_caches
> > > > +
> > > >  # try to read from the file, which should give us -EIO
> > > >  cat <&3 > /dev/null
> > > >
> > > > --
> > > > 2.43.0
> > > >
> > > >
> >




[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux