Re: [PATCH] generic/459: improve shutdown/read-only check to accommodate bcachefs

On Fri, Nov 17, 2023 at 02:14:34PM -0800, Darrick J. Wong wrote:
> On Fri, Nov 17, 2023 at 09:43:17AM -0500, Brian Foster wrote:
> > generic/459 occasionally fails on bcachefs because the deliberately
> > induced I/O errors caused by exhausting the overprovisioned thin
> > pool can lead to filesystem shutdown. This test considers this
> > expected behavior on certain fs', but only checks for the ext4
> > remount read-only behavior. bcachefs does a similar emergency
> > read-only transition in response to certain I/O errors, but it
> > behaves more like an XFS shutdown and doesn't necessarily
> > reflect "ro" state in the mount table (unless induced by userspace).
> > 
> > Since the test already runs a touch command to help trigger the ext4
> > error handling sequence, this can be tweaked to serve double duty
> > and also more accurately detect read-only status on bcachefs.
> > Refactor into a small helper, check for an EROFS return to the touch
> > command, and consider the fs read-only if either that or the mount
> > entry check indicates it.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> > 
> > Something I realized when writing up the commit log is that the EROFS
> > check doesn't technically cover XFS, which IIRC returns EIO in response
> > to any sort of write once the fs has shut down. I'm not sure this
> > matters currently because XFS doesn't shut down due to the default
> > behavior to retry failed I/Os, but technically if XFS were configured to
> > not retry I/O errors and go right to permanent failure, I suspect it
> > would fail this test in the same way bcachefs does.
> > 
> > That could be addressed fairly easily by also checking for EIO error
> > message output, or just assuming touch failure == shutdown, etc. I don't
> > have much preference on that, so thoughts appreciated.
> 
> I wish there was a better way to signal that a filesystem has shut down,
> though ATM that isn't even a VFS level concept.  I generally assume that
> touch failure == shutdown if the fs was previously writable.
> 

Yeah, it's mildly annoying that there's no good way to detect this. I
think all we really have atm is dmesg scraping to call out unexpected
shutdowns.
That still needs to be added for bcachefs btw, but I've been holding off
because it leads to noise on various dm-flakey oriented tests and
whatnot that complain about shutdowns that otherwise seem to be expected
from bcachefs. Though perhaps the right thing to do there is to enable
it and just filter those tests out for the time being.

But that's a separate topic... It sounds reasonable to me to just use
the touch failure in this particular case. I'll post a v2 with that
tweak next week.
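
For illustration only, here is a rough sketch of what treating any touch
failure as shutdown could look like. This is not the v2 patch. It is
written standalone, so it takes the mount point as an argument, reads
/proc/mounts where the test would use _fs_options, and reports via exit
status rather than echoing a value. All of those choices are assumptions
made for the example.

```shell
# Hypothetical sketch: succeed (return 0) if the fs mounted at $1 either
# rejects a new file creation or shows "ro" in its mount options.
is_shutdown_or_ro()
{
	local mnt="$1"

	# Any touch failure counts, whether the error is EROFS or EIO. If the
	# fs has not shut down yet, the write may also help trigger remount-ro.
	if ! touch "$mnt/newfile" >/dev/null 2>&1; then
		return 0
	fi

	# Otherwise fall back to the mount table: field 2 is the mount point,
	# field 4 is the comma-separated option list.
	awk -v m="$mnt" '$2 == m { print $4 }' /proc/mounts | \
		tr ',' '\n' | grep -qx ro
}
```

A caller would then use something like "if is_shutdown_or_ro $SCRATCH_MNT;
then echo "Test OK"; fi". Note this only makes sense if the fs was known
to be writable beforehand, otherwise an unrelated touch failure would be
misread as a shutdown.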

> OTOH with statmount landing soonish, perhaps we ought to apply for a new
> SB_SHUTDOWN state flag for it to export?
> 

Perhaps worth a discussion..? The flipside I suppose is that shutdown
has historically been a rather hacky, informal thing with
inconsistent behavior across fs' simply because it's a last ditch
failsafe technique that we hope should never happen. Is it worth trying
to generalize/formalize/document something that is basically a "has my
filesystem crashed?" check..?

We do have the vfs GOINGDOWN ioctl. I wonder if a new flag for a
nomodify/check goingdown mode, one that just reports whether a shutdown
would occur or already has, would be sufficient... hm?

Brian

> --D
> 
> > Brian
> > 
> >  tests/generic/459 | 30 +++++++++++++++++++++++-------
> >  1 file changed, 23 insertions(+), 7 deletions(-)
> > 
> > diff --git a/tests/generic/459 b/tests/generic/459
> > index 4dd7a43b..d0c48325 100755
> > --- a/tests/generic/459
> > +++ b/tests/generic/459
> > @@ -57,6 +57,26 @@ origpsize=200
> >  virtsize=300
> >  newpsize=300
> >  
> > +# Check whether the filesystem has shutdown or remounted read-only. Behavior can
> > +# differ based on filesystem and configuration. Some fs' may not have remounted
> > +# without an additional write while others may have shutdown but do not
> > +# necessarily reflect read-only state in the mount options. Check both here to
> > +# cover the various scenarios.
> > +is_shutdown_or_ro()
> > +{
> > +	ro=0
> > +
> > +	# if the fs has not shutdown, this may help trigger a remount-ro
> > +	touch $SCRATCH_MNT/newfile 2>&1 | \
> > +		grep "Read-only file system" > /dev/null
> > +	[ $? == 0 ] && ro=1
> > +
> > +	_fs_options /dev/mapper/$vgname-$snapname | grep -w "ro" > /dev/null
> > +	[ $? == 0 ] && ro=1
> > +
> > +	echo $ro
> > +}
> > +
> >  # Ensure we have enough disk space
> >  _scratch_mkfs_sized $((350 * 1024 * 1024)) >>$seqres.full 2>&1
> >  
> > @@ -113,13 +133,9 @@ ret=$?
> >  #	- The filesystem stays in Read-Write mode, but can be frozen/thawed
> >  #	  without getting stuck.
> >  if [ $ret -ne 0 ]; then
> > -	# freeze failed, filesystem should reject further writes and remount
> > -	# as readonly. Sometimes the previous write process won't trigger
> > -	# ro-remount, e.g. on ext3/4, do additional touch here to make sure
> > -	# filesystems see the metadata I/O error.
> > -	touch $SCRATCH_MNT/newfile >/dev/null 2>&1
> > -	ISRO=$(_fs_options /dev/mapper/$vgname-$snapname | grep -w "ro")
> > -	if [ -n "$ISRO" ]; then
> > +	# freeze failed, filesystem should reject further writes
> > +	ISRO=`is_shutdown_or_ro`
> > +	if [ $ISRO == 1 ]; then
> >  		echo "Test OK"
> >  	else
> >  		echo "Freeze failed and FS isn't Read-Only. Test Failed"
> > -- 
> > 2.41.0
> > 
> > 
> 




