Re: [PATCH 2/2] xfs: Bypass sb alignment checks when custom values are used

Carlos Maiolino <cmaiolino@xxxxxxxxxx> · Tue, 2 Jun 2020 11:18:44 +0200

Hi Dave.

On Tue, Jun 02, 2020 at 07:21:15AM +1000, Dave Chinner wrote:
> On Mon, Jun 01, 2020 at 04:01:53PM +0200, Carlos Maiolino wrote:
> > index 4df87546bd40..72dae95a5e4a 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -360,19 +360,27 @@ xfs_validate_sb_common(
> >  		}
> >  	}
> >  
> > -	if (sbp->sb_unit) {
> > -		if (!xfs_sb_version_hasdalign(sbp) ||
> > -		    sbp->sb_unit > sbp->sb_width ||
> > -		    (sbp->sb_width % sbp->sb_unit) != 0) {
> > -			xfs_notice(mp, "SB stripe unit sanity check failed");
> > +	/*
> > +	 * Ignore superblock alignment checks if sunit/swidth mount options
> > +	 * were used or alignment turned off.
> > +	 * The custom alignment validation will happen later on xfs_mountfs()
> > +	 */
> > +	if (!(mp->m_flags & XFS_MOUNT_ALIGN) &&
> > +	    !(mp->m_flags & XFS_MOUNT_NOALIGN)) {
> 
> mp->m_dalign tells us at this point if a user specified sunit as a
> mount option.  That's how xfs_fc_validate_params() determines the user
> specified a custom sunit, so there is no need for a new mount flag
> here to indicate that mp->m_dalign was set by the user....

At a first glance, I thought about it too, but, there is nothing preventing an
user to mount a filesystem passing sunit=0,swidth=0. So, this means we can't
really rely on the m_dalign/m_swidth values to check an user passed in (or not)
alignment values. Unless we first deny users to pass 0 values into it.

> 
> Also, I think if the user specifies "NOALIGN" then we should still
> check the sunit/swidth and issue a warning that they are
> bad/invalid, or at least indicate in some way that the superblock is
> unhealthy and needs attention. Using mount options to sweep issues
> that need fixing under the carpet is less than ideal...
> 
> Also, I see nothing that turns off XFS_MOUNT_ALIGN when the custom
> alignment is written to the superblock and becomes the new on-disk
> values. Once we have those values in the in-core superblock, the
> write of the superblock should run the verifier to validate them.
> i.e. leaving this XFS_MOUNT_ALIGN set allows fields of the
> superblock we just modified to be written to disk without verifier
> validation.

I didn't think about it, thanks for the heads up.

> 
> From that last perspective, I _really_ don't like the idea of
> having user controlled conditional validation like this in the
> common verifier.
> 
> From a user perspective, I think this "use mount options to override
> bad values" approach is really nasty. How do you fix a system that
> won't boot because the root filesystem has bad sunit/swidth values?
> Telling the data center admin that they have to go boot every
> machine in their data center into a rescue distro after an automated
> upgrade triggered widespread boot failures is really not very user
> or admin friendly.
> 
> IMO, this bad sunit/swidth condition should be:
> 
> 	a) detected automatically at mount time,
> 	b) corrected automatically at mount time, and
> 	c) always verified to be valid at superblock write time.
> 
> IOWs, instead of failing to mount because sunit/swidth is invalid,
> we issue a warning and automatically correct it to something valid.
> There is precedence for this - we've done it with the AGFL free list
> format screwups and for journal structures that are different shapes
> on different platforms.

Eh, that was one of the options I considered, and also pointed by Eric when we
talked about it previously. At the end, I thought automatically modifying it
under the hoods was too invasive in regards of changing geometry configuration
without user interaction. Foolish me :P

> 
> Hence we need to move this verification out of the common sb
> verifier and move it into the write verifier (i.e. write is always
> verified).  Then in the mount path where we set user specified mount
> options, we should extent that to validate the existing on-disk
> values and then modify them if they are invalid. Rules for fixing
> are simple:
> 
> 	1. if !hasdalign(sb), set both sunit/swidth to zero.
> 	2. If sunit is zero, zero swidth.
> 	1. If swidth is not valid, round it up it to the nearest
> 	   integer multiple of sunit.
> 
> The user was not responsible for this mess (combination of missing
> validation in XFS code and bad storage firmware providing garbage)
> so we should not put them on the hook for fixing it. We can do it
> easily and without needing user intervention and so that's what we
> should do.
> 

Thanks for the insights, I'll work on that direction.

Cheers

-- 
Carlos