Hi Dave. On Tue, Jun 02, 2020 at 07:21:15AM +1000, Dave Chinner wrote: > On Mon, Jun 01, 2020 at 04:01:53PM +0200, Carlos Maiolino wrote: > > index 4df87546bd40..72dae95a5e4a 100644 > > --- a/fs/xfs/libxfs/xfs_sb.c > > +++ b/fs/xfs/libxfs/xfs_sb.c > > @@ -360,19 +360,27 @@ xfs_validate_sb_common( > > } > > } > > > > - if (sbp->sb_unit) { > > - if (!xfs_sb_version_hasdalign(sbp) || > > - sbp->sb_unit > sbp->sb_width || > > - (sbp->sb_width % sbp->sb_unit) != 0) { > > - xfs_notice(mp, "SB stripe unit sanity check failed"); > > + /* > > + * Ignore superblock alignment checks if sunit/swidth mount options > > + * were used or alignment turned off. > > + * The custom alignment validation will happen later on xfs_mountfs() > > + */ > > + if (!(mp->m_flags & XFS_MOUNT_ALIGN) && > > + !(mp->m_flags & XFS_MOUNT_NOALIGN)) { > > mp->m_dalign tells us at this point if a user specified sunit as a > mount option. That's how xfs_fc_validate_params() determines the user > specified a custom sunit, so there is no need for a new mount flag > here to indicate that mp->m_dalign was set by the user.... At a first glance, I thought about it too, but, there is nothing preventing an user to mount a filesystem passing sunit=0,swidth=0. So, this means we can't really rely on the m_dalign/m_swidth values to check an user passed in (or not) alignment values. Unless we first deny users to pass 0 values into it. > > Also, I think if the user specifies "NOALIGN" then we should still > check the sunit/swidth and issue a warning that they are > bad/invalid, or at least indicate in some way that the superblock is > unhealthy and needs attention. Using mount options to sweep issues > that need fixing under the carpet is less than ideal... > > Also, I see nothing that turns off XFS_MOUNT_ALIGN when the custom > alignment is written to the superblock and becomes the new on-disk > values. Once we have those values in the in-core superblock, the > write of the superblock should run the verifier to validate them. > i.e. leaving this XFS_MOUNT_ALIGN set allows fields of the > superblock we just modified to be written to disk without verifier > validation. I didn't think about it, thanks for the heads up. > > From that last perspective, I _really_ don't like the idea of > having user controlled conditional validation like this in the > common verifier. > > From a user perspective, I think this "use mount options to override > bad values" approach is really nasty. How do you fix a system that > won't boot because the root filesystem has bad sunit/swidth values? > Telling the data center admin that they have to go boot every > machine in their data center into a rescue distro after an automated > upgrade triggered widespread boot failures is really not very user > or admin friendly. > > IMO, this bad sunit/swidth condition should be: > > a) detected automatically at mount time, > b) corrected automatically at mount time, and > c) always verified to be valid at superblock write time. > > IOWs, instead of failing to mount because sunit/swidth is invalid, > we issue a warning and automatically correct it to something valid. > There is precedence for this - we've done it with the AGFL free list > format screwups and for journal structures that are different shapes > on different platforms. Eh, that was one of the options I considered, and also pointed by Eric when we talked about it previously. At the end, I thought automatically modifying it under the hoods was too invasive in regards of changing geometry configuration without user interaction. Foolish me :P > > Hence we need to move this verification out of the common sb > verifier and move it into the write verifier (i.e. write is always > verified). Then in the mount path where we set user specified mount > options, we should extent that to validate the existing on-disk > values and then modify them if they are invalid. Rules for fixing > are simple: > > 1. if !hasdalign(sb), set both sunit/swidth to zero. > 2. If sunit is zero, zero swidth. > 1. If swidth is not valid, round it up it to the nearest > integer multiple of sunit. > > The user was not responsible for this mess (combination of missing > validation in XFS code and bad storage firmware providing garbage) > so we should not put them on the hook for fixing it. We can do it > easily and without needing user intervention and so that's what we > should do. > Thanks for the insights, I'll work on that direction. Cheers -- Carlos