On Thu, May 12, 2022 at 12:13:29PM -0700, Darrick J. Wong wrote: > On Thu, May 12, 2022 at 02:02:33PM -0500, Eric Sandeen wrote: > > On 5/5/22 11:05 AM, Darrick J. Wong wrote: > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > > > Make sure we detect and correct mismatches between the V5 features > > > described in the primary and the secondary superblocks. > > > > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > --- > > > > > > > + if ((mp->m_sb.sb_features_incompat ^ sb->sb_features_incompat) & > > > + ~XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR) { > > > > I'd like to add a comment about why XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR is special. > > (Why is XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR special? Just because userspace doesn't > > bother to set it on all superblocks in the upgrade paths, right?) > > Right -- we only set it on the primary super to force users to run > xfs_repair. Repair will clear NEEDSREPAIR on the primary and all > secondaries before it exits Oh no it doesn't! :) I just debugged the persistent xfs/158 failure down to repair only writing the secondary superblocks with "features_changed" is true. xfs/158 trashes the repair process after setting the inobtcnt and needrepair feature bits in the primary superblock. That's the only code that sets the internal "features_changed" flag that triggers secondary superblock writeback. If repair then dies after this it won't get set on the next run without the upgrade options set on the command line. Hence we re-run repair without the upgrade feature being enabled to check that it still recovers correctly. The result is that repair does the right thing in completing the feature upgrade because it sees the feature flag in the primary superblock, but it *never sets "features_changed"* and so the secondary superblocks are not updated when it is done. It then goes on to check NEEDSREPAIR in the primary superblock and clears it, allowing the fileystem to mount again. This results in secondary superblocks with in-progress set and mis-matched feature bits, resulting in xfs_scrub reporting corrupt secondary superblocks and so failing the test. This change will probably mask that specific bug - it'll still be lying their waiting to pounce on some unsuspecting bystander, but it will be masked by other code detecting the feature mismatch that it causes and correcting it that way... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx