Re: [PATCH] xfs_repair: don't flag log_incompat inconsistencies as corruptions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/24/22 2:48 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> 
> While testing xfs/233 and xfs/127 with LARP mode enabled, I noticed
> errors such as the following:
> 
> xfs_growfs --BlockSize=4096 --Blocks=8192
> data blocks changed from 8192 to 2579968
> meta-data=/dev/sdf               isize=512    agcount=630, agsize=4096 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1, sparse=1, rmapbt=1
>          =                       reflink=1    bigtime=1 inobtcount=1
> data     =                       bsize=4096   blocks=2579968, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> log      =internal log           bsize=4096   blocks=3075, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> _check_xfs_filesystem: filesystem on /dev/sdf is inconsistent (r)
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
>         - zero log...
>         - 23:03:47: zeroing log - 3075 of 3075 blocks done
>         - scan filesystem freespace and inode maps...
> would fix log incompat feature mismatch in AG 30 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 8 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 12 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 24 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 18 super, 0x0 != 0x1
> <snip>
> 
> 0x1 corresponds to XFS_SB_FEAT_INCOMPAT_LOG_XATTRS, which is the feature
> bit used to indicate that the log contains extended attribute log intent
> items.  This is a mechanism to prevent older kernels from trying to
> recover log items that they won't know how to recover.
> 
> I thought about this a little bit more, and realized that log_incompat
> features bits are set on the primary sb prior to writing certain types
> of log records, and cleared once the log has written the committed
> changes back to the filesystem.  If the secondary superblocks are
> updated at any point during that interval (due to things like growfs or
> setting labels), the log_incompat field will now be set on the secondary
> supers.
> 
> Due to the ephemeral nature of the current log_incompat feature bits,
> a discrepancy between the primary and secondary supers is not a
> corruption.  If we're in dry run mode, we should log the discrepancy,
> but that's not a reason to end with EXIT_FAILURE.

Interesting. This makes me wonder a few things.

This approach differs from the just-added handling of 
XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR, where we /always/ ignore it. For now I think
that's a little different, because that flag only gets set from userspace, but
that could change in the future, maybe?

So I wonder why we have this feature getting noted and cleared, but the other
one always ignored.

I also notice that scrub tries to avoid setting it in the first place:

         * Don't write out a secondary super with NEEDSREPAIR or log incompat
         * features set, since both are ignored when set on a secondary.

... should growfs avoid it as well?

It feels like we're spreading this special handling around, copying (or not)
and ignoring (or not) at various points.  I kinda want to step back and think
about this a little.

It seems like the most consistent approach would be to always keep all supers
in sync, though I suppose that has costs. The 2nd most consistent approach would
be to never copy these ephemeral features to the secondary.

Whatever the consistent future looks like, I guess we do have to deal with
inconsistent stuff in the wild, already.

Thoughts?

-Eric


> 
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  repair/agheader.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/repair/agheader.c b/repair/agheader.c
> index 2c2a26d1..478ed7e5 100644
> --- a/repair/agheader.c
> +++ b/repair/agheader.c
> @@ -286,15 +286,24 @@ check_v5_feature_mismatch(
>  		}
>  	}
>  
> +	/*
> +	 * Log incompat feature bits are set and cleared from the primary super
> +	 * as needed to protect against log replay on old kernels finding log
> +	 * records that they cannot handle.  Secondary sb resyncs performed as
> +	 * part of a geometry update to the primary sb (e.g. growfs, label/uuid
> +	 * changes) will copy the log incompat feature bits, but it's not a
> +	 * corruption for a secondary to have a bit set that is clear in the
> +	 * primary super.
> +	 */
>  	if (mp->m_sb.sb_features_log_incompat != sb->sb_features_log_incompat) {
>  		if (no_modify) {
> -			do_warn(
> -	_("would fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
> +			do_log(
> +	_("would sync log incompat feature in AG %u super, 0x%x != 0x%x\n"),
>  					agno, mp->m_sb.sb_features_log_incompat,
>  					sb->sb_features_log_incompat);
>  		} else {
>  			do_warn(
> -	_("will fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
> +	_("will sync log incompat feature in AG %u super, 0x%x != 0x%x\n"),
>  					agno, mp->m_sb.sb_features_log_incompat,
>  					sb->sb_features_log_incompat);
>  			dirty = true;
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux