Re: [PATCH v1] xfs: fix sb write verify for lazysbcount

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Oct 22, 2022 at 10:03:45AM +0800, Long Li wrote:
> When lazysbcount is enabled, multiple threads stress test the xfs report
> the following problems:
> 
> XFS (loop0): SB summary counter sanity check failed
> XFS (loop0): Metadata corruption detected at xfs_sb_write_verify
> 	     +0x13b/0x460, xfs_sb block 0x0
> XFS (loop0): Unmount and run xfs_repair
> XFS (loop0): First 128 bytes of corrupted metadata buffer:
> 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 28 00 00  XFSB.........(..
> 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000020: 69 fb 7c cd 5f dc 44 af 85 74 e0 cc d4 e3 34 5a  i.|._.D..t....4Z
> 00000030: 00 00 00 00 00 20 00 06 00 00 00 00 00 00 00 80  ..... ..........
> 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82  ................
> 00000050: 00 00 00 01 00 0a 00 00 00 00 00 04 00 00 00 00  ................
> 00000060: 00 00 0a 00 b4 b5 02 00 02 00 00 08 00 00 00 00  ................
> 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 14 00 00 19  ................
> XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply
> 	+0xe1e/0x10e0 (fs/xfs/xfs_buf.c:1580).  Shutting down filesystem.
> XFS (loop0): Please unmount the filesystem and rectify the problem(s)
> XFS (loop0): log mount/recovery failed: error -117
> XFS (loop0): log mount failed
> 
> The cause of the problem is that during the log recovery process, incorrect
> icount and ifree are recovered from the log and fail to pass the size check

Are you saying that the log contained a transaction in which ifree >
icount?

> in xfs_validate_sb_write().
> 
> With lazysbcount is enabled, There is no additional lock protection for
> reading m_ifree and m_icount in xfs_log_sb(), if other threads modifies
> the m_ifree between the read m_icount and the m_ifree, this will make the
> m_ifree larger than m_icount and written to the log. If we have an unclean
> shutdown, this will be corrected by xfs_initialize_perag_data() rebuilding
> the counters from the AGF block counts, and the correction is later than
> log recovery. During log recovery, incorrect ifree/icount may be restored
> from the log and written to the super block, since ifree and icount have
> not been corrected at this time, the relationship between ifree and icount
> cannot be checked in xfs_validate_sb_write().
> 
> So, don't check the size between ifree and icount in xfs_validate_sb_write()
> when lazysbcount is enabled.

Um, doesn't that neuter a sanity check on all V5 filesystems?

--D

> Fixes: 8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks")
> Signed-off-by: Long Li <leo.lilong@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_sb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index a20cade590e9..b4a4e57361e7 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -245,7 +245,7 @@ xfs_validate_sb_write(
>  	if (xfs_buf_daddr(bp) == XFS_SB_DADDR && !sbp->sb_inprogress &&
>  	    (sbp->sb_fdblocks > sbp->sb_dblocks ||
>  	     !xfs_verify_icount(mp, sbp->sb_icount) ||
> -	     sbp->sb_ifree > sbp->sb_icount)) {
> +	     (!xfs_has_lazysbcount(mp) && sbp->sb_ifree > sbp->sb_icount))) {
>  		xfs_warn(mp, "SB summary counter sanity check failed");
>  		return -EFSCORRUPTED;
>  	}
> -- 
> 2.31.1
> 



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux