On Sat, Oct 22, 2022 at 10:03:45AM +0800, Long Li wrote: > When lazysbcount is enabled, multiple threads stress test the xfs report > the following problems: > > XFS (loop0): SB summary counter sanity check failed > XFS (loop0): Metadata corruption detected at xfs_sb_write_verify > +0x13b/0x460, xfs_sb block 0x0 > XFS (loop0): Unmount and run xfs_repair > XFS (loop0): First 128 bytes of corrupted metadata buffer: > 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 28 00 00 XFSB.........(.. > 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00000020: 69 fb 7c cd 5f dc 44 af 85 74 e0 cc d4 e3 34 5a i.|._.D..t....4Z > 00000030: 00 00 00 00 00 20 00 06 00 00 00 00 00 00 00 80 ..... .......... > 00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................ > 00000050: 00 00 00 01 00 0a 00 00 00 00 00 04 00 00 00 00 ................ > 00000060: 00 00 0a 00 b4 b5 02 00 02 00 00 08 00 00 00 00 ................ > 00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 14 00 00 19 ................ > XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply > +0xe1e/0x10e0 (fs/xfs/xfs_buf.c:1580). Shutting down filesystem. > XFS (loop0): Please unmount the filesystem and rectify the problem(s) > XFS (loop0): log mount/recovery failed: error -117 > XFS (loop0): log mount failed > > The cause of the problem is that during the log recovery process, incorrect > icount and ifree are recovered from the log and fail to pass the size check Are you saying that the log contained a transaction in which ifree > icount? > in xfs_validate_sb_write(). > > With lazysbcount is enabled, There is no additional lock protection for > reading m_ifree and m_icount in xfs_log_sb(), if other threads modifies > the m_ifree between the read m_icount and the m_ifree, this will make the > m_ifree larger than m_icount and written to the log. If we have an unclean > shutdown, this will be corrected by xfs_initialize_perag_data() rebuilding > the counters from the AGF block counts, and the correction is later than > log recovery. During log recovery, incorrect ifree/icount may be restored > from the log and written to the super block, since ifree and icount have > not been corrected at this time, the relationship between ifree and icount > cannot be checked in xfs_validate_sb_write(). > > So, don't check the size between ifree and icount in xfs_validate_sb_write() > when lazysbcount is enabled. Um, doesn't that neuter a sanity check on all V5 filesystems? --D > Fixes: 8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks") > Signed-off-by: Long Li <leo.lilong@xxxxxxxxxx> > --- > fs/xfs/libxfs/xfs_sb.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c > index a20cade590e9..b4a4e57361e7 100644 > --- a/fs/xfs/libxfs/xfs_sb.c > +++ b/fs/xfs/libxfs/xfs_sb.c > @@ -245,7 +245,7 @@ xfs_validate_sb_write( > if (xfs_buf_daddr(bp) == XFS_SB_DADDR && !sbp->sb_inprogress && > (sbp->sb_fdblocks > sbp->sb_dblocks || > !xfs_verify_icount(mp, sbp->sb_icount) || > - sbp->sb_ifree > sbp->sb_icount)) { > + (!xfs_has_lazysbcount(mp) && sbp->sb_ifree > sbp->sb_icount))) { > xfs_warn(mp, "SB summary counter sanity check failed"); > return -EFSCORRUPTED; > } > -- > 2.31.1 >