On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: > Use fuzz(hydra) to test XFS and automatically generate > tmp.img(XFS v5 format, but some metadata is wrong) > > xfs_repair information(just one AG): > agf_freeblks 0, counted 3224 in ag 0 > agf_longest 0, counted 3224 in ag 0 > sb_fdblocks 3228, counted 3224 > > Test as follows: > mount tmp.img tmpdir > cp file1M tmpdir > sync > > In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. > The reason is same to commit d0c7feaf8767 > ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest > is 0, we can not block this in xfs_agf_verify. Uh.... what are you saying here? That the allocator misbehaves and loops forever if sb_fdblocks > sum(agf_freeblks) after mount? Also, uh, what do you mean by "sync not stuck"? Writeback will fail on allocation error, right...? So I think the problem with incorrect AGF contents (on upstream) is that writeback will fail due to ENOSPC, which should never happen under normal circumstance? > > Make sure fdblocks is always inited in mount(also init ifree, icount). > > xfs_mountfs > xfs_check_summary_counts > xfs_initialize_perag_data > > Signed-off-by: Zheng Bin <zhengbin13@xxxxxxxxxx> > --- > fs/xfs/xfs_mount.c | 33 --------------------------------- > 1 file changed, 33 deletions(-) > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c > index c5513e5..dc41801 100644 > --- a/fs/xfs/xfs_mount.c > +++ b/fs/xfs/xfs_mount.c > @@ -594,39 +594,6 @@ xfs_check_summary_counts( > return -EFSCORRUPTED; > } > > - /* > - * Now the log is mounted, we know if it was an unclean shutdown or > - * not. If it was, with the first phase of recovery has completed, we > - * have consistent AG blocks on disk. We have not recovered EFIs yet, > - * but they are recovered transactionally in the second recovery phase > - * later. > - * > - * If the log was clean when we mounted, we can check the summary > - * counters. If any of them are obviously incorrect, we can recompute > - * them from the AGF headers in the next step. > - */ > - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && > - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || > - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || > - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) > - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); > - > - /* > - * We can safely re-initialise incore superblock counters from the > - * per-ag data. These may not be correct if the filesystem was not > - * cleanly unmounted, so we waited for recovery to finish before doing > - * this. > - * > - * If the filesystem was cleanly unmounted or the previous check did > - * not flag anything weird, then we can trust the values in the > - * superblock to be correct and we don't need to do anything here. > - * Otherwise, recalculate the summary counters. > - */ > - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || > - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && > - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) > - return 0; > - > return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); The downside of this is that now we /always/ have to make two trips around all of the AGs at mount time. If you're proposing to require a fresh fdblocks recomputation at mount, could you please refactor all of the mount-time AG walks into a single loop? And perhaps use xfs_pwork so that we don't have to do it serially? --D > } > > -- > 2.7.4 >