On 2020/3/16 23:13, Darrick J. Wong wrote: > On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: >> Use fuzz(hydra) to test XFS and automatically generate >> tmp.img(XFS v5 format, but some metadata is wrong) >> >> xfs_repair information(just one AG): >> agf_freeblks 0, counted 3224 in ag 0 >> agf_longest 0, counted 3224 in ag 0 >> sb_fdblocks 3228, counted 3224 >> >> Test as follows: >> mount tmp.img tmpdir >> cp file1M tmpdir >> sync >> >> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. >> The reason is same to commit d0c7feaf8767 >> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest >> is 0, we can not block this in xfs_agf_verify. > Uh.... what are you saying here? That the allocator misbehaves and > loops forever if sb_fdblocks > sum(agf_freeblks) after mount? > > Also, uh, what do you mean by "sync not stuck"? Writeback will fail on > allocation error, right...? So I think the problem with incorrect AGF > contents (on upstream) is that writeback will fail due to ENOSPC, which > should never happen under normal circumstance? > >> Make sure fdblocks is always inited in mount(also init ifree, icount). >> >> xfs_mountfs >> xfs_check_summary_counts >> xfs_initialize_perag_data >> >> Signed-off-by: Zheng Bin <zhengbin13@xxxxxxxxxx> >> --- >> fs/xfs/xfs_mount.c | 33 --------------------------------- >> 1 file changed, 33 deletions(-) >> >> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c >> index c5513e5..dc41801 100644 >> --- a/fs/xfs/xfs_mount.c >> +++ b/fs/xfs/xfs_mount.c >> @@ -594,39 +594,6 @@ xfs_check_summary_counts( >> return -EFSCORRUPTED; >> } >> >> - /* >> - * Now the log is mounted, we know if it was an unclean shutdown or >> - * not. If it was, with the first phase of recovery has completed, we >> - * have consistent AG blocks on disk. We have not recovered EFIs yet, >> - * but they are recovered transactionally in the second recovery phase >> - * later. >> - * >> - * If the log was clean when we mounted, we can check the summary >> - * counters. If any of them are obviously incorrect, we can recompute >> - * them from the AGF headers in the next step. >> - */ >> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && >> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || >> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || >> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) >> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); >> - >> - /* >> - * We can safely re-initialise incore superblock counters from the >> - * per-ag data. These may not be correct if the filesystem was not >> - * cleanly unmounted, so we waited for recovery to finish before doing >> - * this. >> - * >> - * If the filesystem was cleanly unmounted or the previous check did >> - * not flag anything weird, then we can trust the values in the >> - * superblock to be correct and we don't need to do anything here. >> - * Otherwise, recalculate the summary counters. >> - */ >> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || >> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && >> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) >> - return 0; >> - >> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); > The downside of this is that now we /always/ have to make two trips > around all of the AGs at mount time. If you're proposing to require a > fresh fdblocks recomputation at mount, could you please refactor all of > the mount-time AG walks into a single loop? And perhaps use xfs_pwork > so that we don't have to do it serially? xfs_mountfs xfs_initialize_perag -->just alloc memory xfs_check_summary_counts xfs_initialize_perag_data -->read agf,agi from disk for (index = 0; index < agcount; index++) error = xfs_alloc_pagf_init(mp, NULL, index, 0) error = xfs_ialloc_pagi_init(mp, NULL, index) xfs_fs_reserve_ag_blocks for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) xfs_ag_resv_init #ifdef DEBUG -->read agf error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) #endif Is this? if enable XFS_DEBUG, xfs_initialize_perag_data read agf, xfs_ag_resv_init also read agf? > > --D > >> } >> >> -- >> 2.7.4 >> > . >