On 2020/3/17 11:55, Darrick J. Wong wrote: > On Tue, Mar 17, 2020 at 10:22:33AM +0800, zhengbin (A) wrote: >> On 2020/3/16 23:13, Darrick J. Wong wrote: >>> On Mon, Mar 16, 2020 at 09:07:07PM +0800, Zheng Bin wrote: >>>> Use fuzz(hydra) to test XFS and automatically generate >>>> tmp.img(XFS v5 format, but some metadata is wrong) >>>> >>>> xfs_repair information(just one AG): >>>> agf_freeblks 0, counted 3224 in ag 0 >>>> agf_longest 0, counted 3224 in ag 0 >>>> sb_fdblocks 3228, counted 3224 >>>> >>>> Test as follows: >>>> mount tmp.img tmpdir >>>> cp file1M tmpdir >>>> sync >>>> >>>> In 4.19-stable, sync will stuck, while in linux-next, sync not stuck. >>>> The reason is same to commit d0c7feaf8767 >>>> ("xfs: add agf freeblocks verify in xfs_agf_verify"), cause agf_longest >>>> is 0, we can not block this in xfs_agf_verify. >>> Uh.... what are you saying here? That the allocator misbehaves and >>> loops forever if sb_fdblocks > sum(agf_freeblks) after mount? >>> >>> Also, uh, what do you mean by "sync not stuck"? Writeback will fail on >>> allocation error, right...? So I think the problem with incorrect AGF >>> contents (on upstream) is that writeback will fail due to ENOSPC, which >>> should never happen under normal circumstance? >>> >>>> Make sure fdblocks is always inited in mount(also init ifree, icount). >>>> >>>> xfs_mountfs >>>> xfs_check_summary_counts >>>> xfs_initialize_perag_data >>>> >>>> Signed-off-by: Zheng Bin <zhengbin13@xxxxxxxxxx> >>>> --- >>>> fs/xfs/xfs_mount.c | 33 --------------------------------- >>>> 1 file changed, 33 deletions(-) >>>> >>>> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c >>>> index c5513e5..dc41801 100644 >>>> --- a/fs/xfs/xfs_mount.c >>>> +++ b/fs/xfs/xfs_mount.c >>>> @@ -594,39 +594,6 @@ xfs_check_summary_counts( >>>> return -EFSCORRUPTED; >>>> } >>>> >>>> - /* >>>> - * Now the log is mounted, we know if it was an unclean shutdown or >>>> - * not. If it was, with the first phase of recovery has completed, we >>>> - * have consistent AG blocks on disk. We have not recovered EFIs yet, >>>> - * but they are recovered transactionally in the second recovery phase >>>> - * later. >>>> - * >>>> - * If the log was clean when we mounted, we can check the summary >>>> - * counters. If any of them are obviously incorrect, we can recompute >>>> - * them from the AGF headers in the next step. >>>> - */ >>>> - if (XFS_LAST_UNMOUNT_WAS_CLEAN(mp) && >>>> - (mp->m_sb.sb_fdblocks > mp->m_sb.sb_dblocks || >>>> - !xfs_verify_icount(mp, mp->m_sb.sb_icount) || >>>> - mp->m_sb.sb_ifree > mp->m_sb.sb_icount)) >>>> - xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS); >>>> - >>>> - /* >>>> - * We can safely re-initialise incore superblock counters from the >>>> - * per-ag data. These may not be correct if the filesystem was not >>>> - * cleanly unmounted, so we waited for recovery to finish before doing >>>> - * this. >>>> - * >>>> - * If the filesystem was cleanly unmounted or the previous check did >>>> - * not flag anything weird, then we can trust the values in the >>>> - * superblock to be correct and we don't need to do anything here. >>>> - * Otherwise, recalculate the summary counters. >>>> - */ >>>> - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) || >>>> - XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) && >>>> - !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS)) >>>> - return 0; >>>> - >>>> return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount); >>> The downside of this is that now we /always/ have to make two trips >>> around all of the AGs at mount time. If you're proposing to require a >>> fresh fdblocks recomputation at mount, could you please refactor all of >>> the mount-time AG walks into a single loop? And perhaps use xfs_pwork >>> so that we don't have to do it serially? >> xfs_mountfs >> xfs_initialize_perag -->just alloc memory >> xfs_check_summary_counts >> xfs_initialize_perag_data -->read agf,agi from disk >> for (index = 0; index < agcount; index++) >> error = xfs_alloc_pagf_init(mp, NULL, index, 0) >> error = xfs_ialloc_pagi_init(mp, NULL, index) >> xfs_fs_reserve_ag_blocks >> for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) >> xfs_ag_resv_init >> #ifdef DEBUG -->read agf >> error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) >> #endif >> >> Is this? if enable XFS_DEBUG, xfs_initialize_perag_data read agf, xfs_ag_resv_init >> >> also read agf? > Yes, that's the other AG-walker that I was referring to. Maybe we can judge if (!pag->pagf_init) in xfs_ag_resv_init? xfs_fs_reserve_ag_blocks for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) xfs_ag_resv_init #ifdef DEBUG -->read agf if (!pag->pagf_init) { error = xfs_alloc_pagf_init(pag->pag_mount, tp, pag->pag_agno, 0) } > > --D > >>> --D >>> >>>> } >>>> >>>> -- >>>> 2.7.4 >>>> >>> . >>> > . >