On Wed, Oct 21, 2020 at 09:21:08PM +0800, Gao Xiang wrote:
> On Wed, Oct 21, 2020 at 05:55:19AM -0400, Brian Foster wrote:
...
> > > > > >
> > > > > > Interesting... this seems fundamentally sane when narrowing the scope
> > > > > > down to tail AG shrinking. Does xfs_repair flag any issues in the simple
> > > > > > tail AG shrink case?
> > > > >
> > > > > Yeah, I ran xfs_repair together as well. For smaller sizes, it all seems
> > > > > fine, but I did observe some failures when much larger values were
> > > > > passed in, so as a formal patch, that really needs to be solved later.
> > > > >
> > > >
> > > > I'm curious to see what xfs_repair complained about if you have a record
> > > > of it. That might call out some other things we could be overlooking.
> > >
> > > Sorry for the somewhat slow progress...
> > >
> > > It could show a random "SB summary counter sanity check failed" runtime
> > > message when the shrink size is large (much closer to the AG start).
> >
> > Ok. That error looks associated with a few different checks:
> >
> > 	if (XFS_BUF_ADDR(bp) == XFS_SB_DADDR && !sbp->sb_inprogress &&
> > 	    (sbp->sb_fdblocks > sbp->sb_dblocks ||
> > 	     !xfs_verify_icount(mp, sbp->sb_icount) ||
> > 	     sbp->sb_ifree > sbp->sb_icount)) {
> > 		xfs_warn(mp, "SB summary counter sanity check failed");
> > 		return -EFSCORRUPTED;
> > 	}
> >
> > Though I think the inode counters should be a subset of allocated space
> > (i.e. inode chunks) and so are unlikely to be impacted by a removal of
> > free space. Without looking into the details, I'd guess it's most likely
> > just an accounting bug, and it's easiest to dump the relevant values that
> > land in the superblock and work backwards from there. FWIW, the follow-on
> > shutdown, repair (dirty log) and log recovery behavior (write and read
> > verifier failures) are typical and to be expected on metadata
> > corruption. IOW, I suspect that if we address the write verifier
> > failure, the follow-on issues will likely be resolved as well.
>
> After looking into it a little bit, the exact failure condition is
> sbp->sb_fdblocks > sbp->sb_dblocks,
>
> and it seems sbp->sb_fdblocks doesn't decrease as expected when the shrink
> size is large (in fact, it's still the same number as the original, unlike
> with a correct small shrink size). I'm still looking into what exactly is
> happening.
>

Update: the following incremental patch can fix the issue, yet I'm not sure
if it's the correct way or not...

Thanks,
Gao Xiang

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 80927d323939..0a395901bc3f 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -149,6 +149,14 @@ xfs_growfs_data_private(
 			nb - mp->m_sb.sb_dblocks);
 	if (id.nfree)
 		xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, id.nfree);
+
+	/*
+	 * Update the in-core counters (especially sb_fdblocks) now
+	 * so that xfs_validate_sb_write() can pass.
+	 */
+	if (xfs_sb_version_haslazysbcount(&mp->m_sb))
+		xfs_log_sb(tp);
+
 	xfs_trans_set_sync(tp);
 	error = xfs_trans_commit(tp);
 	if (error)