Re: [PATCH] xfs: don't use in-core per-cpu fdblocks for !lazysbcount

"Darrick J. Wong" <djwong@xxxxxxxxxx> · Fri, 16 Apr 2021 09:00:13 -0700

On Fri, Apr 16, 2021 at 05:10:23PM +0800, Gao Xiang wrote:
> There are many paths which could trigger xfs_log_sb(), e.g.
>   xfs_bmap_add_attrfork()
>     -> xfs_log_sb()
> , which overwrites on-disk fdblocks with in-core per-CPU fdblocks.
> 
> However, for !lazysbcount cases, on-disk fdblocks is actually updated
> by xfs_trans_apply_sb_deltas(), and generally it isn't equal to
> in-core fdblocks due to xfs_reserve_block() or whatever, see the
> comment in xfs_unmountfs().
> 
> It could be observed by the following steps reported by Zorro [1]:
> 
> 1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev
> 2. mount $dev $mnt
> 3. fsstress -d $mnt -p 100 -n 1000 (may need more or less I/O load)
> 4. umount $mnt
> 5. xfs_repair -n $dev
> 
> However, due to commit f46e5a174655 ("xfs: fold sbcount quiesce logging
> into log covering"), xfs_sync_sb() is triggered in xfs_unmountfs() even
> for !lazysbcount in the xfs_log_need_covered() case, so this is hard to
> reproduce on kernel 5.12+.

Um, I can't understand this(?), possibly because I can't get to RHBZ and
therefore have very little context to start from. :(

Are you saying that because the f46e commit removed the xfs_sync_sb
calls from unmountfs for !lazysb filesystems, we no longer log the
summary counters at unmount?  Which means that we no longer write the
incore percpu fdblocks count to disk at unmount after we've torn down
all the incore space reservations (when sb_fdblocks == m_fdblocks)?

So that means that for !lazysb fses, the only time we log the sb
counters is during transactions, and when we do log the counters we
actually log the wrong value, since the incore reservations should never
escape to disk?  Hence the fix below?
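
To make the question above concrete, here is a toy userspace model of how
the two counters could drift apart.  It is only a sketch, not kernel code:
struct toy_mount and every toy_* helper are made up for illustration, and
the model assumes that reservations are taken from the incore counter only.

```c
#include <stdint.h>

/* Toy model, not kernel code: every name below is hypothetical. */
struct toy_mount {
	int64_t	ondisk_fdblocks;	/* stands in for the on-disk sb_fdblocks */
	int64_t	incore_fdblocks;	/* stands in for the summed m_fdblocks */
};

/* Take an incore-only reservation; the on-disk counter is untouched. */
static void toy_reserve(struct toy_mount *mp, int64_t nblocks)
{
	mp->incore_fdblocks -= nblocks;
}

/*
 * Commit a transaction that reserved 'reserved' blocks and really used
 * 'used' of them: the on-disk counter drops by what was used, and the
 * unused part of the reservation goes back to the incore counter.
 */
static void toy_commit(struct toy_mount *mp, int64_t reserved, int64_t used)
{
	mp->ondisk_fdblocks -= used;
	mp->incore_fdblocks += reserved - used;
}

/* Old behaviour: always copy the incore sum into the on-disk field. */
static void toy_log_sb_old(struct toy_mount *mp)
{
	mp->ondisk_fdblocks = mp->incore_fdblocks;
}

/* Patched behaviour: only trust the incore sum when lazysbcount is set. */
static void toy_log_sb_patched(struct toy_mount *mp, int lazysbcount)
{
	if (lazysbcount)
		mp->ondisk_fdblocks = mp->incore_fdblocks;
	/* else: keep the value maintained by the transaction deltas */
}
```

In this model, starting from 1000 free blocks, a standing reservation of
100 blocks plus one transaction that reserves 10 and uses 4 leaves the
on-disk counter at 996 and the incore counter at 896.  toy_log_sb_old()
would then write 896, i.e. the outstanding reservation leaks to disk,
while toy_log_sb_patched() with lazysbcount == 0 keeps 996.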

And then by extension, is the reason that nobody noticed before that
we always used to log the correct value at unmount, so fses with clean
logs always have the correct value, and fses with dirty logs will
recompute fdblocks after log recovery by summing the AGF free blocks
counts?
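
As a rough illustration of that recomputation, the sketch below sums
per-AG free space counts into one filesystem-wide value.  struct toy_agf
and toy_sum_fdblocks() are hypothetical names, and exactly which per-AG
fields the real recovery path adds up is an assumption here.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the per-AG free space header; not the on-disk AGF. */
struct toy_agf {
	uint32_t	freeblks;	/* free blocks tracked by the free space btrees */
	uint32_t	flcount;	/* blocks sitting on the AG free list */
	uint32_t	btreeblks;	/* blocks held by the free space btrees (assumed) */
};

/* Recompute a filesystem-wide free block count from the per-AG values. */
static uint64_t toy_sum_fdblocks(const struct toy_agf *agf, size_t agcount)
{
	uint64_t	total = 0;
	size_t		i;

	for (i = 0; i < agcount; i++) {
		/* Widen before adding so the 32-bit fields cannot overflow. */
		total += (uint64_t)agf[i].freeblks;
		total += (uint64_t)agf[i].flcount;
		total += (uint64_t)agf[i].btreeblks;
	}
	return total;
}
```

Because a sum like this is derived from per-AG metadata rather than from
the superblock counter, a stale sb_fdblocks would be papered over whenever
this path runs, which would fit the "nobody noticed" theory above.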

(Or possibly nobody uses !lazysb filesystems anymore?)

I /think/ the code change looks ok, but as you might surmise from the
large quantity of questions, I'm not ready to RVB this yet.  The commit
message seems like a good place to answer those questions.

> After this patch, I've seen nothing strange so far on older kernels
> for the testcase above without lazysbcount.
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1949515

This strangely <cough> doesn't seem to be accessible to the public at
large, since <cough> someone at Red Hat decided to block all Oracle IPs
<cough>.

--D

> 
> Reported-by: Zorro Lang <zlang@xxxxxxxxxx>
> Signed-off-by: Gao Xiang <hsiangkao@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_sb.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 60e6d255e5e2..423dada3f64c 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -928,7 +928,13 @@ xfs_log_sb(
>  
>  	mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
>  	mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
> -	mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> +	if (!xfs_sb_version_haslazysbcount(&mp->m_sb)) {
> +		struct xfs_dsb	*dsb = bp->b_addr;
> +
> +		mp->m_sb.sb_fdblocks = be64_to_cpu(dsb->sb_fdblocks);
> +	} else {
> +		mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> +	}
>  
>  	xfs_sb_to_disk(bp->b_addr, &mp->m_sb);
>  	xfs_trans_buf_set_type(tp, bp, XFS_BLFT_SB_BUF);
> -- 
> 2.27.0
>