Re: [PATCH] xfs: fix debug-mode accounting for deferred AGFL freeing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 26, 2021 at 08:50:52AM +1000, Dave Chinner wrote:
> On Sun, Apr 25, 2021 at 08:46:34AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > In commit f8f2835a9cf3 we changed the behavior of XFS to use EFIs to
> > remove blocks from an overfilled AGFL because there were complaints
> > about transaction overruns that stemmed from trying to free multiple
> > blocks in a single transaction.
> > 
> > Unfortunately, that commit missed a subtlety in the debug-mode
> > transaction accounting when a realtime volume is attached.  If a
> > realtime file undergoes a data fork mapping change such that realtime
> > extents are allocated (or freed) in the same transaction that a data
> > device block is also allocated (or freed), we can trip a debugging
> > assertion.  This can happen (for example) if a realtime extent is
> > allocated and it is necessary to reshape the bmbt to hold the new
> > mapping.
> > 
> > When we go to allocate a bmbt block from an AG, the first thing the data
> > device block allocator does is ensure that the freelist is the proper
> > length.  If the freelist is too long, it will trim the freelist to the
> > proper length.
> > 
> > In debug mode, trimming the freelist calls xfs_trans_agflist_delta() to
> > record the decrement in the AG free list count.  Prior to f8f28 we would
> > put the free block back in the free space btrees in the same
> > transaction, which calls xfs_trans_agblocks_delta() to record the
> > increment in the AG free block count.  Since AGFL blocks are included in
> > the global free block count (fdblocks), there is no corresponding
> > fdblocks update, so the AGFL free satisfies the following condition in
> > xfs_trans_apply_sb_deltas:
> > 
> > 	/*
> > 	 * Check that superblock mods match the mods made to AGF counters.
> > 	 */
> > 	ASSERT((tp->t_fdblocks_delta + tp->t_res_fdblocks_delta) ==
> > 	       (tp->t_ag_freeblks_delta + tp->t_ag_flist_delta +
> > 		tp->t_ag_btree_delta));
> > 
> > The comparison here used to be: (X + 0) == ((X+1) + -1 + 0), where X is
> > the number blocks that were allocated.
> > 
> > After commit f8f28 we defer the block freeing to the next chained
> > transaction, which means that the calls to xfs_trans_agflist_delta and
> > xfs_trans_agblocks_delta occur in separate transactions.  The (first)
> > transaction that shortens the free list trips on the comparison, which
> > has now become:
> > 
> > (X + 0) == ((X) + -1 + 0)
> > 
> > because we haven't freed the AGFL block yet; we've only logged an
> > intention to free it.  When the second transaction (the deferred free)
> > commits, it will evaluate the expression as:
> > 
> > (0 + 0) == (1 + 0 + 0)
> > 
> > and trip over that in turn.
> > 
> > At this point, the astute reader may note that the two commits tagged by
> > this patch have been in the kernel for a long time but haven't generated
> > any bug reports.  How is it that the author became aware of this bug?
> > 
> > This originally surfaced as an intermittent failure when I was testing
> > realtime rmap, but a different bug report by Zorro Lang reveals the same
> > assertion occuring on !lazysbcount filesystems.
> > 
> > The common factor to both reports (and why this problem wasn't
> > previously reported) becomes apparent if we consider when
> > xfs_trans_apply_sb_deltas is called by __xfs_trans_commit():
> > 
> > 	if (tp->t_flags & XFS_TRANS_SB_DIRTY)
> > 		xfs_trans_apply_sb_deltas(tp);
> 
> IIUC, what you are saying is that this debug code is simply not
> exercised in normal testing and hasn't been for the past decade?

Yes.

> And it still won't be exercised on anything other than realtime
> device testing?

Yes.  Note that I've added it to my regular testing routine since I
actually /do/ have users of rt volumes now.

> I don't really consider !lazycount worth investing time and
> effort into because deprecated and default for over a decade, so if
> this is the case, then why not just remove it?

I think I understand [this infrequently exercised code path], which
means I don't understand it, so rather than be one of those idiots who
rips out everything he doesn't understand, I thought I'd at least try to
figure out what it did and patch it back together.

So now that I actually understand what that debug code does having dug
around to see what those fields do, I'd also be just fine ripping it
out.  But I wanted to know what I would be tearing out first...

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux