Re: [PATCH 3/4] xfs: set the buffer type after holding the AG[IF] across trans_roll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 14, 2022 at 09:25:53AM +1100, Dave Chinner wrote:
> On Sun, Oct 02, 2022 at 11:19:48AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > Currently, the only way to lock an allocation group is to hold the AGI
> > and AGF buffers.  If repair needs to roll the transaction while
> > repairing some AG metadata, it maintains that lock by holding the two
> > buffers across the transaction roll and joins them afterwards.
> > 
> > However, repair is not the same as the other parts of XFS that employ
> > this bhold/bjoin sequence, because it's possible that the AGI or AGF
> > buffers are not actually dirty before the roll.  In this case, the
> > buffer log item can detach from the buffer, which means that we have to
> 
> Doesn't this imply we have a reference counting problem with
> XFS_BLI_HOLD buffers? i.e. the bli can only get detached when the
> reference count on it goes to zero. If the buffer is clean and
> joined to a transaction, then that means the only reference to the
> BLI is the current transaction. Hence the only way it can get
> detached is for the transaction commit to release the current
> transaction's reference to the BLI.
> 
> Ah, XFS_BLI_HOLD does not take a reference to the BLI - it just
> prevents ->iop_release from releasing the -buffer- after it drops
> the transaction reference to the BLI. That's the problem right there
> - xfs_buf_item_release() drops the current trans ref to the clean
> item via xfs_buf_item_release() regardless of whether BLI_HOLD is
> set or not, hence freeing the BLI on clean buffers.
> 
> IOWs, it looks to me like XFS_BLI_HOLD should actually hold a
> reference to the BLI as well as the buffer so that we don't free the
> BLI for a held clean buffer in xfs_buf_item_release(). The reference
> we leave behind will then be picked up by the subsequent call to
> xfs_trans_bjoin() which finds the clean BLI already attached to the
> buffer...

<nod> I think you're saying that _xfs_trans_bjoin should:

	if (!(bip->bli_flags & XFS_BLI_HOLD))
		atomic_inc(&bip->bli_refcount);

and xfs_buf_item_release should do:

	if (hold)
		return;
	released = xfs_buf_item_put(bip);
	if (stale && !released)
		return;

I'll have to remember how I induced this error in the first place.  I
think it was when I was running repair tests with mem=600M.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux