Re: [PATCH v24 02/11] xfs: Capture buffers for delayed ops

On Tue, Aug 31, 2021 at 10:50:52AM -0700, Allison Henderson wrote:
> 
> 
> On 8/30/21 10:44 AM, Darrick J. Wong wrote:
> > On Tue, Aug 24, 2021 at 03:44:25PM -0700, Allison Henderson wrote:
> > > This patch enables delayed operations to capture held buffers within
> > > the xfs_defer_capture. Buffers are then rejoined to the new
> > > transaction in xlog_finish_defer_ops.
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@xxxxxxxxxx>
> > > ---
> > >   fs/xfs/libxfs/xfs_defer.c  | 7 ++++++-
> > >   fs/xfs/libxfs/xfs_defer.h  | 4 +++-
> > >   fs/xfs/xfs_bmap_item.c     | 2 +-
> > >   fs/xfs/xfs_buf.c           | 1 +
> > >   fs/xfs/xfs_buf.h           | 1 +
> > >   fs/xfs/xfs_extfree_item.c  | 2 +-
> > >   fs/xfs/xfs_log_recover.c   | 7 +++++++
> > >   fs/xfs/xfs_refcount_item.c | 2 +-
> > >   fs/xfs/xfs_rmap_item.c     | 2 +-
> > >   9 files changed, 22 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > > index eff4a127188e..d1d09b6aca55 100644
> > > --- a/fs/xfs/libxfs/xfs_defer.c
> > > +++ b/fs/xfs/libxfs/xfs_defer.c
> > > @@ -639,6 +639,7 @@ xfs_defer_ops_capture(
> > >   	dfc = kmem_zalloc(sizeof(*dfc), KM_NOFS);
> > >   	INIT_LIST_HEAD(&dfc->dfc_list);
> > >   	INIT_LIST_HEAD(&dfc->dfc_dfops);
> > > +	INIT_LIST_HEAD(&dfc->dfc_buffers);
> > >   	xfs_defer_create_intents(tp);
> > > @@ -690,7 +691,8 @@ int
> > >   xfs_defer_ops_capture_and_commit(
> > >   	struct xfs_trans		*tp,
> > >   	struct xfs_inode		*capture_ip,
> > > -	struct list_head		*capture_list)
> > > +	struct list_head		*capture_list,
> > > +	struct xfs_buf			*bp)
> > 
> > I wonder if xfs_defer_ops_capture should learn to pick up the inodes and
> > buffers to hold automatically from the transaction that's being
> > committed?  Seeing as xfs_defer_trans_roll already knows how to do that
> > across transaction rolls, and that's more or less the same thing we're
> > doing here, but in a much more roundabout way.
> I see, I suppose it could?  But it wouldn't be used in this case though, at
> least not yet.  I sort of got the impression that people like to see things
> added as they are needed, and then unused code culled where it can be.  I
> would think that if the need does arise though, b_delay would be easy to
> expand into a list of xfs_delay_items or something similar to what
> xfs_defer_trans_roll has.

On further thought, I decided that log recovery of intent items really
ought to be able to hold the same number and type of resources across a
transaction "roll" that we can do at runtime, since there really ought
to be no difference.

I wrote a quick patch series to hoist the runtime code path to store the
held buffers and inodes during xfs_defer_trans_roll in an explicitly
named structure with separate save and restore helpers, then refactored
log recovery to use it.
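
For illustration only, here is a rough userspace sketch of the save/restore idea described above. It is not the RFC series itself: every toy_* name, the struct layouts, and the fixed array size are made up for this example, and the real kernel types behave differently in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's xfs_buf, xfs_inode or xfs_trans. */
#define TOY_MAX_HELD 2

struct toy_buf   { int held; int joined; };
struct toy_inode { int joined; };

struct toy_trans {
	struct toy_buf   *bufs[TOY_MAX_HELD];
	unsigned int      nr_bufs;
	struct toy_inode *inodes[TOY_MAX_HELD];
	unsigned int      nr_inodes;
};

/* Hypothetical container for resources carried across a "roll". */
struct toy_held_resources {
	struct toy_buf   *bufs[TOY_MAX_HELD];
	unsigned int      nr_bufs;
	struct toy_inode *inodes[TOY_MAX_HELD];
	unsigned int      nr_inodes;
};

/* Save: remember every held buffer and every joined inode of tp. */
void toy_resources_save(struct toy_held_resources *res,
			const struct toy_trans *tp)
{
	unsigned int i;

	res->nr_bufs = 0;
	res->nr_inodes = 0;
	for (i = 0; i < tp->nr_bufs; i++)
		if (tp->bufs[i]->held)
			res->bufs[res->nr_bufs++] = tp->bufs[i];
	for (i = 0; i < tp->nr_inodes; i++)
		res->inodes[res->nr_inodes++] = tp->inodes[i];
}

/* Toy commit: detach everything from the transaction. */
void toy_trans_commit(struct toy_trans *tp)
{
	unsigned int i;

	for (i = 0; i < tp->nr_bufs; i++) {
		tp->bufs[i]->joined = 0;
		tp->bufs[i]->held = 0;
	}
	for (i = 0; i < tp->nr_inodes; i++)
		tp->inodes[i]->joined = 0;
	tp->nr_bufs = 0;
	tp->nr_inodes = 0;
}

/*
 * Restore: rejoin the saved resources to a new transaction and hold the
 * buffers again.  tp is assumed to be freshly set up and empty.
 */
void toy_resources_restore(const struct toy_held_resources *res,
			   struct toy_trans *tp)
{
	unsigned int i;

	for (i = 0; i < res->nr_bufs; i++) {
		res->bufs[i]->joined = 1;
		res->bufs[i]->held = 1;
		tp->bufs[tp->nr_bufs++] = res->bufs[i];
	}
	for (i = 0; i < res->nr_inodes; i++) {
		res->inodes[i]->joined = 1;
		tp->inodes[tp->nr_inodes++] = res->inodes[i];
	}
}
```

The point of the sketch is only that the same save/restore pair can serve both a runtime roll and recovery, as long as the save happens before the commit drops the old transaction's state.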

This cleans up the API a bit so that the defer capture system picks
up the held resources from the log item state just like a regular defer
roll.  It occurs to me that another way to handle intent item recovery
would be to create workqueue items for each recovered intent and then
kick the transactions off in background threads so that the end of all
the _recover functions would be xfs_trans_commit.  But that's a lot more
surgery on the recovery code and I don't want to go there right now.

Anyway, I'll send the RFC series shortly and cc you on it, in case you
want to incorporate it at the head of your series and nuke patch #2.

Note: the end of the xattri recovery function then becomes:

	error = xfs_defer_ops_capture_and_commit(tp, capture_list);
	if (error) {
		/* pretend I handled the error */
	}

	if (attr->xattri_dac.leaf_bp)
		xfs_buf_relse(attr->xattri_dac.leaf_bp);
	xfs_iunlock(ip, XFS_ILOCK_EXCL);
	xfs_irele(ip);
	return 0;

I think it's a bug that patch 4 doesn't release leaf_bp, since we
require code that holds a buffer across a commit to relse it explicitly.
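
To spell out the rule with a toy model (again userspace C, with invented demo_* names, and simplified from how I understand the buffer hold semantics): a commit unlocks every buffer except the ones marked held, so whoever asked for the hold still owns a locked buffer afterwards and has to release it.

```c
#include <assert.h>

/* Toy buffer; NOT struct xfs_buf. */
struct demo_buf {
	int locked;	/* 1 while someone owns the buffer lock */
	int held;	/* 1 if the owner asked to keep it across commit */
};

/*
 * Toy commit: buffers that are not held get unlocked here.  A held
 * buffer stays locked and the hold request is consumed, so the caller
 * is now responsible for it.
 */
void demo_commit(struct demo_buf **bufs, unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++) {
		if (bufs[i]->held)
			bufs[i]->held = 0;
		else
			bufs[i]->locked = 0;
	}
}

/* Explicit release, standing in for the role xfs_buf_relse() plays. */
void demo_buf_relse(struct demo_buf *bp)
{
	bp->locked = 0;
}
```

In this model, skipping the demo_buf_relse() call after the commit leaves the held buffer locked forever, which is the shape of the leak being pointed out.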

--D

> 
> > 
> > >   {
> > >   	struct xfs_mount		*mp = tp->t_mountp;
> > >   	struct xfs_defer_capture	*dfc;
> > > @@ -703,6 +705,9 @@ xfs_defer_ops_capture_and_commit(
> > >   	if (!dfc)
> > >   		return xfs_trans_commit(tp);
> > > +	if (bp && bp->b_transp == tp)
> > > +		list_add_tail(&bp->b_delay, &dfc->dfc_buffers);
> > > +
> > >   	/* Commit the transaction and add the capture structure to the list. */
> > >   	error = xfs_trans_commit(tp);
> > >   	if (error) {
> > > diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
> > > index 05472f71fffe..739f70d72fd5 100644
> > > --- a/fs/xfs/libxfs/xfs_defer.h
> > > +++ b/fs/xfs/libxfs/xfs_defer.h
> > > @@ -74,6 +74,7 @@ struct xfs_defer_capture {
> > >   	/* Deferred ops state saved from the transaction. */
> > >   	struct list_head	dfc_dfops;
> > > +	struct list_head	dfc_buffers;
> > >   	unsigned int		dfc_tpflags;
> > >   	/* Block reservations for the data and rt devices. */
> > > @@ -95,7 +96,8 @@ struct xfs_defer_capture {
> > >    * This doesn't normally happen except log recovery.
> > >    */
> > >   int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp,
> > > -		struct xfs_inode *capture_ip, struct list_head *capture_list);
> > > +		struct xfs_inode *capture_ip, struct list_head *capture_list,
> > > +		struct xfs_buf *bp);
> > >   void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp,
> > >   		struct xfs_inode **captured_ipp);
> > >   void xfs_defer_ops_release(struct xfs_mount *mp, struct xfs_defer_capture *d);
> > > diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> > > index 03159970133f..51ba8ee368ca 100644
> > > --- a/fs/xfs/xfs_bmap_item.c
> > > +++ b/fs/xfs/xfs_bmap_item.c
> > > @@ -532,7 +532,7 @@ xfs_bui_item_recover(
> > >   	 * Commit transaction, which frees the transaction and saves the inode
> > >   	 * for later replay activities.
> > >   	 */
> > > -	error = xfs_defer_ops_capture_and_commit(tp, ip, capture_list);
> > > +	error = xfs_defer_ops_capture_and_commit(tp, ip, capture_list, NULL);
> > >   	if (error)
> > >   		goto err_unlock;
> > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > index 047bd6e3f389..29b4655a0a65 100644
> > > --- a/fs/xfs/xfs_buf.c
> > > +++ b/fs/xfs/xfs_buf.c
> > > @@ -233,6 +233,7 @@ _xfs_buf_alloc(
> > >   	init_completion(&bp->b_iowait);
> > >   	INIT_LIST_HEAD(&bp->b_lru);
> > >   	INIT_LIST_HEAD(&bp->b_list);
> > > +	INIT_LIST_HEAD(&bp->b_delay);
> > >   	INIT_LIST_HEAD(&bp->b_li_list);
> > >   	sema_init(&bp->b_sema, 0); /* held, no waiters */
> > >   	spin_lock_init(&bp->b_lock);
> > > diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
> > > index 6b0200b8007d..c51445705dc6 100644
> > > --- a/fs/xfs/xfs_buf.h
> > > +++ b/fs/xfs/xfs_buf.h
> > > @@ -151,6 +151,7 @@ struct xfs_buf {
> > >   	int			b_io_error;	/* internal IO error state */
> > >   	wait_queue_head_t	b_waiters;	/* unpin waiters */
> > >   	struct list_head	b_list;
> > > +	struct list_head	b_delay;	/* delayed operations list */
> > >   	struct xfs_perag	*b_pag;		/* contains rbtree root */
> > >   	struct xfs_mount	*b_mount;
> > >   	struct xfs_buftarg	*b_target;	/* buffer target (device) */
> > 
> > The bare list-conveyance machinery looks fine to me, but adding 16 bytes
> > to struct xfs_buf for something that only happens during log recovery is
> > rather expensive.  Can you reuse b_list for this purpose?  I think the
> > > only users of b_list are the buffer delwri functions, which shouldn't be
> > active here since the xattr recovery mechanism (a) holds the buffer lock
> > and (b) doesn't itself use delwri buffer lists for xattr leaf blocks.
> > 
> > (The AIL uses delwri lists, but it won't touch a locked buffer.)
> > 
> Sure, it sounds like it would work, will try it out.
> 
> > > diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> > > index 3f8a0713573a..046f21338c48 100644
> > > --- a/fs/xfs/xfs_extfree_item.c
> > > +++ b/fs/xfs/xfs_extfree_item.c
> > > @@ -637,7 +637,7 @@ xfs_efi_item_recover(
> > >   	}
> > > -	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list);
> > > +	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list, NULL);
> > >   abort_error:
> > >   	xfs_trans_cancel(tp);
> > > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > > index 10562ecbd9ea..6a3c0bb16b69 100644
> > > --- a/fs/xfs/xfs_log_recover.c
> > > +++ b/fs/xfs/xfs_log_recover.c
> > > @@ -2465,6 +2465,7 @@ xlog_finish_defer_ops(
> > >   	struct list_head	*capture_list)
> > >   {
> > >   	struct xfs_defer_capture *dfc, *next;
> > > +	struct xfs_buf		*bp, *bnext;
> > >   	struct xfs_trans	*tp;
> > >   	struct xfs_inode	*ip;
> > >   	int			error = 0;
> > > @@ -2489,6 +2490,12 @@ xlog_finish_defer_ops(
> > >   			return error;
> > >   		}
> > > +		list_for_each_entry_safe(bp, bnext, &dfc->dfc_buffers, b_delay) {
> > > +			xfs_trans_bjoin(tp, bp);
> > > +			xfs_trans_bhold(tp, bp);
> > > +			list_del_init(&bp->b_delay);
> > > +		}
> > 
> > Why isn't this in xfs_defer_ops_continue, like the code that extracts
> > the inodes from the capture struct and hands them back to the caller?
> It's just what was discussed in the last review, is all.  That does look like
> a better place for it though.  Will move there.
> 
> Allison
> 
> > 
> > > +
> > >   		/*
> > >   		 * Transfer to this new transaction all the dfops we captured
> > >   		 * from recovering a single intent item.
> > > diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c
> > > index 46904b793bd4..a6e7351ca4f9 100644
> > > --- a/fs/xfs/xfs_refcount_item.c
> > > +++ b/fs/xfs/xfs_refcount_item.c
> > > @@ -557,7 +557,7 @@ xfs_cui_item_recover(
> > >   	}
> > >   	xfs_refcount_finish_one_cleanup(tp, rcur, error);
> > > -	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list);
> > > +	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list, NULL);
> > >   abort_error:
> > >   	xfs_refcount_finish_one_cleanup(tp, rcur, error);
> > > diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
> > > index 5f0695980467..8c70a4af80a9 100644
> > > --- a/fs/xfs/xfs_rmap_item.c
> > > +++ b/fs/xfs/xfs_rmap_item.c
> > > @@ -587,7 +587,7 @@ xfs_rui_item_recover(
> > >   	}
> > >   	xfs_rmap_finish_one_cleanup(tp, rcur, error);
> > > -	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list);
> > > +	return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list, NULL);
> > >   abort_error:
> > >   	xfs_rmap_finish_one_cleanup(tp, rcur, error);
> > > -- 
> > > 2.25.1
> > > 


