Re: [PATCH 4/7] xfs: automatic freeing of freshly allocated unwritten space

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 05, 2023 at 02:47:50PM +1100, Dave Chinner wrote:
> On Tue, Sep 26, 2023 at 04:32:09PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> > 
> > As mentioned in the previous commit, online repair wants to allocate
> > space to write out a new metadata structure, and it also wants to hedge
> > against system crashes during repairs by logging (and later cancelling)
> > EFIs to free the space if we crash before committing the new data
> > structure.
> > 
> > Therefore, create a trio of functions to schedule automatic reaping of
> > freshly allocated unwritten space.  xfs_alloc_schedule_autoreap creates
> > a paused EFI representing the space we just allocated.  Once the
> > allocations are made and the autoreaps scheduled, we can start writing
> > to disk.
> > 
> > If the writes succeed, xfs_alloc_cancel_autoreap marks the EFI work
> > items as stale and unpauses the pending deferred work item.  Assuming
> > that's done in the same transaction that commits the new structure into
> > the filesystem, we guarantee that either the new object is fully
> > visible, or that all the space gets reclaimed.
> > 
> > If the writes succeed but only part of an extent was used, repair must
> > call the same _cancel_autoreap function to kill the first EFI and then
> > log a new EFI to free the unused space.  The first EFI is already
> > committed, so it cannot be changed.
> > 
> > For full extents that aren't used, xfs_alloc_commit_autoreap will
> > unpause the EFI, which results in the space being freed during the next
> > _defer_finish cycle.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> > ---
> >  fs/xfs/libxfs/xfs_alloc.c |  104 +++++++++++++++++++++++++++++++++++++++++++--
> >  fs/xfs/libxfs/xfs_alloc.h |   12 +++++
> >  fs/xfs/xfs_extfree_item.c |   11 +++--
> >  3 files changed, 120 insertions(+), 7 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index 295d11a27f632..c1ee1862cc1af 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -2501,14 +2501,15 @@ xfs_defer_agfl_block(
> >   * Add the extent to the list of extents to be free at transaction end.
> >   * The list is maintained sorted (by block number).
> >   */
> > -int
> > -xfs_free_extent_later(
> > +static int
> > +__xfs_free_extent_later(
> >  	struct xfs_trans		*tp,
> >  	xfs_fsblock_t			bno,
> >  	xfs_filblks_t			len,
> >  	const struct xfs_owner_info	*oinfo,
> >  	enum xfs_ag_resv_type		type,
> > -	bool				skip_discard)
> > +	bool				skip_discard,
> > +	struct xfs_defer_pending	**dfpp)
> 
> I was happy that __xfs_free_extent_later() went away in the last
> patch. 
> 
> I am sad that it has immediately come back....
> 
> Can we please name this after what it does? Say
> xfs_defer_extent_free()?

Done.  Thank you for a better name! :)

> >  {
> >  	struct xfs_extent_free_item	*xefi;
> >  	struct xfs_mount		*mp = tp->t_mountp;
> > @@ -2556,10 +2557,105 @@ xfs_free_extent_later(
> >  			XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
> >  
> >  	xfs_extent_free_get_group(mp, xefi);
> > -	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
> > +	*dfpp = xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_FREE, &xefi->xefi_list);
> >  	return 0;
> >  }
> >  
> > +int
> > +xfs_free_extent_later(
> > +	struct xfs_trans		*tp,
> > +	xfs_fsblock_t			bno,
> > +	xfs_filblks_t			len,
> > +	const struct xfs_owner_info	*oinfo,
> > +	enum xfs_ag_resv_type		type,
> > +	bool				skip_discard)
> > +{
> > +	struct xfs_defer_pending	*dontcare = NULL;
> > +
> > +	return __xfs_free_extent_later(tp, bno, len, oinfo, type, skip_discard,
> > +			&dontcare);
> > +}
> > +
> > +/*
> > + * Set up automatic freeing of unwritten space in the filesystem.
> > + *
> > + * This function attached a paused deferred extent free item to the
> > + * transaction.  Pausing means that the EFI will be logged in the next
> > + * transaction commit, but the pending EFI will not be finished until the
> > + * pending item is unpaused.
> > + *
> > + * If the system goes down after the EFI has been persisted to the log but
> > + * before the pending item is unpaused, log recovery will find the EFI, fail to
> > + * find the EFD, and free the space.
> > + *
> > + * If the pending item is unpaused, the next transaction commit will log an EFD
> > + * without freeing the space.
> > + *
> > + * Caller must ensure that the tp, fsbno, len, oinfo, and resv flags of the
> > + * @args structure are set to the relevant values.
> > + */
> > +int
> > +xfs_alloc_schedule_autoreap(
> > +	const struct xfs_alloc_arg	*args,
> > +	bool				skip_discard,
> > +	struct xfs_alloc_autoreap	*aarp)
> > +{
> > +	int				error;
> > +
> > +	error = __xfs_free_extent_later(args->tp, args->fsbno, args->len,
> > +			&args->oinfo, args->resv, skip_discard, &aarp->dfp);
> > +	if (error)
> > +		return error;
> > +
> > +	xfs_defer_item_pause(args->tp, aarp->dfp);
> > +	return 0;
> > +}
> 
> Then this becomes much more readable - xfs_defer_free_extent()
> returns the defer_pending work item for the EFI, which we then
> immediately pause....

<nod>

> > +
> > +/*
> > + * Cancel automatic freeing of unwritten space in the filesystem.
> > + *
> > + * Earlier, we created a paused deferred extent free item and attached it to
> > + * this transaction so that we could automatically roll back a new space
> > + * allocation if the system went down.  Now we want to cancel the paused work
> > + * item by marking the EFI stale so we don't actually free the space, unpausing
> > + * the pending item and logging an EFD.
> > + *
> > + * The caller generally should have already mapped the space into the ondisk
> > + * filesystem.  If the reserved space was partially used, the caller must call
> > + * xfs_free_extent_later to create a new EFI to free the unused space.
> > + */
> > +void
> > +xfs_alloc_cancel_autoreap(
> > +	struct xfs_trans		*tp,
> > +	struct xfs_alloc_autoreap	*aarp)
> > +{
> > +	struct xfs_defer_pending	*dfp = aarp->dfp;
> > +	struct xfs_extent_free_item	*xefi;
> > +
> > +	if (!dfp)
> > +		return;
> > +
> > +	list_for_each_entry(xefi, &dfp->dfp_work, xefi_list)
> > +		xefi->xefi_flags |= XFS_EFI_STALE;
> > +
> > +	xfs_defer_item_unpause(tp, dfp);
> > +}
> 
> Hmmmm. I see what you are trying to do here, though I'm not sure
> that "stale" is the right word to describe it. The EFI has been
> cancelled, so we want and EFD to be logged without freeing the
> extent.
> 
> To me, "stale" means "contents longer valid, do not touch" as per
> XFS_ISTALE, xfs_buf_stale(), etc
> 
> Whereas we use "cancelled" to indicate that the pending operations
> on the buffer should not be actioned, such as XFS_BLF_CANCEL buffer
> items in log recovery to indicate the buffer has been freed and
> while it is cancelled we should not replay the changes found in the
> log for that buffer....
> 
> Hence I think this is better named XFS_EFI_CANCELLED, and it also
> matches what the calling function is doing - cancelling the autoreap
> of the extent....

Funny story -- CANCELLED was the original name for this flag.  I'll put
it back.

> > diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> > index 9e7b58f3566c0..98c2667d369e8 100644
> > --- a/fs/xfs/xfs_extfree_item.c
> > +++ b/fs/xfs/xfs_extfree_item.c
> > @@ -392,9 +392,14 @@ xfs_trans_free_extent(
> >  	trace_xfs_bmap_free_deferred(tp->t_mountp, xefi->xefi_pag->pag_agno, 0,
> >  			agbno, xefi->xefi_blockcount);
> >  
> > -	error = __xfs_free_extent(tp, xefi->xefi_pag, agbno,
> > -			xefi->xefi_blockcount, &oinfo, xefi->xefi_agresv,
> > -			xefi->xefi_flags & XFS_EFI_SKIP_DISCARD);
> > +	if (xefi->xefi_flags & XFS_EFI_STALE) {
> > +		error = 0;
> > +	} else {
> > +		error = __xfs_free_extent(tp, xefi->xefi_pag, agbno,
> > +				xefi->xefi_blockcount, &oinfo,
> > +				xefi->xefi_agresv,
> > +				xefi->xefi_flags & XFS_EFI_SKIP_DISCARD);
> > +	}
> 
> Just init error to 0 when it is declared, then this becomes:
> 
> 	if (!(xefi->xefi_flags & XFS_EFI_CANCELLED)) {
> 		error = __xfs_free_extent(tp, xefi->xefi_pag, agbno,
> 				xefi->xefi_blockcount, &oinfo,
> 				xefi->xefi_agresv,
> 				xefi->xefi_flags & XFS_EFI_SKIP_DISCARD);
> 	}

<nod> Eventually the rt reflink patch will trip over this (since rtdev
vs. datadev selection is also an xefi_flag) but that'll come later.

> Otherwise the code looks ok.

Yay!

--D

> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux