Re: [PATCH RFC 1/2] xfs: return committed status from xfs_trans_roll()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 28, 2015 at 10:40:09AM +1000, Dave Chinner wrote:
> [ reply to both patches in one reply, because it's related. ]
> 
> On Thu, Jul 23, 2015 at 04:13:29PM -0400, Brian Foster wrote:
> > Some callers need to make error handling decisions based on whether the
> > current transaction successfully committed or not. Rename
> > xfs_trans_roll(), add a new parameter and provide a wrapper to preserve
> > existing callers.
> > 
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_trans.c | 15 +++++++++++++--
> >  fs/xfs/xfs_trans.h |  1 +
> >  2 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > index 0582a27..a0ab1da 100644
> > --- a/fs/xfs/xfs_trans.c
> > +++ b/fs/xfs/xfs_trans.c
> > @@ -1019,9 +1019,10 @@ xfs_trans_cancel(
> >   * chunk we've been working on and get a new transaction to continue.
> >   */
> >  int
> > -xfs_trans_roll(
> > +__xfs_trans_roll(
...
> So xfs_qm_dqalloc(), on error, simply cancels the freelist and
> returns the error - it ignores the committed flag. So the committed
> flag is used to determine how to handle the dquot buffer:
> 
>         xfs_trans_bhold(tp, bp);
> 
>         if ((error = xfs_bmap_finish(tpp, &flist, &committed))) {
>                 goto error1;
>         }
> 
>         if (committed) {
>                 tp = *tpp;
>                 xfs_trans_bjoin(tp, bp);
>         } else {
>                 xfs_trans_bhold_release(tp, bp);
>         }
> 
> Now, xfs_trans_bhold() simply sets the BLI_HOLD flag on the buffer,
> and transaction commit clears it. That means we could call it
> unconditionally, and all we have to do is remove an assert(BLI_HOLD)
> from xfs_trans_bhold_release() to enable that. Given this is the
> only caller of xfs_trans_bhold_release(), I don't see a problem with
> doing that. Nor is there a problem with multiple joins of an object
> to a transaction, so we could simply make this:
> 
> 	xfs_trans_bhold(tp, bp);
> 	error = xfs_bmap_finish(tpp, &flist, &committed);
> 	if (error)
> 		goto error1;
> 	tp = *tpp;
> 	xfs_trans_bjoin(tp, bp);
> 	xfs_trans_bhold_release(tp, bp);
> 
> And the committed flag is not necessary. With this, we can hide
> everything to do with error on commit vs error after commit inside
> xfs_bmap_finish()...
> 

I'm starting to look back at this and try to understand the intent here
(FYI, I've appended some of the xfs_bmap_cancel() fixes to v3 of the
EFI/EFD series, to be posted shortly). Is the proposition that we should
be able to remove the committed parameter from xfs_bmap_finish()
entirely?

If so, the buffer hold technique makes some sense here, but not
necessarily the potential for multiple joins of the object to the
transaction that this kind of usage introduces. At least, there are
currently assertions to protect against that kind of thing in the
buffer/inode join handlers that would have to be removed. In the case of
a buffer join, it looks like there's some reference counting that would
have to be rectified against this type of usage (I take it there is some
appropriate use for this that I'm not familiar with). Both the buffer
and inode cases also look like they'd introduce multiple calls to
xfs_trans_add_item(), which I'm not sure is totally safe at a glance.
The fact that an xfs_log_item points to a single xfs_log_item_desc looks
a bit suspicious to me, but I'd have to take a harder look at that.

Even if it were completely safe, IMO we obfuscate the context of code as
shown above because the object could be added to the new transaction or
re-added to the original depending on whether the previous transaction
freed some blocks or not. In other words, I think a multiple join of an
object has to be made completely idempotent in order to justify use of a
pattern like that shown above. I'm not sure it's worth doing that just
to kill a flag. Thoughts?

Brian

> >  	efd = xfs_trans_get_efd(*tp, efi, flist->xbf_count);
> >  	for (free = flist->xbf_first; free != NULL; free = next) {
> >  		next = free->xbfi_next;
> > -		if ((error = xfs_free_extent(*tp, free->xbfi_startblock,
> > -				free->xbfi_blockcount))) {
> > +
> > +		/*
> > +		 * Free the extent if the above trans roll hasn't failed and log
> > +		 * the EFD before handling errors from either call to ensure the
> > +		 * EFI reference is accounted for in the tp. Otherwise, the EFI
> > +		 * is never released on abort and pins the AIL indefinitely.
> > +		 */
> > +		if (!error)
> > +			error = xfs_free_extent(*tp, free->xbfi_startblock,
> > +						free->xbfi_blockcount);
> > +		xfs_trans_log_efd_extent(*tp, efd, free->xbfi_startblock,
> > +					 free->xbfi_blockcount);
> > +		if (error) {
> >  			/*
> >  			 * The bmap free list will be cleaned up at a
> >  			 * higher level.  The EFI will be canceled when
> > @@ -118,11 +134,10 @@ xfs_bmap_finish(
> >  						   SHUTDOWN_META_IO_ERROR);
> >  			return error;
> >  		}
> > -		xfs_trans_log_efd_extent(*tp, efd, free->xbfi_startblock,
> > -			free->xbfi_blockcount);
> >  		xfs_bmap_del_free(flist, NULL, free);
> >  	}
> > -	return 0;
> > +
> > +	return error;
> 
> This loop doesn't obviously do the right thing now. It
> will log the first extent into the EFD and then trigger a shutdown
> and return. The extent count in the EFD may not match the
> extent count on the EFI, so releasing the EFD at this point may not
> release all the extents in the EFI and hence not release the EFI.
> 
> I think I'd prefer to see a xfs_trans_cancel_efi() call to handle
> this error path rather than having to go through the efd to release
> the reference on the EFI. i.e.
> 
> 	error = __xfs_trans_roll(tp, NULL, &committed);
> 	if (error) {
> 		if (committed) {
> 			if (!XFS_FORCED_SHUTDOWN(mp))
> 				xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> 			xfs_efi_cancel(tp, efi);
> 		}
> 		return error;
> 	}
> 
> /*
>  * Cancel a committed EFI after triggering a shutdown. We need to do
>  * this so that the committed EFI is removed from the AIL and freed,
>  * otherwise unmount will hang due to a non-empty AIL.
>  */
> xfs_efi_cancel()
> {
> 	ASSERT(XFS_FORCED_SHUTDOWN(mp));
> 	xfs_efi_release(efi, efi->nextents);
> }
> 
> Doing this means there's less complexity on the EFD abort side of
> things, as we don't have to now handle this failure case via
> transaction completion callback interface.
> 
> Hmmm - something I just noticed: if we only have one EFD per EFI,
> why do we do we have that layer of extent counting before dropping
> real references?
> 
> >  xfs_efd_item_unlock(
> >  	struct xfs_log_item	*lip)
> >  {
> > -	if (lip->li_flags & XFS_LI_ABORTED)
> > -		xfs_efd_item_free(EFD_ITEM(lip));
> > +	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
> > +
> > +	if (lip->li_flags & XFS_LI_ABORTED) {
> > +		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
> > +		xfs_efd_item_free(efdp);
> > +	}
> >  }
> 
> i.e. we always call xfs_efi_release() with efi_nextents or
> efd_nextents, which are always the same, and so we never partially
> complete an EFI. Should we just kill that layer, as it does tend to
> complicate the EFI release code?
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs



[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux