Re: [PATCH 4/4] xfs: introduce inode unlink log item

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 01, 2020 at 10:32:19AM -0400, Brian Foster wrote:
> On Tue, Jun 23, 2020 at 07:50:15PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > Tracking dirty inodes via cluster buffers creates lock ordering
> > issues with logging unlinked inode updates direct to the cluster
> > buffer. The unlinked inode list is unordered, so we can lock cluster
> > buffers in random orders and that causes deadlocks.
> > 
> > To solve this problem, we really want to dealy locking the cluster
> > buffers until the pre-commit phase where we can order the buffers
> > correctly along with all the other inode cluster buffers that are
> > locked by the transaction. However, to do this we need to be able to
> > tell the transaction which inodes need to have there unlinked list
> > updated and what it should be updated to.
> > 
> > We can delay the buffer update to the pre-commit phase based on the
> > fact taht all unlinked inode list updates are serialised by the AGI
> > buffer. It will be locked into the transaction before the list
> > update starts, and will remain locked until the transaction commits.
> > Hence we can lock and update the cluster buffers safely any time
> > during the transaction and we are still safe from other racing
> > unlinked list updates.
> > 
> > The iunlink log item currently only exists in memory. we need a log
> > item to attach information to the transaction, but it's context
> > is completely owned by the transaction. Hence it is never formatted
> > or inserted into the CIL, nor is it seen by the journal, the AIL or
> > log recovery.
> > 
> > This makes it a very simple log item, and the changes makes results
> > in adding addition buffer log items to the transaction. Hence once
> > the iunlink log item has run it's pre-commit operation, it can be
> > dropped by the transaction and released.
> > 
> > The creation of this in-memory intent does not prevent us from
> > extending it in future to the journal to replace buffer based
> > logging of the unlinked list. Changing the format of the items we
> > write to the on disk journal is beyond the scope of this patchset,
> > hence we limit it to being in-memory only.
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/Makefile           |   1 +
> >  fs/xfs/xfs_inode.c        |  70 +++----------------
> >  fs/xfs/xfs_inode_item.c   |   3 +-
> >  fs/xfs/xfs_iunlink_item.c | 141 ++++++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_iunlink_item.h |  24 +++++++
> >  fs/xfs/xfs_super.c        |  10 +++
> >  6 files changed, 189 insertions(+), 60 deletions(-)
> >  create mode 100644 fs/xfs/xfs_iunlink_item.c
> >  create mode 100644 fs/xfs/xfs_iunlink_item.h
> > 
> ...
> > diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> > index 0494b907c63d..bc1970c37edc 100644
> > --- a/fs/xfs/xfs_inode_item.c
> > +++ b/fs/xfs/xfs_inode_item.c
> > @@ -488,8 +488,9 @@ xfs_inode_item_push(
> >  	ASSERT(iip->ili_item.li_buf);
> >  
> >  	if (xfs_ipincount(ip) > 0 || xfs_buf_ispinned(bp) ||
> > -	    (ip->i_flags & XFS_ISTALE))
> > +	    (ip->i_flags & XFS_ISTALE)) {
> >  		return XFS_ITEM_PINNED;
> > +	}
> 
> Spurious change..?

*nod*

Likely a left over from adding debug traceprints...

> >  	/* If the inode is already flush locked, we're already flushing. */
> >  	if (xfs_iflags_test(ip, XFS_IFLUSHING))
> > diff --git a/fs/xfs/xfs_iunlink_item.c b/fs/xfs/xfs_iunlink_item.c
> > new file mode 100644
> > index 000000000000..83f1dc81133b
> > --- /dev/null
> > +++ b/fs/xfs/xfs_iunlink_item.c
> > @@ -0,0 +1,141 @@
> ...
> > +
> > +static const struct xfs_item_ops xfs_iunlink_item_ops = {
> > +	.flags		= XFS_ITEM_RELEASE_WHEN_COMMITTED,
> > +	.iop_release	= xfs_iunlink_item_release,
> 
> Presumably we need the release callback for transaction abort, but the
> flag looks unnecessary. That triggers a release on commit to the on-disk
> log, which IIUC should never happen for this item.

You are probably right - I didn't look that further than "it should
be freed at commit time" and the flag name implies it is freed at
commit time.

Which, of course, then raises the question: "Which commit are we
talking about here?". But because it's RFC work at this point I
didn't bother chasing that detail down because the code worked and I
had other things to do.....

> > +	.iop_sort	= xfs_iunlink_item_sort,
> > +	.iop_precommit	= xfs_iunlink_item_precommit,
> > +};
> > +
> > +
> > +/*
> > + * Initialize the inode log item for a newly allocated (in-core) inode.
> > + *
> > + * Inode extents can only reside within an AG. Hence specify the starting
> > + * block for the inode chunk by offset within an AG as well as the
> > + * length of the allocated extent.
> > + *
> > + * This joins the item to the transaction and marks it dirty so
> > + * that we don't need a separate call to do this, nor does the
> > + * caller need to know anything about the iunlink item.
> > + */
> 
> Looks like some copy/paste remnants in the comment.

Yup, I did just copy-pasta at lot of stuff around here...

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux