Re: [PATCH 03/30] xfs: add an inode item lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 04, 2020 at 11:54:56AM +1000, Dave Chinner wrote:
> On Tue, Jun 02, 2020 at 12:34:44PM -0400, Brian Foster wrote:
> > On Tue, Jun 02, 2020 at 07:42:24AM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > ...
> > > @@ -122,23 +117,30 @@ xfs_trans_log_inode(
> > >  	 * set however, then go ahead and bump the i_version counter
> > >  	 * unconditionally.
> > >  	 */
> > > -	if (!test_and_set_bit(XFS_LI_DIRTY, &ip->i_itemp->ili_item.li_flags) &&
> > > -	    IS_I_VERSION(VFS_I(ip))) {
> > > -		if (inode_maybe_inc_iversion(VFS_I(ip), flags & XFS_ILOG_CORE))
> > > -			flags |= XFS_ILOG_CORE;
> > > +	if (!test_and_set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags)) {
> > > +		if (IS_I_VERSION(inode) &&
> > > +		    inode_maybe_inc_iversion(inode, flags & XFS_ILOG_CORE))
> > > +			iversion_flags = XFS_ILOG_CORE;
> > >  	}
> > >  
> > > -	tp->t_flags |= XFS_TRANS_DIRTY;
> > > +	/*
> > > +	 * Record the specific change for fdatasync optimisation. This allows
> > > +	 * fdatasync to skip log forces for inodes that are only timestamp
> > > +	 * dirty. We do this before the change count so that the core being
> > > +	 * logged in this case does not impact on fdatasync behaviour.
> > > +	 */
> > 
> > We no longer do this before the change count logic so that part of the
> > comment is bogus.
> 
> Ugh. Another 6 patch conflicts to resolve coming right up....
> 
> > > +	spin_lock(&iip->ili_lock);
> > > +	iip->ili_fsync_fields |= flags;
> > >  
> > >  	/*
> > > -	 * Always OR in the bits from the ili_last_fields field.
> > > -	 * This is to coordinate with the xfs_iflush() and xfs_iflush_done()
> > > -	 * routines in the eventual clearing of the ili_fields bits.
> > > -	 * See the big comment in xfs_iflush() for an explanation of
> > > -	 * this coordination mechanism.
> > > +	 * Always OR in the bits from the ili_last_fields field.  This is to
> > > +	 * coordinate with the xfs_iflush() and xfs_iflush_done() routines in
> > > +	 * the eventual clearing of the ili_fields bits.  See the big comment in
> > > +	 * xfs_iflush() for an explanation of this coordination mechanism.
> > >  	 */
> > > -	flags |= ip->i_itemp->ili_last_fields;
> > > -	ip->i_itemp->ili_fields |= flags;
> > > +	iip->ili_fields |= (flags | iip->ili_last_fields |
> > > +			    iversion_flags);
> > > +	spin_unlock(&iip->ili_lock);
> > >  }
> > >  
> > >  int
> > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > > index 403c90309a8ff..0abf770b77498 100644
> > > --- a/fs/xfs/xfs_file.c
> > > +++ b/fs/xfs/xfs_file.c
> > > @@ -94,6 +94,7 @@ xfs_file_fsync(
> > >  {
> > >  	struct inode		*inode = file->f_mapping->host;
> > >  	struct xfs_inode	*ip = XFS_I(inode);
> > > +	struct xfs_inode_log_item *iip = ip->i_itemp;
> > >  	struct xfs_mount	*mp = ip->i_mount;
> > >  	int			error = 0;
> > >  	int			log_flushed = 0;
> > > @@ -137,13 +138,15 @@ xfs_file_fsync(
> > >  	xfs_ilock(ip, XFS_ILOCK_SHARED);
> > >  	if (xfs_ipincount(ip)) {
> > >  		if (!datasync ||
> > > -		    (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> > > -			lsn = ip->i_itemp->ili_last_lsn;
> > > +		    (iip->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> > > +			lsn = iip->ili_last_lsn;
> > 
> > I am still a little confused why the lock is elided in other read cases,
> > such as this one or perhaps the similar check in xfs_bmbt_to_iomap()..?
> 
> They are still all serialised against those field changing the same
> way they currently are. i.e. they are all under the ILOCK, so
> changes that occur during IO submission will never occur.  Hence the
> only thing that we can race with is IO completion clearing the
> fields, in which case the subsequent operations if the item is now
> clean turn into no-ops.
> 
> i.e:
> - ILOCK serialises transaction logging vs IO submission.
> - iflock serialises IO submission vs IO completion.
> - Nothing serialises transaction logging vs IO completion.
> 
> The latter is what the ili_lock is intended for; everything else is
> still protected by the existing serialisation mechanisms that they
> are now. Any races in areas outside xfs_trans_log_inode() vs
> xfs_iflush_done/abort() is largely outside the scope of this patch
> and this lock...
> 

Ok, but in this particular case we use the ili_lock around the
ili_fsync_fields reset (but not the read in the same function), and that
field is cleared when the inode is flushed. Is the lock used here for
the abort case?

I think I'll probably have to get through the rest of the series, see
how the lock is used with the logging changes in place, and then come
back and see if I can grok this aspect of it a little better..

> > Similarly, it looks like we set the ili_[flush|last]_lsn fields outside
> > of this lock (though last_lsn looks like it's also covered by ilock),
> > yet the update to the inode_log_item struct implies they should be
> > protected. What's the intent there?
> 
> The lsn fields are updated via xfs_trans_ail_lsn_copy(), which on 32
> bit systems takes the AIL lock, and I don't think it's a good idea
> to put the AIL lock inside the inode item lock.
> 

Ok.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux