Re: [PATCH] xfs: optimise away log forces on timestamp updates for fdatasync

Brian Foster <bfoster@xxxxxxxxxx> · Mon, 26 Oct 2015 07:49:29 -0400

On Mon, Oct 26, 2015 at 04:07:20PM +1100, Dave Chinner wrote:
> On Thu, Oct 22, 2015 at 01:36:19PM -0400, Brian Foster wrote:
> > On Wed, Oct 21, 2015 at 01:59:03PM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > 
> > > xfs: timestamp updates cause excessive fdatasync log traffic
> ....
> > > +++ b/fs/xfs/xfs_file.c
> > > @@ -248,8 +248,10 @@ xfs_file_fsync(
> > >  	xfs_ilock(ip, XFS_ILOCK_SHARED);
> > >  	if (xfs_ipincount(ip)) {
> > >  		if (!datasync ||
> > > -		    (ip->i_itemp->ili_fields & ~XFS_ILOG_TIMESTAMP))
> > > +		    (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP)) {
> > >  			lsn = ip->i_itemp->ili_last_lsn;
> > > +			ip->i_itemp->ili_fsync_fields = 0;
> > > +		}
> > 
> > Ok, so we check what's been logged since the last fsync that forced the
> > log. If anything other than the timestamp has been logged, we force the
> > log and clear the fields. Seems like a reasonable optimization to me.
> > 
> > One question... is it safe to clear the ili_fsync fields here if we have
> > parallel fsync()/fdatasync() calls coming in? This is under the shared
> > ilock, so assume that one fsync() comes in and finds non-timestamp
> > changes to flush. It grabs the lsn, clears the flags and calls the log
> > force. If an fdatasync() comes in before the log force completes,
> > shouldn't it wait?
> 
> Probably, but the only way to do that is to run a log force on that
> same lsn. Actually, it is safe to do that log force while holding
> the XFS_ILOCK (xfs_trans_commit() does that for synchronous
> transactions), so we should simply be able to do:
> 

Yeah, a parallel fsync on a single inode situation is probably not the
most likely thing. Thinking about it further, the case where multiple
inodes happen to have the same lsn might be more likely and takes the
shared ilock out of the equation.

> 	xfs_ilock(ip, XFS_ILOCK_SHARED);
> 	if (xfs_ipincount(ip)) {
> 		if (!datasync ||
> 		    (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> 			lsn = ip->i_itemp->ili_last_lsn;
> 	}
> 
> 	if (lsn) {
> 		error = _xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
> 		ip->i_itemp->ili_fsync_fields = 0;
> 	}
> 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
> 

This looks better to me. We send the caller through the log force
infrastructure until we know the log has been forced.

> 
> > Also, is it me or are we sending an unconditional flush in the hunk
> > following the log force call in xfs_file_fsync() (even if we've skipped
> > the log force)?
> 
> The flush is needed - fdatasync needs to guarantee the data is
> on stable storage even if no metadata needs to be written to the
> journal.
> 

Ok. Well it's too bad we don't get any feedback about what was written
from the filemap_write_and_wait_range() call. As it is, we send a flush
even if there's nothing to write back. I suppose that's also not a major
problem since an fsync() caller implies something was modified one way
or another. Thanks.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs