On Mon, Aug 31, 2015 at 05:40:04AM -0700, Sage Weil wrote:
> On Mon, 31 Aug 2015, Dave Chinner wrote:
> > After taking a tangent to find a tracepoint regression that was
> > getting in my way, I found that there was a significant pause
> > between the inode locking calls within xfs_file_fsync and the inode
> > locking calls on the buffered write. Roughly 8ms, in fact, on almost
> > every call. After adding a couple more test trace points into the
> > XFS fsync code, it turns out that a hardware cache flush is causing
> > the delay. That is, because we aren't doing log writes that trigger
> > cache flushes and FUA writes, we have to issue a
> > blkdev_issue_flush() call from xfs_file_fsync and that is taking 8ms
> > to complete.
>
> This is where my understanding of block layer flushing really breaks
> down, but in both cases we're issuing flush requests to the hardware,
> right? Is the difference that the log write is a FUA flush request
> with data, and blkdev_issue_flush() issues a flush request without
> associated data?

Pretty much, though the log write also does a cache flush before the
FUA write. i.e. the log writes consist of a bio with data issued via:

	submit_bio(REQ_FUA | REQ_FLUSH | WRITE_SYNC, bio);

while blkdev_issue_flush consists of an empty bio issued via:

	submit_bio(REQ_FLUSH | WRITE_SYNC, bio);

So from a block layer and filesystem point of view there is little
difference, and the only difference at the SCSI layer is the WRITE w/
FUA that is issued after the cache flush in the log write case (see
https://lwn.net/Articles/400541/ for a bit more background).

I haven't looked any deeper than this so far - I don't have time
right now to do so...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
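
For anyone wanting to reproduce the measurement Dave describes, here is a
minimal userspace sketch (not from the thread itself; the file path under
/mnt/test and the iteration count are assumptions). It repeatedly does a
small buffered overwrite followed by fsync(), which on XFS goes through
xfs_file_fsync() and, when no log write carrying REQ_FLUSH/REQ_FUA is
needed, falls back to blkdev_issue_flush() - the per-iteration latency it
prints is dominated by the hardware cache flush being discussed:

/*
 * fsync-probe.c: time each fsync() after a small buffered overwrite.
 * Sketch only; path and iteration count are arbitrary assumptions.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/mnt/test/fsync-probe";	/* assumed XFS mount */
	char buf[4096];
	struct timespec t0, t1;
	int fd, i;

	memset(buf, 'x', sizeof(buf));
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (i = 0; i < 100; i++) {
		/* overwrite in place: no metadata change, so no log write */
		if (pwrite(fd, buf, sizeof(buf), 0) != sizeof(buf)) {
			perror("pwrite");
			return 1;
		}
		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (fsync(fd) < 0) {	/* cache flush is issued here */
			perror("fsync");
			return 1;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		printf("fsync %3d: %ld us\n", i,
		       (long)((t1.tv_sec - t0.tv_sec) * 1000000L +
			      (t1.tv_nsec - t0.tv_nsec) / 1000));
	}
	close(fd);
	return 0;
}

On a device with a volatile write cache each iteration should show a
flush-dominated latency (~8ms on the hardware above); with the cache
disabled or on a cache-less device the numbers should drop sharply.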