Re: [PATCH 2/6] writeback: wb_start_writeback() should use WB_SYNC_ALL for WB_REASON_SYNC

Jens Axboe <axboe@xxxxxx> · Tue, 22 Mar 2016 16:07:18 -0600

On 03/22/2016 04:04 PM, Dave Chinner wrote:
On Tue, Mar 22, 2016 at 03:40:28PM -0600, Jens Axboe wrote:
On 03/22/2016 03:34 PM, Dave Chinner wrote:
On Tue, Mar 22, 2016 at 11:55:16AM -0600, Jens Axboe wrote:
If you call sync, the initial call to wakeup_flusher_threads() ends up
calling wb_start_writeback() with reason=WB_REASON_SYNC, but
wb_start_writeback() always uses WB_SYNC_NONE as the writeback mode.
Ensure that we use WB_SYNC_ALL for a sync operation.

This seems wrong to me. We want background write to happen as
quickly as possible and /not block/ when we first kick sync.

It's not going to block. wakeup_flusher_threads() async queues
writeback work through wb_start_writeback().

The flusher threads block, not the initial wakeup. e.g. they will
now block waiting for data writeback to complete before writing the
inode. i.e. this code in __writeback_single_inode() is now triggered
by the background flusher:

         /*
          * Make sure to wait on the data before writing out the metadata.
          * This is important for filesystems that modify metadata on data
          * I/O completion. We don't do it for sync(2) writeback because it has a
          * separate, external IO completion path and ->sync_fs for guaranteeing
          * inode metadata is written back correctly.
          */
         if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync) {
                 int err = filemap_fdatawait(mapping);
                 if (ret == 0)
                         ret = err;
         }

Yeah, that's not ideal, for this case we'd really like something that 
WB_SYNC_ALL_NOWAIT...

It also changes the writeback chunk size in write_cache_pages(), so
instead of doing a bit of writeback from all dirty inodes, it tries
to write everything from each inode in turn (see
writeback_chunk_size()) which will further exacerbate the wait
above.

But that part is fine, if it wasn't for the waiting.

The latter blocking passes of sync use WB_SYNC_ALL to ensure that we
block waiting for all remaining IO to be issued and waited on, but
the background writeback doesn't need to do this.

That's fine, they can get to wait on the previously issued IO, which
was properly submitted with WB_SYNC_ALL.

Maybe I'm missing your point?

Making the background flusher block and wait for data makes it
completely ineffective in speeding up sync() processing...

Agree, we should not wait on the pages individually, we want them 
submitted and then waited on. And I suppose it's no differently than 
handling the normal buffered write from an application, which then gets 
waited on with fsync() or similar. So I can drop this patch, it'll work 
fine without it.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html