I've started looking into the slow unmount issue again, and dug a bit
deeper into the writeback code to find the root cause.

All inode writeback is performed by writeback_inodes_wb, which has the
following callgraph (annotated with a few important facts):

writeback_inodes_wb
  <- writeback_inodes_wbc
       <- bdi_flush_io                  (WB_SYNC_NONE, no sb)
       <- balance_dirty_pages           (WB_SYNC_NONE, no sb)
  <- wb_writeback
       <- wb_check_old_data_flush       (WB_SYNC_NONE, no sb)
       <- wb_do_writeback
            <- bdi_writeback_task \
                 <- bdi_start_fn   | take bdi_work item from list
            <- bdi_forker_task    /

So we either get a bdi_work item submitted by bdi_queue_work, or we do
WB_SYNC_NONE I/O ourselves without specifying a superblock.  For
bdi_queue_work the same annotated callchain looks like:

bdi_queue_work
  <- bdi_sync_writeback                 (WB_SYNC_ALL, mandatory sb, onstack)
       <- sync_inodes_sb                (..)
  <- bdi_alloc_queue_work
       <- bdi_start_writeback           (WB_SYNC_NONE, optional sb)
            <- balance_dirty_pages      (.., no sb)
            <- laptop_mode_timer_fn     (.., no sb)
            <- writeback_inodes_sb      (.., mandatory sb)
       <- bdi_writeback_all             (WB_SYNC_NONE, optional sb)
            <- wakeup_flusher_threads   (.., no sb)

So in general we can say that WB_SYNC_NONE without a superblock is the
normal case.  And if we look at the callers of the functions that do take
a superblock, we see that this is basically sync_filesystem, which always
holds s_umount.  So instead of adding hacks for when s_umount is held, we
can assume that it is always held if a superblock is specified, and fix up
the two callers inside filesystems that crept in recently to also hold it.
With that we can replace various ad-hoc checks with a very clear scheme
that only checks for the on-stack case and whether we have a superblock or
not, after switching all writeback that has a superblock to submit the I/O
on-stack.
After my series the annotated callgraph for bdi_queue_work is the
following:

bdi_queue_work
  <- bdi_queue_work_onstack             (WB_SYNC_ALL/NONE, mandatory sb, onstack)
       <- sync_inodes_sb                (WB_SYNC_ALL, ..)
       <- writeback_inodes_sb           (WB_SYNC_NONE, ..)
  <- bdi_alloc_queue_work
       <- bdi_start_writeback           (WB_SYNC_NONE, no sb)
       <- bdi_start_background_writeback (WB_SYNC_NONE, no sb)
       <- wakeup_flusher_threads        (WB_SYNC_NONE, no sb)

So it's a lot simpler and, just as importantly, flatter.  But when we look
at how we use the on-stack / sb-specifying callers I'm not sure this is
the optimal scheme yet.  We're being called from synchronous callers (we
need to wait for completion of the first WB_SYNC_NONE pass later on, too),
so there really is no need to hand this work off to the flusher threads.
In the end it might be much simpler to just issue this I/O from the caller
context and remove the code for the on-stack submissions, but that's left
for a later patch.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel"
in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html