Re: IO less throttling and cgroup aware writeback (Was: Re: [Lsf] Preliminary Agenda and Activities for LSF)

Dave Chinner <david@xxxxxxxxxxxxx> · Fri, 1 Apr 2011 11:55:22 +1100

On Thu, Mar 31, 2011 at 07:43:27PM -0400, Chris Mason wrote:
> Excerpts from Dave Chinner's message of 2011-03-31 18:14:25 -0400:
> > On Thu, Mar 31, 2011 at 10:34:03AM -0400, Chris Mason wrote:
> > > Excerpts from Vivek Goyal's message of 2011-03-31 10:16:37 -0400:
> > > > On Thu, Mar 31, 2011 at 09:20:02AM +1100, Dave Chinner wrote:
> > > > > There are plans to move the bdi-flusher threads to work queues, and
> > > > > once that is done all your concerns about blocking and parallelism
> > > > > are pretty much gone because it's trivial to have multiple writeback
> > > > > works in progress at once on the same bdi with that infrastructure.
> > > > 
> > > > Will this essentially not nullify the advantage of IO less throttling?
> > > > I thought that we did not want to have multiple threads doing writeback
> > > > at the same time, to reduce the number of seeks and achieve better throughput.
> > > 
> > > Work queues alone are probably not appropriate, at least for spinning
> > > storage.  It will introduce seeks into what would have been
> > > sequential writes.  I had to make the btrfs worker thread pools after
> > > having a lot of trouble cramming writeback into work queues.
> > 
> > That was before the cmwq infrastructure, right? cmwq changes the
> > behaviour of workqueues in such a way that they can simply be
> > thought of as having a thread pool of a specific size....
> > 
> > As a strict translation of the existing one flusher thread per bdi,
> > then only allowing one work at a time to be issued (i.e. workqueue
> > concurrency of 1) would give the same behaviour without having all
> > the thread management issues. i.e. regardless of the writeback
> > parallelism mechanism we have the same issue of managing writeback
> > to minimise seeking. cmwq just makes the implementation far simpler,
> > IMO.
> > 
> > As to whether that causes seeks or not, that depends on how we are
> > driving the concurrent works/threads. If we drive a concurrent work
> > per dirty cgroup that needs writing back, then we achieve the
> > concurrency needed to make the IO scheduler appropriately throttle
> > the IO. For the case of no cgroups, then we still only have a single
> > writeback work in progress at a time and behaviour is no different
> > to the current setup. Hence I don't see any particular problem with
> > using workqueues to achieve the necessary writeback parallelism that
> > cgroup aware throttling requires....
> 
> Yes, as long as we aren't trying to shotgun style spread the
> inodes across a bunch of threads, it should work well enough.  The trick
> will just be making sure we don't end up with a lot of inode
> interleaving in the delalloc allocations.

That's a problem for any concurrent writeback mechanism as it passes
through the filesystem. It comes down to filesystems also needing to
have either concurrency- or cgroup-aware allocation mechanisms. It's
just another piece of the puzzle, really.
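
To make the interleaving problem concrete, here is a toy model in Python.
It is only a sketch and has nothing to do with real filesystem code: the
function names, the one-block-per-request granularity and the region size
are all assumptions made for illustration. Two writers alternate
allocation requests; with a shared allocation cursor each file ends up
as many small extents, while per-writer regions keep each file
contiguous.

```python
def allocate(order, shared_cursor=True, region_size=1000):
    """Hand out one block per request, in the order the requests arrive.

    order         -- sequence of writer names, one entry per request
    shared_cursor -- True: all writers take blocks from one cursor
                     False: each writer gets its own region (hypothetical
                     stand-in for a concurrency-aware allocator)
    Returns {writer: [block numbers in allocation order]}.
    """
    blocks = {}
    cursor = 0
    per_writer_cursor = {}
    for writer in order:
        if shared_cursor:
            block = cursor
            cursor += 1
        else:
            if writer not in per_writer_cursor:
                # First request from this writer: give it a fresh region.
                per_writer_cursor[writer] = len(per_writer_cursor) * region_size
            block = per_writer_cursor[writer]
            per_writer_cursor[writer] += 1
        blocks.setdefault(writer, []).append(block)
    return blocks


def count_extents(block_list):
    """Count runs of consecutive block numbers."""
    extents = 0
    prev = None
    for b in block_list:
        if prev is None or b != prev + 1:
            extents += 1
        prev = b
    return extents


# Two writers whose requests arrive strictly interleaved.
interleaved = ["A", "B"] * 4
shared = allocate(interleaved, shared_cursor=True)
isolated = allocate(interleaved, shared_cursor=False)

print(count_extents(shared["A"]), count_extents(isolated["A"]))
```

With the shared cursor, writer A gets blocks 0, 2, 4, 6 (four extents);
with its own region it gets 0-3 (one extent).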

In the case of XFS, cgroup awareness could be as simple as
associating each cgroup with a specific allocation group and
keeping each cgroup as isolated as possible. There is precedent for
doing this in XFS - the filestreams allocator makes these sorts of
dynamic associations on a per-directory basis.
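
As a rough illustration of that kind of dynamic association, the Python
sketch below maps each cgroup to an allocation group (AG) the first time
it is seen, picking the AG with the fewest cgroups already attached. This
is not the filestreams code and not an XFS interface; the class name, the
least-loaded policy and the release step are assumptions chosen only to
show the shape of the idea.

```python
class CgroupAgMap:
    """Hypothetical cgroup -> allocation group association table."""

    def __init__(self, ag_count):
        if ag_count < 1:
            raise ValueError("need at least one allocation group")
        self.ag_count = ag_count
        self.assigned = {}            # cgroup id -> AG index
        self.users = [0] * ag_count   # number of cgroups attached to each AG

    def ag_for(self, cgroup):
        """Return the AG for this cgroup, creating the association on first use."""
        if cgroup not in self.assigned:
            # Assumed policy: least-loaded AG, lowest index wins ties.
            ag = min(range(self.ag_count), key=lambda i: self.users[i])
            self.assigned[cgroup] = ag
            self.users[ag] += 1
        return self.assigned[cgroup]

    def release(self, cgroup):
        """Drop the association, e.g. once the cgroup has no dirty data left."""
        ag = self.assigned.pop(cgroup, None)
        if ag is not None:
            self.users[ag] -= 1


ag_map = CgroupAgMap(ag_count=4)
for name in ["a", "b", "c", "d", "e"]:
    print(name, ag_map.ag_for(name))
```

With more cgroups than AGs some sharing is unavoidable, which is why the
goal is "as isolated as possible" rather than strict isolation.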

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx