Re: [PATCH RFC] mm: Implement balance_dirty_pages() through waiting for flusher thread

Jan Kara <jack@xxxxxxx> · Wed, 23 Jun 2010 15:15:57 +0200



On Wed 23-06-10 08:29:32, Dave Chinner wrote:
> On Tue, Jun 22, 2010 at 04:02:59PM +0200, Jan Kara wrote:
> > > 2) most writeback will be submitted by one per-bdi-flusher, so no worry
> > >    of cache bouncing (this also means the per CPU counter error is
> > >    normally bounded by the batch size)
> >   Yes, writeback will be submitted by one flusher thread but the question
> > is rather where the writeback will be completed. And that depends on which
> > CPU that particular irq is handled. As far as my weak knowledge of HW goes,
> > this very much depends on the system configuration (i.e., irq affinity and
> > other things).
> 
> And how many paths to the storage you are using, how threaded the
> underlying driver is, whether it is using MSI to direct interrupts to
> multiple CPUs instead of just one, etc.
> 
> As we scale up we're more likely to see multiple CPUs doing IO
> completion for the same BDI because the storage configs are more
> complex in high end machines. Hence IMO preventing cacheline
> bouncing between submission and completion is a significant
> scalability concern.
  Thanks for the details. I'm wondering whether we could assume that although
IO completion can run on several CPUs, it will still be a fairly limited
number of CPUs. If that is the case, we could implement a per-cpu
counter that additionally tracks the number of CPUs modifying the counter
(the number of CPUs would get zeroed in ???_counter_sum). This way the
number of atomic operations won't be much higher (only one atomic inc when
a CPU updates the counter for the first time), and if only a few CPUs
modify the counter, we would be able to bound the error much better: by the
number of CPUs that actually touched the counter times the batch size,
rather than by the number of all CPUs times the batch size.
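
  To make the idea concrete, here is a rough userspace sketch, not a patch.
It is single-threaded and models CPUs as array slots, so it ignores the
preemption and locking issues a real per-cpu counter has to handle. All
names (tracked_counter, tc_add, tc_sum, tc_max_error) and the NR_CPUS and
BATCH values are made up for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 64	/* hypothetical number of possible CPUs */
#define BATCH   32	/* hypothetical per-CPU batch size */

struct tracked_counter {
	atomic_llong count;	/* global, approximate value */
	atomic_int nr_active;	/* CPUs that modified the counter since last sum */
	struct {
		long delta;	/* local delta not yet folded into count */
		bool active;	/* already accounted in nr_active? */
	} cpu[NR_CPUS];
};

static void tc_init(struct tracked_counter *c)
{
	atomic_init(&c->count, 0);
	atomic_init(&c->nr_active, 0);
	for (int i = 0; i < NR_CPUS; i++) {
		c->cpu[i].delta = 0;
		c->cpu[i].active = false;
	}
}

/* Add 'amount' on behalf of 'cpu' (caller passes a valid CPU index). */
static void tc_add(struct tracked_counter *c, int cpu, long amount)
{
	/* The only extra atomic op: once per CPU between two sums. */
	if (!c->cpu[cpu].active) {
		c->cpu[cpu].active = true;
		atomic_fetch_add(&c->nr_active, 1);
	}

	c->cpu[cpu].delta += amount;
	/* Fold the local delta into the global count once it reaches BATCH. */
	if (c->cpu[cpu].delta >= BATCH || c->cpu[cpu].delta <= -BATCH) {
		atomic_fetch_add(&c->count, c->cpu[cpu].delta);
		c->cpu[cpu].delta = 0;
	}
}

/* Cheap, approximate read. */
static long long tc_read(struct tracked_counter *c)
{
	return atomic_load(&c->count);
}

/*
 * Upper bound on |exact - tc_read()|: every active CPU holds a local
 * delta of at most BATCH - 1 in absolute value, inactive CPUs hold 0.
 */
static long long tc_max_error(struct tracked_counter *c)
{
	return (long long)atomic_load(&c->nr_active) * (BATCH - 1);
}

/* Exact sum; folds all local deltas and zeroes the number of active CPUs. */
static long long tc_sum(struct tracked_counter *c)
{
	for (int i = 0; i < NR_CPUS; i++) {
		atomic_fetch_add(&c->count, c->cpu[i].delta);
		c->cpu[i].delta = 0;
		c->cpu[i].active = false;
	}
	atomic_store(&c->nr_active, 0);
	return atomic_load(&c->count);
}
```

With two CPUs completing IO, tc_max_error() gives 2 * (BATCH - 1) in this
sketch, whereas without the tracking we would have to assume
NR_CPUS * (BATCH - 1).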

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR