On Thu October 29 2009, Thomas Fjellstrom wrote:
> On Thu October 29 2009, Neil Brown wrote:
> > On Sunday October 18, markdelfman@xxxxxxxxxxxxxx wrote:
> > > We have tracked the performance drop to the attached two commits in
> > > 2.6.28.6. The performance never fully recovers in later kernels, so
> > > I'm presuming that the change in the write cache is still affecting
> > > MD today.
> > >
> > > The problem for us is that although we have slowly tracked it down,
> > > we have no understanding of linux at this level and simply wouldn't
> > > know where to go from this point.
> > >
> > > Considering this seems to affect only MD and not hardware-based RAID
> > > (in our tests), I thought that this would be an appropriate place to
> > > post these patches and findings.
> > >
> > > There are 2 patches which impact MD performance via a filesystem:
> > >
> > > a) commit 66c85494570396661479ba51e17964b2c82b6f39 - write-back: fix
> > > nr_to_write counter
> > > b) commit fa76ac6cbeb58256cf7de97a75d5d7f838a80b32 - Fix page
> > > writeback thinko, causing Berkeley DB slowdown
> >
> > I've had a look at this and asked around and I'm afraid there doesn't
> > seem to be an easy answer.
> >
> > The most likely difference between 'before' and 'after' those patches
> > is that more pages are being written per call to generic_writepages in
> > the 'before' case. This would generally improve throughput,
> > particularly with RAID5, which would get more full stripes.
> >
> > However, that is largely a guess, as the bugs which were fixed by the
> > patches could interact in interesting ways with XFS (which decrements
> > ->nr_to_write itself), and it isn't immediately clear to me that more
> > pages would be written...
> >
> > In any case, the 'after' code is clearly correct, so if throughput can
> > really be increased, the change should be somewhere else.
> >
> > What might be useful would be to instrument write_cache_pages to count
> > how many pages were written each time it is called. You could either
> > print this number out every time or, if that creates too much noise,
> > print out an average every 512 calls or similar.
> >
> > Seeing how this differs with and without the patches in question could
> > help understand what is going on and provide hints for how to fix it.
>
> I don't suppose this causes "bursty" writeout like I've been seeing
> lately? For some reason writes go full speed for a short while and then
> just stop for a short time, which averages out to 2-4x slower than what
> the array should be capable of.

At the very least, 2.6.26 doesn't have this issue. Speeds are lower than
I was expecting (350MB/s write, 450MB/s read), but nowhere near as bad
as later kernels, and there is no "bursty" behaviour; speeds are fairly
constant throughout testing.

> > NeilBrown

--
Thomas Fjellstrom
tfjellstrom@xxxxxxx
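
For anyone who wants to try Neil's suggestion, here is a minimal sketch
of what that instrumentation could look like in mm/page-writeback.c.
The helper name (wcp_account) and its counters are invented for this
sketch; write_cache_pages() and wbc->nr_to_write are the real kernel
names, and write_cache_pages() decrements wbc->nr_to_write once per
page it writes, so the entry/exit difference is the page count for that
call:

/*
 * Hypothetical helper: call at the end of write_cache_pages() with the
 * values wbc->nr_to_write had on entry and on exit.  The difference is
 * the number of pages written by that call.  Prints a running average
 * every 512 calls to keep the log noise down.
 */
static void wcp_account(long nr_before, long nr_after)
{
	static DEFINE_SPINLOCK(wcp_lock);
	static unsigned long wcp_calls, wcp_pages;
	unsigned long flags;

	spin_lock_irqsave(&wcp_lock, flags);
	wcp_pages += nr_before - nr_after;
	if (++wcp_calls % 512 == 0) {
		printk(KERN_INFO
		       "write_cache_pages: avg %lu pages/call over last 512 calls\n",
		       wcp_pages / 512);
		wcp_pages = 0;
	}
	spin_unlock_irqrestore(&wcp_lock, flags);
}

To wire it up, save wbc->nr_to_write into a local variable at the top
of write_cache_pages() and call wcp_account(saved, wbc->nr_to_write)
just before it returns. Comparing the printed averages with and without
the two commits above should show whether fewer pages per call is
really what is costing RAID5 its full-stripe writes.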