Neil Brown wrote:
I've had a look at this and asked around and I'm afraid there doesn't
seem to be an easy answer.
The most likely difference between 'before' and 'after' those patches
is that more pages are being written per call to generic_writepages in
the 'before' case. This would generally improve throughput,
particularly with RAID5 which would get more full stripes.
However, that is largely a guess: the bugs fixed by the patch could
interact in interesting ways with XFS (which decrements ->nr_to_write
itself), and it isn't immediately clear to me that more pages would be
written...
In any case, the 'after' code is clearly correct, so if throughput can
really be increased, the change should be somewhere else.
Thank you Neil for looking into this
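(If I follow the full-stripe point: on, say, a 4-drive RAID5 with 64 KiB
chunks, a full stripe holds 3 data chunks = 192 KiB = 48 pages of 4 KiB,
so any batch of fewer than 48 contiguous pages forces md into a
read-modify-write of the parity. The drive count and chunk size are just
example numbers on my part.)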
How can "writing less pages" be more correct than "writing more pages"?
I can see the first as an optimization to the second, however if this
reduces throughput then the optimization doesn't work...
Isn't it possible to "fix" it so to write more pages and still be
semantically correct?
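To check that I am reading the mechanism correctly, I wrote a tiny
userspace model of the ->nr_to_write accounting. This is only a sketch --
the struct and names loosely imitate struct writeback_control, and the
numbers are invented -- but it shows what I mean by "writing more pages":

/* Toy model of the ->nr_to_write accounting -- a sketch only,
 * not the real kernel code.  Build with: cc -o wb wb.c */
#include <stdio.h>

struct writeback_control {
	long nr_to_write;	/* budget: stop after this many pages */
};

/* Model of one generic_writepages() call: write pages until the
 * budget runs out or the inode has no more dirty pages. */
static long writepages(struct writeback_control *wbc, long dirty)
{
	long written = 0;

	while (dirty > 0 && wbc->nr_to_write > 0) {
		dirty--;
		written++;
		wbc->nr_to_write--;	/* a filesystem that also decrements
					 * this itself (as XFS does) shrinks
					 * the batch even further */
	}
	return written;
}

int main(void)
{
	long dirty = 1024;			/* dirty pages on one inode */
	struct writeback_control wbc = { .nr_to_write = 128 };

	/* Each call hands the block layer at most nr_to_write pages;
	 * smaller batches give md/raid5 fewer chances to assemble a
	 * full stripe of contiguous pages. */
	printf("wrote %ld pages in one batch (budget left: %ld)\n",
	       writepages(&wbc, dirty), wbc.nr_to_write);
	return 0;
}

With a budget of 128 the batch stops at 128 of the 1024 dirty pages, and
in my 48-pages-per-stripe example above the cut can land mid-stripe. Is
that roughly the effect you are describing?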
Thomas Fjellstrom wrote:
I don't suppose this causes "bursty" writeout like I've been seeing lately?
For some reason writes go full speed for a short while and then just stop
for a short time, which averages out to 2-4x slower than what the array
should be capable of.
I have definitely seen this bursty behaviour on 2.6.31.
It would be interesting to know what the CPUs are doing, or waiting
for, during those pauses. But I am not a kernel expert :-( how could
one check this?
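The only ideas I could come up with myself (and I am guessing here):
"echo w > /proc/sysrq-trigger", which I believe dumps the currently
blocked tasks to the kernel log, or polling /proc/<pid>/wchan for the
pdflush or mdX_raid5 thread to see which kernel function it sleeps in
during a pause. Something like this quick hack, perhaps:

/* Poll /proc/<pid>/wchan to see where a task is blocked.
 * A sketch only; stop it with Ctrl-C.
 * Build with: cc -o pollwchan pollwchan.c */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char path[64], wchan[128];
	size_t n;
	FILE *f;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path), "/proc/%s/wchan", argv[1]);

	for (;;) {
		f = fopen(path, "r");
		if (!f) {
			perror(path);
			return 1;
		}
		n = fread(wchan, 1, sizeof(wchan) - 1, f);
		wchan[n] = '\0';
		fclose(f);
		/* "0" means runnable; otherwise this is the kernel
		 * function the task is sleeping in. */
		printf("wchan: %s\n", wchan);
		usleep(500 * 1000);	/* sample twice a second */
	}
}

Would the symbol names that show up during the pauses tell us anything?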
Thank you