At 12:30 +0200 2/10/07, matthias platzer wrote:
What I did to work around them was basically to switch to XFS for
everything except / (3ware say their cards are fast, but only on
XFS) AND to use a very low nr_requests for every block device on the
3ware card.
Hi Matthias,
Thanks for this. In my CentOS 5 tests the default nr_requests turned
out to be 128, rather than the 8192 of CentOS 4.5. I'll have a go at
reducing it still further.
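For anyone wanting to try the same thing, it's just a sysfs write
(sdb here is only an example; use whatever device node the 3ware
unit shows up as, and run it as root):

    # show the current queue depth for the device (128 by default on CentOS 5)
    cat /sys/block/sdb/queue/nr_requests
    # try a much lower value; takes effect immediately, but is lost on reboot
    echo 32 > /sys/block/sdb/queue/nr_requests

If a lower value helps, the echo can go in rc.local to make it stick
across reboots.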
If you can, you could also try _not_ putting the system disks on the
3ware card, because in addition the 3ware driver/card gives writes
priority.
I've noticed that kicking off a simultaneous pair of dd reads and
writes from/to the RAID 1 array shows that very clearly: only with
cfq as the elevator did reads get any kind of look-in. Sadly, I'm
not able to separate the system disks off, as there's no on-board
SATA on the motherboard nor any room for additional internal disks;
the original intention was to provide the resilience of hardware
RAID 1 for the entire machine.
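For the record, the test was roughly along these lines (the device
name and scratch path are examples only, and the write will happily
chew up a few GB, so point it somewhere safe):

    # heavy sequential write in the background (~4 GB to scratch space)
    dd if=/dev/zero of=/scratch/ddtest bs=1M count=4096 &
    # concurrent sequential read from the array
    dd if=/dev/sda of=/dev/null bs=1M count=4096
    # list the available elevators (the current one is in brackets),
    # then switch to cfq on the fly and re-run the pair to compare
    cat /sys/block/sda/queue/scheduler
    echo cfq > /sys/block/sda/queue/scheduler

With anything other than cfq, the read's throughput collapsed while
the write ran.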
People suggested the unresponsive system behaviour is because the
CPU hangs in iowait for the writes, and reads of the system binaries
then can't happen until the writes are done, so the binaries should
be on another io path.
Yup, that certainly seems to be what's happening. Wish I had another io path...
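It's easy enough to watch happening with the sysstat tools while a
big write runs; the iowait numbers tell the story:

    # per-device utilisation plus CPU %iowait, sampled every 2 seconds
    iostat -x 2
    # or watch the 'wa' column here
    vmstat 2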
All this seems to be a symptom of a very complex issue involving
kernel bugs/bad drivers/... and it seems to be worst on an
AMD/3ware combination.
Here is another link:
http://bugzilla.kernel.org/show_bug.cgi?id=7372
Ouch - thanks for that link :-( Looks like I'm screwed big time.
S.