Martin Knoblauch wrote:
We are experiencing responsiveness problems (and higher than expected
load) when the system is under combined memory+network+disk-IO stress.
First, I'd check the paging with `vmstat 5` ... if you see consistently
high si (memory swapped in from disk, KB/s), you need more physical
memory; no amount of dinking with vm parameters can change that.
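To make that concrete, here's a minimal sketch that flags swap-in spikes with awk. The sample output below is fabricated for illustration (in practice you'd pipe the live command, e.g. `vmstat 5 | awk ...`); the column layout matches the usual vmstat header, where si is the 7th field:

```shell
# Fabricated sample of `vmstat 5` output, for illustration only.
sample='procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0 204800  12345   6789 101112    0    0   120    80  200  300  5  2 90  3  0
 3  2 215000   8000   6500  98000  850  430  1500  1100  400  600 12  8 45 35  0'

# si (memory swapped in, KB/s) is field 7; warn whenever it exceeds 100.
echo "$sample" | awk 'NR > 2 && $7 > 100 {print "high swap-in:", $7, "KB/s"}'
```

The `NR > 2` skips the two header lines; the threshold of 100 is an arbitrary illustration, not a hard rule.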
If you're not seeing excessive paging, I'd be inclined to monitor the
disk IO with `iostat -x 5`... if the avgqu-sz and/or await on any
device is high, you need to balance your disk IO across more physical
devices and/or more channels. await = 500 means physical disk IO
requests are taking an average of 500 ms (0.5 seconds) to complete. If
many processes are blocked waiting on disk IO, you'll see high load
averages even though CPU usage is fairly low.
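A hedged sketch of the same threshold check for iostat, again on fabricated sample output. Since the extended field layout varies between sysstat versions, the awk locates the await column by name from the header rather than hard-coding a position:

```shell
# Fabricated sample of `iostat -x` output, for illustration only.
sample='Device:  rrqm/s  wrqm/s   r/s   w/s   rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
sda         0.10    2.30  5.00 12.00   80.00  240.00    18.80     4.20 510.00   3.10  55.00
sdb         0.00    0.50  1.00  2.00   10.00   20.00    10.00     0.05   4.00   1.00   2.00'

# Find the await column in the header, then flag devices averaging > 100 ms.
echo "$sample" | awk '
NR == 1 { for (i = 1; i <= NF; i++) if ($i == "await") col = i; next }
$col > 100 { print $1, "await", $col, "ms" }'
```

In this sample only sda gets flagged (510 ms), which is exactly the "balance IO across more devices" case described above.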
iostat is in the yum package sysstat (not installed by default in most
configs); vmstat is in procps (generally installed by default). On
both of these commands, ignore the first block of output; that's the
average since boot, which is generally meaningless. The 2nd and
successive outputs cover the interval specified (5 seconds in my
examples above).
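A trivial, portable way to confirm both tools are actually installed before you need them (nothing CentOS-specific here):

```shell
# Report any of the monitoring tools that are missing from PATH.
for tool in vmstat iostat; do
  command -v "$tool" > /dev/null 2>&1 || echo "$tool not found; install procps/sysstat"
done
```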
On our database servers, which experience very high disk IO loads, we
often use 4 separate RAIDs... / and the other normal system volumes are
partitions on a raid1 (typically 2 x 36GB 15k scsi or sas), then the
database itself will be spread across 3 volumes /u10 /u11 /u12, which
are each RAID 1+0 built from 4 x 72GB 15k scsi/sas or FC SAN volumes.
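For capacity planning, keep in mind that RAID 1+0 mirrors every stripe, so usable space is half the raw total; a quick sanity check in shell arithmetic, using the disk count and size from the example above:

```shell
disks=4; size_gb=72
raw=$(( disks * size_gb ))
usable=$(( raw / 2 ))   # RAID 1+0: mirroring halves the raw capacity
echo "raw: ${raw} GB, usable: ${usable} GB"
```

So each of /u10, /u11, /u12 nets roughly 144 GB from its 4 x 72 GB set (before filesystem overhead).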
We'll always use RAID controllers with battery-backed hardware
write-back cache for the database volumes, as this hugely accelerates
'commits'. Note, we don't use mysql; I have no idea whether it's capable
of taking advantage of configurations like this, but postgresql and
oracle certainly are. The database administrators will spend hours
poring over IO logs and database statistics in order to better optimize
the distribution of tables and indices across the available tablespaces.
Under these sorts of heavy concurrent random-access patterns, SATA and
software RAID just don't cut it, regardless of how good their sequential
benchmarks may look.
Please CC me on replies, as I am only getting the digest.
spamtrap@xxxxxxxxxxxx ??!? no thanks.
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos