Roger Lucas wrote:
What drive configuration are you using (SCSI / ATA / SATA), what chipset is
providing the disk interface and what CPU are you running with?
3xSATA, Seagate 320 ST3320620AS, Intel 6600, ICH7 controller using the
ata-piix driver, with drive cache set to write-back. It's not obvious to
me why that matters, but if it helps you see the problem I'm glad to
provide the info. I'm seeing ~50MB/s on the raw drive, and 3x that on
plain stripes, so I'm assuming that either the RAID-5 code is not
working well or I haven't set it up optimally.
If it had been ATA, and you had two drives as master+slave on the same
cable, then they would be fast individually but slow as a pair.
RAID-5 is higher overhead than RAID-0/RAID-1 so if your CPU was slow then
you would see some degradation from that too.
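To make that overhead concrete, here is a minimal Python sketch of the extra work RAID-5 does on a write: it has to XOR the data chunks of a stripe into a parity chunk, which plain striping never does. The function names and the tiny 4-byte chunks are purely illustrative assumptions; the real md driver does this in the kernel with optimized XOR routines.

```python
from functools import reduce

def xor_parity(chunks):
    """Return the byte-wise XOR of a list of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def rebuild_missing(surviving_chunks, parity):
    """Recover one lost data chunk from the survivors plus the parity chunk."""
    return xor_parity(list(surviving_chunks) + [parity])

# One stripe on a hypothetical 4-disk array: 3 data chunks + 1 parity chunk.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# Pretend the disk holding data[1] died and rebuild its chunk.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])
```

The XOR itself is cheap per byte, but it has to touch every byte written, which is why a slow CPU could become the bottleneck.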
We have similar hardware here so I'll run some tests here and see what I
get...
Much appreciated. Since my last note I tried adding --bitmap=internal to
the array. Boy, is that a write performance killer. I will have the chart
updated in a minute, but write dropped to ~15MB/s with bitmap. Since
Fedora can't seem to shut the last array down cleanly, I get a rebuild
on every boot :-( So the array for the LVM has bitmap on, as I hate to
rebuild 1.5TB regularly. Have to do some compromises on that!
Hi Bill,
Here are the results of my tests here:
CPU: Intel Celeron 2.7GHz socket 775
MB: Abit LG-81 (Lakeport ICH7 chipset)
HDD: 4 x Seagate SATA ST3160812AS (directly connected to ICH7)
OS: Linux 2.6.16-xen
root@hydra:~# uname -a
Linux hydra 2.6.16-xen #1 SMP Thu Apr 13 18:46:07 BST 2006 i686 GNU/Linux
root@hydra:~#
All four disks are built into a RAID-5 array to provide ~420GB real storage.
Most of this is then used by the other Xen virtual machines but there is a
bit of space left on this server to play with in the Dom-0.
I wasn't able to run I/O tests with "dd" on the disks themselves as I don't
have a spare partition to corrupt, but hdparm gives:
root@hydra:~# hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 3296 MB in 2.00 seconds = 1648.48 MB/sec
Timing buffered disk reads: 180 MB in 3.01 seconds = 59.78 MB/sec
root@hydra:~#
Which is exactly what I would expect as this is the performance limit of the
disk. We have a lot of ICH7/ICH7R-based servers here and all can run the
disk at their maximum physical speed without problems.
root@hydra:~# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
468647808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
unused devices: <none>
root@hydra:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bigraid-root
10G 1.3G 8.8G 13% /
<snip>
root@hydra:~# vgs
VG #PV #LV #SN Attr VSize VFree
bigraid 1 13 0 wz--n- 446.93G 11.31G
root@hydra:~# lvcreate --name testspeed --size 2G bigraid
Logical volume "testspeed" created
root@hydra:~#
*** Now for the LVM over RAID-5 read/write tests ***
root@hydra:~# sync; time bash -c "dd if=/dev/zero bs=1024k count=2048
of=/dev/bigraid/testspeed; sync"
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 33.7345 seconds, 63.7 MB/s
real 0m34.211s
user 0m0.020s
sys 0m2.970s
root@hydra:~# sync; time bash -c "dd of=/dev/zero bs=1024k count=2048
if=/dev/bigraid/testspeed; sync"
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 38.1175 seconds, 56.3 MB/s
real 0m38.637s
user 0m0.010s
sys 0m3.260s
root@hydra:~#
During the above two tests, the CPU showed about 35% idle using "top".
*** Now for the file system read/write tests ***
(Reiser over LVM over RAID-5)
root@hydra:~# mount
/dev/mapper/bigraid-root on / type reiserfs (rw)
<snip>
root@hydra:~#
root@hydra:~# sync; time bash -c "dd if=/dev/zero bs=1024k count=2048
of=~/testspeed; sync"
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 29.8863 seconds, 71.9 MB/s
real 0m32.289s
user 0m0.000s
sys 0m4.440s
root@hydra:~# sync; time bash -c "dd of=/dev/null bs=1024k count=2048
if=~/testspeed; sync"
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 40.332 seconds, 53.2 MB/s
real 0m40.973s
user 0m0.010s
sys 0m2.640s
root@hydra:~#
During the above two tests, the CPU showed between 0% and 30% idle using
"top".
Just out of curiosity, I started the RAID-5 check process to see what load it
generated...
root@hydra:~# cat /sys/block/md0/md/mismatch_cnt
0
root@hydra:~# echo check > /sys/block/md0/md/sync_action
root@hydra:~# cat /sys/block/md0/md/sync_action
check
root@hydra:~# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
468647808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
[>....................] resync = 1.0% (1671552/156215936)
finish=101.8min speed=25292K/sec
unused devices: <none>
root@hydra:~#
Whilst the above test was running, the CPU load was between 3% and 7%, so
running the RAID array isn't that hard for it...
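For anyone wondering how the "finish=" figure in that mdstat output comes about, it appears to be simply the remaining blocks divided by the current speed. A rough Python sketch using the numbers above (the function name is mine, and I am assuming the counts are 1K blocks, matching the K/sec speed unit):

```python
def resync_eta_minutes(done_kib, total_kib, speed_kib_per_s):
    """Estimate minutes remaining for a check/resync at the current speed."""
    remaining_kib = total_kib - done_kib
    return remaining_kib / speed_kib_per_s / 60.0

# Figures copied from the /proc/mdstat output above.
done, total, speed = 1671552, 156215936, 25292

# The total is the per-device size: one third of the 468647808-block array.
per_device_matches = (total * 3 == 468647808)

eta = resync_eta_minutes(done, total, speed)
print(f"about {eta:.1f} min remaining at {speed} K/sec")
```

That comes out at about 101.8 minutes, in line with what mdstat reported, so the check is walking each member disk at roughly 25MB/s.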
-------------------------
So, using a 4-disk RAID-5 array with an ICH7, I get about 64MB/s write and
54MB/s read performance. The processor is about 35% idle whilst the test is
running. I'm not sure why this is; I would have expected the processor to be
0% idle, as it should be hitting the hard disk as fast as possible and
otherwise waiting for it....
If I run over Reiser, the processor load changes a lot more, varying between
0% and 35% idle. It also takes a couple of seconds after the test has
finished before the load drops down to zero on the write test, so I suspect
these results are basically the same as the raw LVM-over-RAID5 performance.
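One way to check that suspicion is to divide the bytes written by the "real" time from `time`, which includes the trailing sync, instead of using dd's own elapsed time. A small Python sketch using only the figures from the write runs above (the helper name and dictionary layout are mine):

```python
BYTES_WRITTEN = 2147483648  # 2048 x 1 MiB blocks, as reported by dd

def mb_per_s(nbytes, seconds):
    """Throughput in decimal MB/s, the same unit dd prints."""
    return nbytes / seconds / 1e6

# (seconds reported by dd, "real" seconds reported by time) for each write test
runs = {
    "LVM write": (33.7345, 34.211),
    "Reiser write": (29.8863, 32.289),
}

rates = {
    name: (mb_per_s(BYTES_WRITTEN, dd_s), mb_per_s(BYTES_WRITTEN, real_s))
    for name, (dd_s, real_s) in runs.items()
}

for name, (dd_rate, wall_rate) in rates.items():
    print(f"{name}: dd says {dd_rate:.1f} MB/s, wall clock says {wall_rate:.1f} MB/s")
```

On these numbers the Reiser write drops from dd's 71.9 MB/s to roughly 66.5 MB/s once the sync is counted, against roughly 62.8 MB/s for the raw LV, so the gap between the two narrows considerably.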
Summary - with 4 disks it is a little faster than the 37.5 MB/s that you
have with just the three, but it is WAY off the theoretical target of
3 x 60MB/s = 180MB/s that could be expected from a 4-disk RAID-5 array
(each stripe carries three disks' worth of data plus one of parity).
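To put a number on "WAY off", here is a quick Python sketch comparing the measured LVM-over-RAID-5 rates with that target. It assumes the idealized model that an N-disk RAID-5 can stream at (N-1) times the single-disk rate, which ignores bus, CPU and parity-update costs, so treat it as an upper bound rather than a promise. The function name is made up for illustration.

```python
def raid5_streaming_target(n_disks, per_disk_mb_s):
    """Idealized sequential rate: one disk's worth of each stripe is parity."""
    return (n_disks - 1) * per_disk_mb_s

target = raid5_streaming_target(4, 60.0)

# dd results from the LVM-over-RAID-5 tests above, in MB/s.
measured = {"write": 63.7, "read": 56.3}

efficiency = {op: 100.0 * rate / target for op, rate in measured.items()}
for op, pct in efficiency.items():
    print(f"{op}: {measured[op]:.1f} MB/s is {pct:.0f}% of the {target:.0f} MB/s target")
```

Both directions land at roughly a third of the target, which is about what a single disk delivers on its own.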
On the flip side, the performance is good enough for me, so it is not
causing me a problem, but it seems that there should be a performance boost
available somewhere!
Best regards,
Roger
Thank you so much for verifying this. I do keep enough room on my drives
to run tests by creating whatever I need, but the point is
clear: with N drives striped the transfer rate is N x base rate of one
drive; with RAID-5 it is about the speed of one drive, suggesting that
the md code serializes writes.
If true, BOO, HISS!
Can you explain and educate us, Neal? This looks like terrible performance.
--
Bill Davidsen
He was a full-time professional cat, not some moonlighting
ferret or weasel. He knew about these things.