Re: Seeking advice on Loop-AES performance with RAID5/RAID6

George Koss <george_koss <at> hotmail.com> writes:
> 
> On my penultimate file server, I used 5 Seagate 250G Sata drives with an AMD 
> 3800+ AM2 CPU, and this got fairly (I thought) decent performance.  I did a 
> fair amount of benchmarking, and got 300 MB/s writes and 200MB/s reads with 
> a 5 disk raid0 and XFS.  Moving to RAID6 and Reiser3.6 dropped the 
> performance to 71 MB/s read and 69 MB/s write.  (All benchmarks with 
> Bonnie++ 1.03 using 4G data chunks).

I have assembled a couple of low-cost software RAID SATA Linux fileservers over
the past 2-3 years. In my experience the most important factors determining the
max achievable read/write filesystem throughput are the SATA chipset used and
the bus it sits on (PCI, PCI-X, PCI-E, or directly integrated into the
motherboard chipset, etc).

The perf numbers you cite for RAID0 are good (200-300 MB/s). Regarding RAID6 I
can't comment since I have never used it, but with RAID5 you should see much
better numbers. For reference, one of the servers I have here is an old dual
Opteron 244 (1.8 GHz), with two 4-port sii3124 (sata_sil24.ko) controllers on a
100 MHz 64-bit PCI-X bus, five SATA 1.5-Gbps Seagate 250 GB drives, RAID5+jfs
on top of all of them, and kernel 2.6.16.4.

I see 100-120 MB/s writes and 215-220 MB/s reads (just use dd and vmstat for 
benchmarking, no need to resort to Bonnie++).

You should see similar RAID5 numbers on your penultimate file server (since you 
get 200-300 MB/s with RAID0). My intuition tells me that RAID6 is responsible 
for the poor throughput (easy to determine with "vmstat 1" and "iostat -x 1").
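
For example, something along these lines while a write test is running (the
target path is only a placeholder, pick any file on the array; run the monitors
in separate terminals):

  $ dd if=/dev/zero of=/path/on/the/array/testfile bs=64k
  $ vmstat 1      # watch the bo column and the us/sy/wa CPU columns
  $ iostat -x 1   # watch per-disk %util

If vmstat shows a lot of CPU time in sy while iostat shows the member disks far
from saturated, the bottleneck is likely the parity (or, later, loop-AES)
computation; if the disks are pegged, the problem is more likely on the
controller/driver side.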

> Now I've built a totally new machine using an ASUS M2N-E motherboard with 
> the NForce 570 Ultra chip set.  It has 2 GBytes of PC5300 ECC DDR2 memory, 
> and six Seagate 400 GB SATA2 drives.  The CPU is a AMD 4200+ X2 dual core 
> AM2 with dual 512K L2 caches.
> [...]
> But it all went to hell when I tried the full stack on the six drive RAID5 
> or RAID6 array.  Performance for writing fell to a pitiful 15 Mbytes/sec.   
> [...]
> The best write speed I've gotten so far is 17 
> Mbyte/s with RAID6 and 64K chunksizes on a reiser3.6 filesystem.
> RAID5 performance is terrible when I stack loop-AES on top of it.   Without 
> loop-AES, I'm getting 99 Mbytes/sec Read, and 105 Mbytes Write with RAID5.   
> This is the performance level I want to hit, since it's just about right 
> when transferring data with Gigabit ethernet.

First, reading at 100 MB/s from a 6-drive SATA2 RAID5 is not what I would call
great performance. There is no need to benchmark loop-AES on top of that: you
already know there is a problem with the RAID performance.

> My next step is to install Ubuntu AMD64 server on a spare partition and see 
> if getting rid of X and KDE will help at all.  I've got about 121 tasks 
> running with Kubuntu desktop, and only 64 with the Ubuntu server on the 
> other machine.

If these 121 tasks are mostly idle, they shouldn't impact disk throughput in 
any way whatsoever.

> Anybody got any ideas on how to find the problem?  I suspect that if I shell 
> out $500 for an ARC-1220 raid controller, I can get the performance back up 
> to 50 or 60 Mbytes/sec, but this seems to defeat the purpose of building a 
> low cost Linux file server.

My suggestion is to benchmark every layer of the storage subsystem step by step
with dd, to find out exactly which one exhibits a performance problem. Each
time, run "vmstat 1" and "iostat -x 1" in parallel and post the output of these
commands to the ML.

1. Use "dd bs=X" (X between 4k and 64k) to read directly from and write to a
single partition of one of the drives (writing this way of course destroys the
data on that partition), eg:
  $ dd if=/dev/zero of=/dev/sda2 bs=8k
  $ dd if=/dev/sda2 of=/dev/null bs=8k

2. Do the same thing with multiple instances of dd running in parallel to 
read/write 2, 3..., N disks at the same time.
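For example (a sketch with placeholder device names; the same idea works for
the write direction with if=/dev/zero and of=/dev/sdX2):
  $ for d in sda2 sdb2 sdc2 sdd2; do
      dd if=/dev/$d of=/dev/null bs=8k &
    done; wait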

3. This time create the RAID array and run dd on top of it, eg:
  $ dd if=/dev/zero of=/dev/md0 bs=8k
  $ dd if=/dev/md0 of=/dev/null bs=8k
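To (re)create the array with a different RAID level or chunk size for
comparison, something like this (a sketch; adjust the level, chunk size and
device list to your setup), then let the initial resync finish before
benchmarking:
  $ mdadm --create /dev/md0 --level=5 --chunk=64 --raid-devices=6 /dev/sd[a-f]2
  $ cat /proc/mdstat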

4. Create a filesystem on top of the array, and run dd to write and read a 
file, eg:
  $ dd if=/dev/zero of=/mnt/foo bs=8k
  $ dd if=/mnt/foo of=/dev/null bs=8k
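If the filesystem still has to be created and mounted first, for instance (XFS
here is only an example, the same applies to reiserfs or jfs):
  $ mkfs.xfs /dev/md0
  $ mount /dev/md0 /mnt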

Each time, experiment with various bs=X values, and report the output of
vmstat/iostat.
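
A quick way to sweep block sizes is a small loop like this (a sketch; the count
value is only an example, and the numbers that matter are the steady-state
bi/bo rates shown by "vmstat 1" in another terminal rather than dd's own
summary, which can be skewed by the page cache):

  $ for bs in 4k 8k 16k 32k 64k; do
      echo "### bs=$bs"
      dd if=/dev/zero of=/dev/md0 bs=$bs count=200000
    done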

My feeling is that the SATA controller built into the Nforce 570 chipset is
simply not very efficient.

IMHO the best controllers you can buy today with 4 ports or more are the
AHCI-based ones and the sii3124. They are cheap (less than $70, or integrated
into motherboards), perform as well as the most expensive controllers (always
able to sustain the max read/write throughput with 1 disk on every port), and
offer NCQ support, hotplug support, and production-quality Linux drivers.

-marc


-
Linux-crypto:  cryptography in and on the Linux system
Archive:       http://mail.nl.linux.org/linux-crypto/

