On 03/30/2011 04:08 AM, Louis-David Mitterrand wrote:
Hi,
I am seeing horrific performance on a Dell T610 with an LSISAS2008 (Dell
H200) card and 8 WD1002FAEX Caviar Black 1TB drives configured in mdadm raid6.
The LSI card is upgraded to the latest 9.00 firmware:
http://www.lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas9211-8i/index.html
and the 2.6.38.2 kernel uses the newer mpt2sas driver.
On the T610 this command takes 20 minutes:
tar -I pbzip2 -xvf linux-2.6.37.tar.bz2 22.64s user 3.34s system 2% cpu 20:00.69 total
Get rid of the "v" option. And run
sync
echo 3 > /proc/sys/vm/drop_caches
before the test. Make sure your file system is local, and not NFS
mounted (this could easily explain the timing BTW).
While we are at it, don't use pbzip2, use single threaded bzip2, as
there may be other platform differences that impact the parallel extraction.
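The steps above could be wrapped up roughly as follows. This is only a sketch: the function names (is_network_fs, fs_type_of, drop_caches, cold_extract) are made up for illustration, the list of network file system types is not exhaustive, "stat -f -c %T" assumes GNU coreutils, and timing is done with whole-second resolution via date.

```shell
#!/bin/sh
# Sketch of a cold-cache extraction test; all function names are illustrative.

# Return 0 if the given file system type name looks like a network file system.
# The list is an assumption and is not exhaustive.
is_network_fs() {
    case "$1" in
        nfs|nfs3|nfs4|cifs|smb*|afs|ceph|glusterfs|fuse.sshfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the type of the file system backing a path (GNU coreutils stat).
fs_type_of() {
    stat -f -c %T -- "$1"
}

# Flush dirty pages, then drop the page cache. Dropping needs root.
drop_caches() {
    sync
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 3 > /proc/sys/vm/drop_caches
    else
        echo "cannot write drop_caches (not root?): caches not dropped" >&2
    fi
}

# Extract a .tar.bz2 into a directory: no "v", single-threaded bzip2 (-j).
# Prints the elapsed wall-clock time in whole seconds.
cold_extract() {
    tarball=$1
    dest=$2
    if is_network_fs "$(fs_type_of "$dest")"; then
        echo "warning: $dest is on a network file system" >&2
    fi
    drop_caches
    start=$(date +%s)
    tar -xjf "$tarball" -C "$dest" || return 1
    end=$(date +%s)
    echo "elapsed: $((end - start)) s"
}
```

Usage would be something like "cold_extract linux-2.6.37.tar.bz2 /path/on/the/array", run as root on both machines so the numbers are comparable.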
Here is an extraction on a local md-based Delta-V unit (which we use
internally for backups):
[root@vault t]# /usr/bin/time tar -xf ~/linux-2.6.38.tar.bz2
25.18user 4.08system 1:06.96elapsed 43%CPU (0avgtext+0avgdata 16256maxresident)k
6568inputs+969880outputs (4major+1437minor)pagefaults 0swaps
This also uses an LSI card.
On one of our internal file servers using hardware RAID:
root@crunch:/data/kernel/2.6.38# /usr/bin/time tar -xf linux-2.6.38.tar.bz2
22.51user 3.73system 0:22.59elapsed 116%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+969872outputs (0major+3565minor)pagefaults 0swaps
Try a similar test on your two units, without the "v" option. Then try
to get useful information about the MD raid and the file system atop it.
For our MD raid Delta-V system:
[root@vault t]# mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Mon Nov 1 10:38:35 2010
Raid Level : raid6
Array Size : 10666968576 (10172.81 GiB 10922.98 GB)
Used Dev Size : 969724416 (924.80 GiB 993.00 GB)
Raid Devices : 13
Total Devices : 14
Persistence : Superblock is persistent
Update Time : Wed Mar 30 04:46:35 2011
State : clean
Active Devices : 13
Working Devices : 14
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : 2
UUID : 45ddd631:efd08494:8cd4ff1a:0695567b
Events : 18280
Number Major Minor RaidDevice State
0 8 35 0 active sync /dev/sdc3
13 8 227 1 active sync /dev/sdo3
2 8 51 2 active sync /dev/sdd3
3 8 67 3 active sync /dev/sde3
4 8 83 4 active sync /dev/sdf3
5 8 99 5 active sync /dev/sdg3
6 8 115 6 active sync /dev/sdh3
7 8 131 7 active sync /dev/sdi3
8 8 147 8 active sync /dev/sdj3
9 8 163 9 active sync /dev/sdk3
10 8 179 10 active sync /dev/sdl3
11 8 195 11 active sync /dev/sdm3
12 8 211 12 active sync /dev/sdn3
14 8 243 - spare /dev/sdp3
[root@vault t]# mount | grep md2
/dev/md2 on /backup type xfs (rw)
[root@vault t]# grep md2 /etc/fstab
/dev/md2 /backup xfs defaults 1 2
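As a sanity check on the mdadm output above, the array size and the full-stripe size follow from the device count and chunk size. The helper names below are hypothetical; the only assumptions are that mdadm reports sizes in KiB and that RAID6 gives up two devices' worth of space to parity.

```shell
#!/bin/sh
# Illustrative helpers, not part of mdadm.

# Usable RAID6 capacity in KiB: (devices - 2 parity) * per-device size.
raid6_array_kib() {        # args: raid_devices used_dev_size_kib
    echo $(( ($1 - 2) * $2 ))
}

# Data carried by one full stripe in KiB: (devices - 2 parity) * chunk size.
raid6_full_stripe_kib() {  # args: raid_devices chunk_kib
    echo $(( ($1 - 2) * $2 ))
}

raid6_array_kib 13 969724416     # prints 10666968576, matching "Array Size"
raid6_full_stripe_kib 13 512     # prints 5632
```

The second number may matter for a workload like a kernel tarball, which is mostly small files: a write that covers less than a full stripe generally forces parity RAID to read existing data or parity before it can update the parity blocks.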
And a basic speed check on the md device and the file system on it:
[root@vault t]# dd if=/dev/md2 of=/dev/null bs=32k count=32000
32000+0 records in
32000+0 records out
1048576000 bytes (1.0 GB) copied, 3.08236 seconds, 340 MB/s
[root@vault t]# dd if=/dev/zero of=/backup/t/big.file bs=32k count=32000
32000+0 records in
32000+0 records out
1048576000 bytes (1.0 GB) copied, 2.87177 seconds, 365 MB/s
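For comparing runs between machines it can help to recompute the rate from the byte count and elapsed time that dd prints. A small sketch follows; mbps is a hypothetical helper name, and it uses the same convention as dd (1 MB = 10^6 bytes). The dd variants in the comments are suggestions of mine, not commands that were run for this message.

```shell
#!/bin/sh
# Recompute dd's throughput figure: bytes / seconds, in MB/s (10^6 bytes).
mbps() {   # args: bytes seconds
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}

mbps 1048576000 3.08236   # read test above  -> 340
mbps 1048576000 2.87177   # write test above -> 365

# Variants that take the page cache out of the picture (GNU dd flags):
#   dd if=/dev/md2 of=/dev/null bs=32k count=32000 iflag=direct
#   dd if=/dev/zero of=/backup/t/big.file bs=32k count=32000 conv=fdatasync
```

Without conv=fdatasync or oflag=direct, a 1 GB buffered write can finish partly in memory, so the write figure should be read as an upper bound rather than sustained disk throughput.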
Some 'lspci -vvv' output, and contents of /proc/interrupts,
/proc/cpuinfo, ... would be helpful.
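One way to gather this in a single pass, as a sketch only: collect_diag is a made-up name, /proc/mdstat is my own addition to the list, and lspci is skipped when it is not installed.

```shell
#!/bin/sh
# Copy the requested diagnostics into one directory for posting.
collect_diag() {   # arg: output directory
    out=$1
    mkdir -p "$out" || return 1

    # /proc/interrupts and /proc/cpuinfo are the files named above;
    # /proc/mdstat is an extra assumed to be useful for an md array.
    for f in interrupts cpuinfo mdstat; do
        if [ -r "/proc/$f" ]; then
            cat "/proc/$f" > "$out/$f.txt"
        fi
    done

    # Full PCI details; run as root to see all capability fields.
    if command -v lspci >/dev/null 2>&1; then
        lspci -vvv > "$out/lspci.txt" 2>&1
    fi
    return 0
}
```

Running "collect_diag /tmp/t610-diag" on each machine gives two directories that can be diffed side by side.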
where on a lower spec'ed Poweredge 2900 III server (LSI Logic MegaRAID
SAS 1078 + 8 x Hitachi Ultrastar 7K1000 in mdadm raid6) it takes 22
_seconds_:
tar -I pbzip2 -xvf linux-2.6.37.tar.bz2 16.40s user 3.22s system 86% cpu 22.773 total
Besides hardware, the other difference between servers is that the
PE2900's MegaRAID has no JBOD mode so each disk must be configured as a
"raid0" vdisk unit. On the T610 no configuration was necessary for the
disks to "appear" in the OS. Would configuring them as raid0 vdisks
change anything?
Thanks in advance for any suggestion,
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@xxxxxxxxxxxxxxxxxxxxxxx
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615