On 4/22/2013 5:19 AM, Andrei Banu wrote:
> Hello!
>
> First off, allow me to apologize if my rambling sent you in the wrong
> direction, and thank you for assisting.

No harm done, and you're welcome.

> The actual problem is that when I write any larger file (hundreds of
> MB or more) to the server, whether from the network or from the server
> itself, the server starts to overload. The load can go over 100 for
> files of ~5GB. This server has an average load of 0.52 (sar -q), but
> it can spike to three-digit loads within a few minutes of making or
> downloading a larger cPanel backup file. I have to rely only on R1Soft
> for backups right now because the normal cPanel backups make the
> server unstable when it backs up accounts over 1GB (many).

Describing this problem in terms of load average isn't very helpful.
What would be helpful is 'perf top -U' output so we can see what is
eating CPU, captured simultaneously with 'iotop' so we can see what is
eating IO.

> So I concluded this is due to very low write speeds so I ran the 'dd'

It's most likely that the low disk throughput is a symptom of the
problem, which is lurking elsewhere awaiting discovery.

> 1. Some said the low write speed might be due to a bad cable.

Very unlikely, but possible. This is easy to verify: does dmesg show
hundreds of "hard resetting link" messages?

> 2. I have observed a very big difference between /dev/sda and /dev/sdb
> and I thought it might be indicative of a problem somewhere. If I run
> hdparm -t /dev/sda I get about 215MB/s, but on /dev/sdb I get about
> 80-90MB/s. Only if I add the --direct flag do I get 260MB/s for
> /dev/sda. Previously when I added --direct for /dev/sdb I was getting
> about 180MB/s, but now I get ~85MB/s with or without --direct.

I simply chalked the difference up to IO load variance between test
runs of hdparm. If one SSD is always that much slower, there may be a
problem with the drive or controller, but it's not likely. If you
haven't already, swap the cable on the slow drive with a new one. In
fact, SATA cables are cheap as dirt, so I'd swap them both just for
peace of mind.

> root [/]# hdparm -t /dev/sdb
> Timing buffered disk reads: 262 MB in 3.01 seconds = 86.92 MB/sec
>
> root [/]# hdparm --direct -t /dev/sdb
> Timing O_DIRECT disk reads: 264 MB in 3.08 seconds = 85.74 MB/sec
...
> This is something new. /dev/sdb no longer gets to nearly 200MB/s (with
> --direct) but stays under 100MB/s in all cases. Maybe indeed it's a
> problem with the cable or with the device itself.
...
> And an update 30 minutes later: /dev/sdb returned to 90MB/s read speed
> WITHOUT --direct and 180MB/s WITH --direct. /dev/sda is constant (215
> without --direct and 260 with --direct). What do you make of this?

Show your partition tables again. My gut instinct tells me you have a
swap partition on /dev/sdb, and/or some other partition that is not
part of the RAID1 nor equally present on /dev/sda, that is being
accessed heavily at some times and not others, hence the throughput
discrepancy.

If this is the case, and the kernel is low on RAM due to an application
memory leak or just normal process load, that swap partition may become
critical. When you start the $big_file copy, the kernel goes into
overdrive swapping and/or dropping cache to make room for $big_file in
the write buffers. This could explain both your triple-digit system
load and the decreased throughput on /dev/sdb.

The fdisk output you provided previously showed only 3 partitions per
SSD, all RAID autodetect, all in md/RAID1 I assume. However, the
symptoms you're reporting suggest the partition layout I just
described, which could be responsible for the odd up/down throughput
on sdb.
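If it helps, here's a rough sketch of what I'd run to confirm or rule
this out (adjust device names to your system; sdb is just my guess as
to where the odd partition would live):

  # A flaky cable/link leaves a trail in the kernel log
  dmesg | grep -ci 'hard resetting link'

  # Confirm whether anything outside the md/RAID1 -- swap in
  # particular -- lives on either SSD
  cat /proc/swaps
  fdisk -l /dev/sda /dev/sdb
  cat /proc/mdstat

  # While the big copy is running, watch swap traffic (the si/so
  # columns) and free memory, alongside the 'perf top -U' and 'iotop'
  # runs suggested above
  vmstat 1
  free -m

If si/so stay at or near zero for the whole copy, the swap theory is
out and we look elsewhere.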
--
Stan