Re: How to stress test an RAID 6 array?

On 10/3/2011 8:58 AM, Marcin M. Jessa wrote:

>  exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

This line is not important ^^^

>  ata9.00: failed command: FLUSH CACHE EXT

THIS one is:^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> That "exception Emask" part pointed me to misc threads where people
> mentioned bugs in the Linux kernel.

According to your dmesg output, the kernel believes the drives are not
completing the ATA-6 (and later) FLUSH_CACHE_EXT command.  hdparm -I
will confirm that your drives do support it.  FLUSH_CACHE_EXT is sent to
a drive to force data in its write cache onto the platters.  This is
done for data consistency and to prevent filesystem corruption due to
power outages, system crashes, and the like.
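
As a rough sketch of that hdparm check: hdparm -I lists the drive's
command set and marks enabled features with a leading "*".  The helper
name supports_flush_ext is my own, and the script assumes you pass the
array member devices as arguments and run it as root.

```shell
#!/bin/sh
# Reads `hdparm -I` output on stdin; succeeds if FLUSH_CACHE_EXT is listed
# as enabled (hdparm marks enabled features with a leading "*").
supports_flush_ext() {
    grep -Eq '^[[:space:]]*\*[[:space:]]+FLUSH_CACHE_EXT'
}

# Usage: sh check_flush.sh /dev/sda /dev/sdb ...   (needs root)
for dev in "$@"; do
    if hdparm -I "$dev" | supports_flush_ext; then
        echo "$dev: FLUSH_CACHE_EXT supported"
    else
        echo "$dev: FLUSH_CACHE_EXT not reported"
    fi
done
```

A drive that shows up as "not reported" is worth a closer look with the
full hdparm -I output before drawing any conclusion.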

What you need to figure out is why the apparent flush command failures
are occurring.  The cause will likely be a kernel/driver issue, a
motherboard/SATA controller issue, a PSU issue, or a drive issue.
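
One way to start narrowing that down is to count the failures per ATA
port.  This is only a sketch: the function name count_flush_failures and
the --dmesg switch are made up for illustration, and it matches the
"failed command: FLUSH CACHE EXT" text exactly as it appears in your log.

```shell
#!/bin/sh
# Reads kernel log text on stdin and prints "<count> <ata port>" for every
# port that logged a failed FLUSH CACHE EXT, busiest port first.
count_flush_failures() {
    grep 'failed command: FLUSH CACHE EXT' |
        grep -o 'ata[0-9][0-9]*\.[0-9][0-9]*' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}

# Usage: sh flush_failures.sh --dmesg
if [ "${1:-}" = "--dmesg" ]; then
    dmesg | count_flush_failures
fi
```

If a single port accounts for nearly all of the failures, I'd suspect
that drive or its cable first.  Failures spread across every port point
more toward the kernel/driver, the controller, or the PSU.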

The few instances of this FLUSH_CACHE_EXT error I located seemed to
center somewhere around kernel 2.6.34.  IIRC those experiencing this
issue on FC and Ubuntu instantly fixed it with a distro upgrade.
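
To see where your running kernel sits relative to a given release,
something like the following should work.  It is a sketch that assumes
GNU sort (for -V); the helper name version_at_least is mine, and the
minimum value is just an example threshold you should set to whichever
release you are targeting.

```shell
#!/bin/sh
# Succeeds if version $1 is at least version $2 (needs GNU sort for -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any distro suffix such as "-generic" from the running kernel release.
running=$(uname -r | sed 's/[^0-9.].*//')
minimum=2.6.38.8   # example threshold; adjust as needed

if version_at_least "$running" "$minimum"; then
    echo "kernel $running is at least $minimum"
else
    echo "kernel $running is older than $minimum"
fi
```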

Thus, upgrade your kernel to 2.6.38.8 or later.  If that doesn't fix it,
disable the write caches on your array member drives (a very good idea
with non-BBU RAID anyway).  The proper/preferred way to do this may vary
amongst distros.  Adding a boot script containing something like the
following to the appropriate /etc/rc.x directory should do the trick on
most distros:

#!/bin/sh
hdparm -W0 /dev/sda
hdparm -W0 /dev/sdb
hdparm -W0 /dev/sdc
hdparm -W0 /dev/sdd
hdparm -W0 /dev/sde

Reboot.  Confirm the write caches are disabled with something like this:

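#!/bin/bash
# Print the write cache state reported by each array member drive.
for i in {a..e}
do
    echo -n "sd$i:  "
    # Match the WriteCache field itself; its position in the line varies.
    hdparm -i "/dev/sd$i" | grep -io 'writecache=[a-z]*'
done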

If neither of these suggestions fixes the problem then you may need to
start replacing or adding hardware.  At that point I'd recommend
dropping an LSI SAS 9211-8i into your free PCIe x16 slot.

-- 
Stan