Hi Leslie,

I was wondering whether you were able to stop the weird behavior with
your disks.

On Sun, Dec 13, 2009 at 6:44 AM, lrhorer@xxxxxxxxxxx
<lrhorer@xxxxxxxxxxx> wrote:
> Hmm. I don't see how it could be either the PS or the PMs, since the
> drives were moved to a new enclosure when the problem started happening,
> yet the problem persists. The new chassis has all new PMs and of course
> a new PS, and the problem is happening across multiple PMs. In addition,
> if NCQ is the problem, why has it just started happening? This system
> has been up and running for the better part of a year. Regardless, I
> have disabled NCQ by executing
> `echo 1 > /sys/block/sd[a-g]/device/queue_depth`, and I am attempting a
> repair action again. We'll see how it goes.
>
>> Hi Leslie,
>>
>> According to some of the links here:
>> http://www.google.com/search?hl=en&q=failed+to+read+SCR+1+(Emask%3D0x40)
>>
>> It seems to be either the Power Supply Unit (PSU) or the Port
>> Multiplier (PM).
>>
>> A quick workaround seems to be disabling NCQ on all affected devices.
>>
>> On Sun, Dec 13, 2009 at 5:02 AM, lrhorer@xxxxxxxxxxx
>> <lrhorer@xxxxxxxxxxx> wrote:
>> >
>> > What's happening here? Suddenly, my backup server is suffering
>> > apparently spurious hard drive convictions. The server is running
>> > RAID5 on 7 disks under md. It has been running well for months, but
>> > suddenly it has started kicking drives from the array when under
>> > moderately heavy read or write loads. The thing is, it isn't
>> > convicting any particular drive repeatedly, and the drives are not
>> > showing any errors under SMART. This is a PM system, and I have
>> > tried changing the drive adapters, changing the PMs, changing
>> > cables, moving the drives around, and moving them out of the CPU
>> > enclosure to a new external chassis. The convictions are not
>> > occurring on any one channel, over any one particular PM, or over
>> > any particular cable. Since this started happening, I have been
>> > unable to get all the way through a resync before the array dumps
>> > at least one of the drives. Here is a sample from the kernel log
>> > during one of the convictions:
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
Majed B.
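
P.S. A note on the `echo 1 > /sys/block/sd[a-g]/device/queue_depth`
command quoted above: in bash, a redirection whose target pattern matches
more than one file is normally rejected as an "ambiguous redirect", so it
may be worth checking that the setting actually took effect on every
drive. Below is a minimal sketch of a per-device loop. The function name
`set_queue_depth_one` and the `SYSFS_ROOT` override (used only for dry
runs against a scratch directory) are my own assumptions for illustration,
not anything from the thread.

```shell
#!/bin/sh
# Hypothetical helper: write 1 to queue_depth for each named drive,
# which is the usual way to switch NCQ off for that drive.
# SYSFS_ROOT is an illustrative override for dry runs; it defaults to /sys.
set_queue_depth_one() {
    root="${SYSFS_ROOT:-/sys}"
    for dev in "$@"; do
        f="$root/block/$dev/device/queue_depth"
        if [ -w "$f" ]; then
            # One redirection per file, so nothing depends on glob expansion.
            echo 1 > "$f"
            printf '%s: queue_depth=%s\n' "$dev" "$(cat "$f")"
        else
            printf '%s: %s not writable, skipped\n' "$dev" "$f" >&2
        fi
    done
}

# Example (as root), for the seven drives mentioned in the thread:
#   set_queue_depth_one sda sdb sdc sdd sde sdf sdg
```

Reading the value back after writing it confirms per drive whether the
change was accepted. The setting does not survive a reboot.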