RE: Spurious HD convictions

Hmm.  I don't see how it could be either the PS or the PMs, since the drives 
were moved to a new enclosure when the problem started happening, yet the 
problem persists.  The new chassis has all new PMs and of course a new PS, 
and the problem is happening across multiple PMs.  In addition, if NCQ is the 
problem, why has it just started happening?  This system has been up and 
running for the better part of a year.  Regardless, I have disabled NCQ by 
writing 1 to `/sys/block/sdX/device/queue_depth` for each of sda through sdg, 
and I am attempting the repair again.  We'll see how it goes.
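
For reference, here is a minimal sketch of that step as a loop.  A shell 
redirection takes a single target file, so a glob that matches several drives 
has to be expanded one file at a time.  The function name `set_queue_depth` 
and the `SYSFS_ROOT` override are hypothetical, added only so the loop can be 
dry-run against a scratch directory; the sd[a-g] range is the one used on this 
system and will differ elsewhere.

```shell
# Sketch: write a queue depth to every sda..sdg device under sysfs.
# A depth of 1 leaves no room for command queuing, which effectively
# turns NCQ off for that drive.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
set_queue_depth() {
    depth=$1
    root=${SYSFS_ROOT:-/sys}
    for f in "$root"/block/sd[a-g]/device/queue_depth; do
        # If the glob matched nothing, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        # One redirection per file, then read the value back to confirm it.
        echo "$depth" > "$f" && echo "$f: $(cat "$f")"
    done
}
```

Run as root, `set_queue_depth 1` would apply the setting and print each file 
with the value read back.  The setting does not survive a reboot.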

> Hi Leslie,
> 
> According to some of the links here:
> http://www.google.com/search?hl=en&q=failed+to+read+SCR+1+(Emask%3D0x40)
> 
> It seems to be either the Power Supply Unit (PSU) or the Port Multiplier
> (PM).
> 
> A quick workaround seems to be disabling NCQ on all affected devices.
> 
> On Sun, Dec 13, 2009 at 5:02 AM, lrhorer@xxxxxxxxxxx
> <lrhorer@xxxxxxxxxxx> wrote:
> >
> > What's happening here?  Suddenly, my backup server is suffering
> > apparently spurious hard drive convictions.  The server is running RAID5
> > on 7 disks under md.  It has been running well for months, but suddenly
> > it has started kicking drives from the array when under moderately heavy
> > read or write loads.  The thing is, it isn't convicting any particular
> > drive repeatedly, and the drives are not showing any errors under SMART.
> > This is a PM system, and I have tried changing the drive adapters,
> > changing the PMs, changing cables, moving the drives around, and moving
> > them out of the CPU enclosure to a new external chassis.  The convictions
> > are not occurring on any one channel, over any one particular PM, or over
> > any particular cable.  Since this started happening, I have been unable
> > to get all the way through a resync before the array dumps at least one
> > of the drives.  Here is a sample from the kernel log during one of the
> > convictions: