Re: recovering from a controller failure

On Sun, May 30, 2010 at 09:18:21AM +1200, Richard wrote:

> This happened to me before I discovered that the LSI SAS1068E no longer
> reliably tolerates querying via smartd/smartctl.
> 
> Have a look at https://bugzilla.kernel.org/show_bug.cgi?id=14831
> 
> and there is a patch that seems to fix it here:
> 
> http://lkml.org/lkml/2010/4/26/335

Good news!  I appreciate the information.  I'm planning to update these
machines with new kernels and will include this patch.

> Use hdparm if you need serial numbers.

The labels Sun puts on the drives have numbers from the "device model."
I will see if hdparm yields those numbers once this is all settled.
Thanks for the suggestion.
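
For reference, a minimal sketch of pulling the model and serial fields out
of "hdparm -I" output, so they can be compared against the drive labels.
The sample text and its values are made up for illustration, and the field
names are assumed to match what hdparm prints on these machines.

```python
import re

def parse_hdparm_identity(text):
    """Extract identity fields from the text printed by `hdparm -I <device>`."""
    fields = {}
    for key in ("Model Number", "Serial Number", "Firmware Revision"):
        # Each field is assumed to sit on its own line as "Key:   value".
        m = re.search(r"^\s*" + re.escape(key) + r":\s*(.+?)\s*$",
                      text, re.MULTILINE)
        if m:
            fields[key] = m.group(1)
    return fields

# Illustrative sample only; not taken from a real drive.
SAMPLE = """
/dev/sdb:

ATA device, with non-removable media
\tModel Number:       EXAMPLE MODEL 500G
\tSerial Number:      EX123456
\tFirmware Revision:  1.0
"""

if __name__ == "__main__":
    for name, value in parse_hdparm_identity(SAMPLE).items():
        print(name, "=", value)
```

On a live system the text would come from running hdparm as root against
each member disk; that step is left out here on purpose.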

> In the half dozen or so tests I have done, where more than 2
> drives have been thrown out of md RAID6 arrays due to these
> controller resets, reassembly using --force has worked with no
> data corruption, but this may have been good luck.

Wow!  That's encouraging.  I would feel amazingly more confident if
someone would give me the exact command to try.  This is not a good
time for me to exercise my ignorance by experimenting.
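
As a starting point for discussion, here is a rough sketch of what a forced
reassembly might look like. It only builds and prints the mdadm command
lines; it does not run them. The array name /dev/md0 and the member
partitions are hypothetical placeholders, not the devices on these machines,
and the assumption is that the array is stopped before it is reassembled.
Comparing the "Events" counters from "mdadm --examine" first shows how far
apart the members are.

```python
import re

def parse_event_count(examine_text):
    """Return the Events value from `mdadm --examine` output, or None.

    The value is kept as a string because its format is assumed to vary
    with the metadata version.
    """
    m = re.search(r"^\s*Events\s*:\s*([\d.]+)\s*$", examine_text, re.MULTILINE)
    return m.group(1) if m else None

def plan_forced_assembly(array, members):
    """Build (but do not run) the commands for a forced reassembly."""
    return [
        ["mdadm", "--stop", array],                          # release any partial assembly
        ["mdadm", "--assemble", "--force", array] + list(members),
    ]

if __name__ == "__main__":
    # Hypothetical device names; substitute the real array and members.
    for cmd in plan_forced_assembly("/dev/md0", ["/dev/sdb1", "/dev/sdc1"]):
        print(" ".join(cmd))
```

Printing the plan rather than executing it leaves room to have someone on
the list check the device list before anything touches the disks.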

Thank you for your helpful insight!

--kyler