Re: need a little help rebuilding a raid 10

Hi Greg,

On 12/06/2011 09:11 AM, Greg Freemyer wrote:
> Hmm...
> 
> My rebuild failed.  At first glance I had both a failed drive and a failed slot?
> 
> What I don't understand is I have I/O errors in /var/log/messages from
> when the rebuild failed over night.

Something in your system is untrustworthy.

> But this morning, hdparm --read-sector is reading the "bad" sectors fine.

What does smartctl say about your drives (all of them)?
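A minimal sketch of collecting that in one pass, assuming smartmontools is installed. The function name and the SMARTCTL override are made up for illustration; "-H" prints the overall health verdict and "-A" the vendor attribute table.

```shell
# Print the SMART health verdict and attribute table for each device given.
# SMARTCTL is a hypothetical override (handy for a dry run); it defaults to
# the real smartctl binary.
smart_report() {
    for dev in "$@"; do
        echo "=== $dev ==="
        ${SMARTCTL:-smartctl} -H -A "$dev"
    done
}
```

It would be called as root with every array member listed, e.g. "smart_report /dev/sda /dev/sdb /dev/sdc /dev/sdd" (placeholder device names).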

> I already tried replacing the drive and the replacement drive also
> reported media errors during the rebuild, that's why I came to believe
> I had a bad slot.
> 
> Now I have non-repeatable media errors.
> 
> fyi: I have the problem drive connected via eSata now, so it's a
> different controller totally than where it was when the failure first
> occurred.

Are the errors in /var/log/messages only from that drive?  If so, then that
drive is probably toast.

> Any thoughts?

Your prior e-mail said that you re-created the array.  I didn't see that you
had definitively nailed down the problem at that point, so re-creating it
probably wasn't a good idea.  In particular, "--create" overwrites the
existing metadata on the array members.  If you didn't keep the output of
"mdadm -E" for each drive beforehand, that information is now lost.
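For next time, a rough sketch of saving that output before touching anything. The function name, the MDADM override and the output file naming are my own invention; "--examine" is the long form of "-E".

```shell
# Save "mdadm --examine" output for each member device into a directory,
# one file per device, so the original layout can be consulted later.
# MDADM is a hypothetical override; it defaults to the real mdadm binary.
save_md_metadata() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for dev in "$@"; do
        # e.g. /dev/sdb1 is written to <outdir>/sdb1.examine.txt
        ${MDADM:-mdadm} --examine "$dev" > "$outdir/$(basename "$dev").examine.txt"
    done
}
```

For example, "save_md_metadata /root/md-backup /dev/sd[abcd]1" (placeholder paths and devices), with the directory kept somewhere that is not on the array itself.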

In general, "--create" is a last resort, to be used for recovery only when
you are absolutely confident you understand the original layout (ideally from
"mdadm -E" printouts of the original array).  "--assemble --force" is the
proper next step after a plain "--assemble" fails.
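That order of operations could be sketched roughly as follows. The function name and the MDADM override are assumptions for illustration, and a partially assembled array may first need "mdadm --stop" before the second attempt will work.

```shell
# Try a plain assemble first; fall back to --force only if that fails.
# MDADM is a hypothetical override; it defaults to the real mdadm binary.
assemble_array() {
    md=$1
    shift
    if ${MDADM:-mdadm} --assemble "$md" "$@"; then
        return 0
    fi
    echo "plain --assemble failed; retrying with --force" >&2
    ${MDADM:-mdadm} --assemble --force "$md" "$@"
}
```

For example, "assemble_array /dev/md0 /dev/sd[abcd]1" (placeholder names).  Note that nothing in it ever reaches for "--create".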

I would completely scrub the questionable drive with random data, run a long
smartctl test on it, and replace it if it reports any re-allocated sectors at
that point.
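As a sketch only, and DESTRUCTIVE to whatever device it is pointed at: the function name and the DD and SMARTCTL overrides are hypothetical, and the drive must already be out of the array.

```shell
# DESTRUCTIVE: overwrite every sector of the given device with random data,
# then start the drive's long (extended) SMART self-test.
# DD and SMARTCTL are hypothetical overrides; they default to the real tools.
scrub_drive() {
    dev=$1
    # dd stops with "No space left on device" at the end of the drive;
    # that particular error is expected here.
    ${DD:-dd} if=/dev/urandom of="$dev" bs=1M
    # The self-test runs inside the drive; check the result later with
    # "smartctl -l selftest" and the attribute table with "smartctl -A".
    ${SMARTCTL:-smartctl} -t long "$dev"
}
```

Double-check the device name before running something like "scrub_drive /dev/sdX" (placeholder).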

I would also run long smartctl tests on the other drives, looking for pending
sectors or re-allocated sectors.  If any, I would plan on replacements for
them as well, and would try to validate the content of your files.  You do
have a backup to compare against, after all.
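To pick those two counters out of the attribute table, a small filter along these lines may help. The function name is made up, and it assumes the usual ten-column "smartctl -A" layout with the common attribute names; some vendors label these attributes differently.

```shell
# Read "smartctl -A" output on stdin and print the raw values of the
# reallocated and pending sector counters (attribute name, then raw value).
sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" { print $2, $10 }'
}
```

Usage would be "smartctl -A /dev/sda | sector_counts" for each drive, once its long test has finished.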

If you are running a Debian-based distro, and the array contains your rootfs,
you might find "debsums" useful.

HTH,

Phil