Re: mdadm degraded RAID5 failure

On Tuesday November 4, jeeping@xxxxxxxxx wrote:
> Hi Neil and others,
> 
> Just a couple of questions, I know you're busy -
> 
> Do you recommend that I attempt to upgrade mdadm to a more recent
> version before any other recovery attempts? If so, which version?

Yes.  2.6.7.1 (the latest).

> 
> I noted my replacement drive (sdc1) got a smart error (during the
> rebuild?), would you recommend replacing it or removing it altogether
> until I get the other 2 drives back online (if I even can)?

There seem to be different opinions on how much weight to put on SMART
errors, so I make no recommendations based on them.

> 
> Is there a way to correct the drive names -

When you assemble the array again, it will update the device names to
what they are at the time.

As you have 2 devices that think they are 'spare', you won't be able
to assemble a working array using "--assemble".

What you will need to do is recreate the array over just two devices
and make sure you get them in the right order.

The one that claims to be device '2' (sdb1 below) certainly is device
2 (i.e. the last device: they are numbered 0,1,2).  The others I can
not be so sure of.

So I would recreate the array with e.g.

  mdadm -C /dev/md0 -l5 -n3 /dev/sdc1 missing /dev/sdb1

And check it with e.g.
  fsck -n -f /dev/md0

If fsck is happy: good.  If not, try again with a different
arrangement:

   mdadm -C /dev/md0 -l5 -n3 missing /dev/sdc1 /dev/sdb1

etc., also trying sdd1 in place of sdc1.  I don't know which of sdc1
and sdd1 is more likely to have good data.  Keep going until you get a
good 'fsck'.

Make very sure to use the "-n" option to fsck to ensure it doesn't try
to 'fix' the mess it finds.
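
The arrangements worth trying can be listed mechanically.  The following
is only a sketch, under the assumptions stated above: sdb1 is slot 2, and
slots 0 and 1 hold "missing" plus one of sdc1 or sdd1.  It uses -n3
because the array has three slots (0,1,2).  The function name
list_candidates is made up for illustration, and the script only prints
the commands; it does not run mdadm or touch any disk.

```shell
#!/bin/sh
# Dry run: print the candidate create commands, one per line.
# Assumption: /dev/sdb1 is the last device (slot 2) in every arrangement.
LAST=/dev/sdb1

list_candidates() {
    for dev in /dev/sdc1 /dev/sdd1; do
        # The surviving device may belong in slot 0 or in slot 1.
        echo "mdadm -C /dev/md0 -l5 -n3 $dev missing $LAST"
        echo "mdadm -C /dev/md0 -l5 -n3 missing $dev $LAST"
    done
}

list_candidates
```

After each create, check with "fsck -n -f /dev/md0" and stop the array
with "mdadm --stop /dev/md0" before trying the next arrangement.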

Also, before doing the above, run "mdadm --examine /dev/sdb1" and keep
a record of that.  Check the 'chunksize'.  If it isn't 64, you will
need to explicitly give that number to "mdadm -C".  Also check the
layout and possibly set that explicitly when doing "mdadm -C".
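
As a rough sketch of that record-keeping step: the helper below pulls a
single field out of saved "mdadm --examine" output.  It assumes the
output has lines of the form "     Chunk Size : 64K" and
"         Layout : left-symmetric"; the helper name examine_field and
the file name sdb1-examine.txt are made up for illustration, and the
chunk value 128 in the last comment is only an example.

```shell
#!/bin/sh
# Print the value of one field from saved "mdadm --examine" output.
# Assumes lines look like "     Chunk Size : 64K".
examine_field() {
    sed -n "s/^ *$1 : //p" "$2"
}

# Usage (as root; --examine only reads the superblock):
#   mdadm --examine /dev/sdb1 > sdb1-examine.txt
#   examine_field "Chunk Size" sdb1-examine.txt
#   examine_field "Layout" sdb1-examine.txt
#
# If the values differ from the defaults, give them explicitly, e.g.:
#   mdadm -C /dev/md0 -l5 -n3 --chunk=128 --layout=left-symmetric \
#         /dev/sdc1 missing /dev/sdb1
```

Keeping the saved file somewhere off the array means the original
values are still available after the superblocks have been rewritten.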

good luck.

NeilBrown


> 
> > /dev/sdb1:
> > this     2       8       49        2      active sync   /dev/sdd1
> 
> 
> > /dev/sdc1:
> > this     3       8       33        3      spare   /dev/sdc1
> 
> 
> > /dev/sdd1:
> > this     3       8       33        3      spare   /dev/sdc1
> 
> I'm inclined to believe (but am not sure at all) that -
> 
> sdb1 should be sdd1
> sdc1 is correct
> sdd1 should be sdb1
> 
> Thanks!
> Steve..
> 