Re: 4 out of 16 drives show up as 'removed'

On Thu, 8 Dec 2011 13:42:44 -0800 Eli Morris <ermorris@xxxxxxxx> wrote:

> 
> On Dec 8, 2011, at 12:59 PM, NeilBrown wrote:
> 
> > On Thu, 8 Dec 2011 12:39:10 -0800 Eli Morris <ermorris@xxxxxxxx> wrote:
> > 
> >> 
> > 
> >> 
> >> and here is the verbose assemble output:
> >> 
> >> [root@stratus log]# mdadm --verbose --assemble /dev/md5 --force /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 
> >> mdadm: looking for devices for /dev/md5
> >> mdadm: /dev/sda1 is identified as a member of /dev/md5, slot 0.
> >> mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 2.
> >> mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 3.
> >> mdadm: /dev/sde1 is identified as a member of /dev/md5, slot 4.
> >> mdadm: /dev/sdf1 is identified as a member of /dev/md5, slot 5.
> >> mdadm: /dev/sdg1 is identified as a member of /dev/md5, slot 6.
> >> mdadm: /dev/sdh1 is identified as a member of /dev/md5, slot 7.
> >> mdadm: /dev/sdi1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: /dev/sdj1 is identified as a member of /dev/md5, slot 9.
> >> mdadm: /dev/sdk1 is identified as a member of /dev/md5, slot 10.
> >> mdadm: /dev/sdl1 is identified as a member of /dev/md5, slot 11.
> >> mdadm: /dev/sdm1 is identified as a member of /dev/md5, slot 12.
> >> mdadm: /dev/sdn1 is identified as a member of /dev/md5, slot 13.
> >> mdadm: /dev/sdo1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: no uptodate device for slot 1 of /dev/md5
> >> mdadm: added /dev/sdc1 to /dev/md5 as 2
> >> mdadm: added /dev/sdd1 to /dev/md5 as 3
> >> mdadm: added /dev/sde1 to /dev/md5 as 4
> >> mdadm: added /dev/sdf1 to /dev/md5 as 5
> >> mdadm: added /dev/sdg1 to /dev/md5 as 6
> >> mdadm: added /dev/sdh1 to /dev/md5 as 7
> >> mdadm: no uptodate device for slot 8 of /dev/md5
> >> mdadm: added /dev/sdj1 to /dev/md5 as 9
> >> mdadm: added /dev/sdk1 to /dev/md5 as 10
> >> mdadm: added /dev/sdl1 to /dev/md5 as 11
> >> mdadm: added /dev/sdm1 to /dev/md5 as 12
> >> mdadm: added /dev/sdn1 to /dev/md5 as 13
> >> mdadm: no uptodate device for slot 14 of /dev/md5
> >> mdadm: no uptodate device for slot 15 of /dev/md5
> >> mdadm: added /dev/sdb1 to /dev/md5 as -1
> >> mdadm: added /dev/sdi1 to /dev/md5 as -1
> >> mdadm: failed to add /dev/sdo1 to /dev/md5: Device or resource busy
> >> mdadm: added /dev/sda1 to /dev/md5 as 0
> >> mdadm: /dev/md5 assembled from 12 drives and 2 spares - not enough to start the array.
> >> 
> >> 
> > 
> > Thanks.
> > 
> > I know what the 'busy' thing is now.
> > sdo1 appears to be the 'same' as some other device in some way.
> > 
> > Also it looks like you might have turned some drives into spares
> > unintentionally, though I'm not sure.
> > 
> > Could you please send "mdadm --examine" output for all of these drives and
> > I'll have a look.
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> > 
> 
> Thanks Neil. I wasn't sure if you wanted the output of all the drives or just the 'removed' ones, so here is the output for all the drives in the array.
> 
> Just FYI, I don't know what I could have done to make these spares. Between when things worked fine and when they did not, I did not make any hardware or configuration changes to the array.
> 

Thanks.  I did want it all (it is always better to give too much than too
little - so thanks).

Those devices have been turned into spares.  Maybe an "--add" command did
it, or possibly even a "--re-add", though that shouldn't.  Newer versions of
mdadm are more careful about this.

You need to re-"Create" the array.  This doesn't affect the data, just writes
new metadata.
It looks like it is safe to assume that none of the devices have been
renamed.  However if you have any reason to believe that the devices don't
belong in the array in the 'obvious' order, you should let me know or adjust
the command below accordingly.
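
One way to sanity-check the 'obvious' order is to tabulate the slot each
device reported in the verbose assemble output quoted earlier in this thread.
The following is only a sketch: slot_map is a hypothetical helper name, and it
assumes the output lines have exactly the "is identified as a member of ...,
slot N." form shown above.  Devices that report slot -1 are the ones that have
lost their position.

```shell
# slot_map (hypothetical helper): read "mdadm --verbose --assemble" output
# on stdin and print one "slot device" pair per identified member.
slot_map() {
    sed -n 's|^mdadm: \(/dev/[^ ]*\) is identified as a member of [^,]*, slot \(-\{0,1\}[0-9]*\)\.$|\2 \1|p'
}

# Example with two lines copied from the output above.
printf '%s\n' \
    'mdadm: /dev/sda1 is identified as a member of /dev/md5, slot 0.' \
    'mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot -1.' \
    | slot_map
```

Piping the saved assemble output through slot_map and sorting it numerically
makes any gap or duplicate in the slot sequence easy to spot.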

You want to create the array exactly as it was, and you want to make sure
it doesn't immediately start to resync, just in case something goes wrong and
we want to try again.

All the 'Data Offset's are the same and are 2048 sectors (1M), which is the
current default, so that is good.
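
The arithmetic behind "2048 (1M)" can be sketched as follows, assuming the
offset is counted in 512-byte sectors.  sectors_to_mib is a hypothetical
helper name, not part of mdadm.

```shell
# sectors_to_mib (hypothetical helper): convert a Data Offset given in
# 512-byte sectors to MiB.  Assumes the value is a whole number of MiB.
sectors_to_mib() {
    echo $(( $1 * 512 / 1048576 ))
}

sectors_to_mib 2048   # prints 1
```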

So:
  mdadm --create /dev/md5 -l5 --layout=left-symmetric --chunk=512 \
  --raid-disks=16  --assume-clean /dev/sd[a-p]1

This will over-write all the metadata but not touch the data.
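
Because the create step rewrites the metadata on every member, it may be worth
printing the fully expanded command and reading it through before running
anything.  The sketch below only echoes the command.  It assumes the members
are the first partitions /dev/sda1 through /dev/sdp1 in slot order, as the
assemble output earlier in the thread suggests; member_list and
build_create_cmd are hypothetical helper names.

```shell
# member_list (hypothetical helper): print the 16 member devices in order.
# ASSUMPTION: members are /dev/sda1 .. /dev/sdp1, one per slot 0..15.
member_list() {
    for letter in a b c d e f g h i j k l m n o p; do
        printf '/dev/sd%s1 ' "$letter"
    done
}

# build_create_cmd (hypothetical helper): print, but do not run, the
# create command with the device list spelled out.
build_create_cmd() {
    echo "mdadm --create /dev/md5 --level=5 --layout=left-symmetric --chunk=512 --raid-devices=16 --assume-clean $(member_list)"
}

build_create_cmd
```

If the devices do not belong in this order, adjust the loop in member_list
before copying the printed command.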

Then you probably want to
  fsck -n /dev/md5

to make sure it looks good.  If it does,

 echo check > /sys/block/md5/md/sync_action

That will read all blocks and make sure parity is correct.  When it finishes,
check
   /sys/block/md5/md/mismatch_cnt

If this is zero or close to zero, then it is looking very good.
If it is a lot more than zero (say > 10000) then we probably need to think
again.
If it is small but non-zero, then "echo repair > /sys/block/md5/md/sync_action"
will fix it up.
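
The decision above can be written down as a small helper.  This is a sketch
only: classify_mismatch is a hypothetical name, and treating every value from
1 to 10000 as "repair" is a simplifying assumption, since the message only
says "small but non-zero" and "a lot more than zero".

```shell
# classify_mismatch (hypothetical helper): map a mismatch_cnt value to a
# suggested next step.  The 10000 cut-off is taken from the message above;
# the handling of values in between is an assumption.
classify_mismatch() {
    count=$1
    if [ "$count" -eq 0 ]; then
        echo "clean"
    elif [ "$count" -gt 10000 ]; then
        echo "rethink"
    else
        echo "repair"
    fi
}

# On the real system one might call it as:
#   classify_mismatch "$(cat /sys/block/md5/md/mismatch_cnt)"
classify_mismatch 0       # prints clean
classify_mismatch 128     # prints repair
classify_mismatch 20000   # prints rethink
```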

If fsck showed any issues, run
  fsck -f /dev/md5
to fix them, then mount the filesystem and all should be good.

What version of mdadm do you have?

Thanks,
NeilBrown
