Re: hung grow

On 04/10/17 20:09, Curt wrote:
Ok, thanks.

I'm pretty sure I'll be able to DD from at least one of the failed
drives, as I could still query them before I yanked them.  Assuming I
can DD one of the old drives to one of my new ones.

I'd DDrescue old to new drive. Then do a forced assemble, with a
mix of the dd drives and my old good ones? So if sda/b are new DD'd
drives and sdc/d/e are hosed grow drives, I'd do an assemble force
revert-reshape /dev/md127 sda sdb sdc sdd and sde? Then assemble can
use my info from the DD drives to assemble the array back to 7 drives?
  Did I understand that right?

This sounds like you need to take a great big step backwards, and make sure you understand EXACTLY what is going on. We have a mix of good drives, copies of bad drives, and an array that doesn't know whether it is supposed to have 7 or 9 drives. One wrong step and your array will be toast.

You want ALL FOUR KNOWN GOOD DRIVES. You want JUST ONE ddrescue'd drive.

But I think the first thing we need to do, is to wait for an expert like Phil to chime in and sort out that reshape. Your four good drives all think they are part of a 9-drive array. Your first two drives to fail think they are part of a 7-drive array. Does the third drive think it's part of a 7-drive or 9-drive array?

Can you do an mdadm --examine on this drive? I suspect the grow blew up because it couldn't access this drive. If this drive thinks it is part of a 7-drive array, we have a bit of a problem on our hands.

I'm hoping it thinks it's part of a 9-drive array - I think we may be able to get out of this ...
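
To compare what each member believes, something like the sketch below may help. It only reads the superblocks and prints the "Raid Devices" count that mdadm --examine reports for each device. The function name raid_devices and the device names you pass in are placeholders of mine, and it assumes you run it as root.

```shell
#!/bin/sh
# Read-only sketch: show how many devices each member's superblock expects.
# Usage (device names are placeholders): sh raid-count.sh /dev/sdX1 /dev/sdY1

# raid_devices: reads `mdadm --examine` output on stdin and prints the
# number from the "Raid Devices : N" line.
raid_devices() {
    awk -F: '/Raid Devices/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Guarded so nothing happens on a machine without mdadm installed.
if command -v mdadm >/dev/null 2>&1; then
    for dev in "$@"; do
        n=$(mdadm --examine "$dev" 2>/dev/null | raid_devices)
        printf '%s: Raid Devices = %s\n' "$dev" "${n:-unknown}"
    done
fi
```

A member that prints 7 while the good drives print 9 is the case to worry about.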

Oh and how can I tell if I have a timeout mismatch.  They should be raid drives.

smartctl -x /dev/sdX

This will tell you what sort of drive you have (yes, if it's in a datacentre, chances are it is a raid drive). Then search the output for Error Recovery Control. This is from my hard drive...

SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

You need error recovery to be supported. If it isn't ...
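
If you have several drives to check, the search can be scripted. This is only a sketch: it greps the smartctl -x output for the same "SCT Error Recovery Control supported" line shown above. The function name erc_supported and the device arguments are placeholders of mine.

```shell
#!/bin/sh
# Sketch: report which drives advertise SCT Error Recovery Control.
# Usage (device names are placeholders): sh erc-check.sh /dev/sdX /dev/sdY

# erc_supported: reads `smartctl -x` output on stdin and succeeds if the
# capabilities section lists SCT Error Recovery Control as supported.
erc_supported() {
    grep -q 'SCT Error Recovery Control supported'
}

# Guarded so nothing happens on a machine without smartmontools installed.
if command -v smartctl >/dev/null 2>&1; then
    for dev in "$@"; do
        if smartctl -x "$dev" 2>/dev/null | erc_supported; then
            echo "$dev: error recovery control supported"
        else
            echo "$dev: error recovery control not reported"
        fi
    done
fi
```

For a drive that does support it, smartctl -l scterc /dev/sdX prints the read and write timeouts currently configured.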


Cheers,
Curt

Cheers,
Wol


