Re: RAID showing all devices as spares after partial unplug

On Sat, Sep 17, 2011 at 9:16 PM, Jim Schatzman
<james.schatzman@xxxxxxxxxxxxxxxx> wrote:
> Mike-
>
> I have seen very similar problems. I regret that electronics engineers cannot design more secure connectors. eSATA connectors are terrible - they come loose at the slightest tug. For this reason, I am gradually abandoning eSATA enclosures and going to internal drives only. Fortunately, there are some inexpensive RAID chassis available now.
>
> I tried the same thing as you. I removed the array(s) from mdadm.conf and wrote a script for "/etc/cron.reboot" which assembles the array with --no-degraded. Doing this seems to minimize the damage caused by drives that went missing prior to a reboot. However, if the drives are disconnected while Linux is up, then either the array will stay up but some drives will become stale, or the array will be stopped. The behavior I usually see is that all the drives that went offline become "spare".
>

That sounds similar, although I only had 4/11 go offline and now
they're ALL spare.
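
For reference, the presence check in my startup script boils down to
something like the sketch below (rough outline only - the member list
here is written out by hand and device names can shift between boots,
so don't treat it as the exact script):

#!/bin/sh
# Refuse to assemble unless every expected member device is present,
# then assemble with --no-degraded so md won't start a partial array.
UUID=4fd7659f:12044eff:ba25240d:de22249d
MEMBERS="/dev/sdc1 /dev/sdd1 /dev/sdf1 /dev/sdh1 /dev/sdj1 /dev/sdk1"
MEMBERS="$MEMBERS /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/md1p1 /dev/md3p1"

for dev in $MEMBERS; do
    if [ ! -b "$dev" ]; then
        echo "$dev missing, not assembling md0" >&2
        exit 1
    fi
done

mdadm --assemble --no-degraded -u "$UUID" /dev/md0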

> It would be nice if md would just reassemble the array once all the drives come back online. Unfortunately, it doesn't. I would run mdadm -E against all the drives/partitions, verifying that the metadata all indicates that they are/were part of the expected array.

I ran mdadm -E and they all correctly appear as part of the array:

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep Role; done

/dev/sdc1
   Device Role : Active device 5
/dev/sdd1
   Device Role : Active device 4
/dev/sdf1
   Device Role : Active device 2
/dev/sdh1
   Device Role : Active device 0
/dev/sdj1
   Device Role : Active device 10
/dev/sdk1
   Device Role : Active device 7
/dev/sdl1
   Device Role : Active device 8
/dev/sdm1
   Device Role : Active device 9
/dev/sdn1
   Device Role : Active device 1
/dev/md1p1
   Device Role : Active device 3
/dev/md3p1
   Device Role : Active device 6

But they have varying event counts (although all pretty close together):

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep Event; done

/dev/sdc1
         Events : 1756743
/dev/sdd1
         Events : 1756743
/dev/sdf1
         Events : 1756737
/dev/sdh1
         Events : 1756737
/dev/sdj1
         Events : 1756743
/dev/sdk1
         Events : 1756743
/dev/sdl1
         Events : 1756743
/dev/sdm1
         Events : 1756743
/dev/sdn1
         Events : 1756743
/dev/md1p1
         Events : 1756737
/dev/md3p1
         Events : 1756740

And they don't agree on the overall status of the array. The ones that
never went down record it as missing 4 members, while the ones that
went down still record all 11 as active:

for d in /dev/sd[cdfhjklmn]1 /dev/md1p1 /dev/md3p1; do echo $d; mdadm -E $d | grep State; done

/dev/sdc1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdd1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdf1
          State : clean
   Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/sdh1
          State : clean
   Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/sdj1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdk1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdl1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdm1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/sdn1
          State : clean
   Array State : .A..AA.AAAA ('A' == active, '.' == missing)
/dev/md1p1
          State : clean
   Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
/dev/md3p1
          State : clean
   Array State : .A..AAAAAAA ('A' == active, '.' == missing)

So it seems like the array is intact overall; I just need to convince
it of that fact.

> At that point, you should be able to re-create the RAID. Be sure you list the drives in the correct order. Once the array is going again, mount the resulting partitions RO and verify that the data is OK before going RW.

Could you be more specific about how exactly I should re-create the
RAID? Should I just do --assemble --force?
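
I.e., stop the currently inactive md0 and force-assemble it, something
like this? (The mount at the end is just my plan for the read-only
check you suggested - device path and mount point are placeholders.)

mdadm --stop /dev/md0
mdadm --assemble --force -u 4fd7659f:12044eff:ba25240d:de22249d /dev/md0

# verify the data before allowing any writes
mount -o ro /dev/md0 /mnt/media

Or do you mean a full --create over the existing members? That part
makes me nervous.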

>
> Jim
>
> At 04:16 PM 9/17/2011, Mike Hartman wrote:
>>I should add that the mdadm command in question actually ends in
>>/dev/md0, not /dev/md3 (that's for another array). So the device name
>>for the array I'm seeing in mdstat DOES match the one in the assemble
>>command.
>>
>>On Sat, Sep 17, 2011 at 4:39 PM, Mike Hartman <mike@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> I have 11 drives in a RAID 6 array. 6 are plugged into one eSATA
>>> enclosure, the other 4 are in another. These eSATA cables are prone to
>>> loosening when I'm working on nearby hardware.
>>>
>>> If that happens and I start the host up, big chunks of the array are
>>> missing and things could get ugly. Thus I cooked up a custom startup
>>> script that verifies each device is present before starting the array
>>> with
>>>
>>> mdadm --assemble --no-degraded -u 4fd7659f:12044eff:ba25240d:de22249d /dev/md3
>>>
>>> So I thought I was covered. If something got unplugged, I would see
>>> the array fail to start at boot, and I could shut down, fix the
>>> cables and try again. However, I hit a new scenario today where one
>>> of the plugs was loosened while everything was turned on.
>>>
>>> The good news is that there should have been no activity on the array
>>> when this happened, particularly write activity. It's a big media
>>> partition and sees much less writing than reading. I'm also the only
>>> one that uses it and I know I wasn't transferring anything. The system
>>> also seems to have immediately marked the filesystem read-only,
>>> because I discovered the issue when I went to write to it later and
>>> got a "read-only filesystem" error. So I believe the state of the
>>> drives should be the same - nothing should be out of sync.
>>>
>>> However, I shut the system down, fixed the cables and brought it back
>>> up. All the devices are detected by my script and it tries to start
>>> the array with the command I posted above, but I've ended up with
>>> this:
>>>
>>> md0 : inactive sdn1[1](S) sdj1[9](S) sdm1[10](S) sdl1[11](S)
>>> sdk1[12](S) md3p1[8](S) sdc1[6](S) sdd1[5](S) md1p1[4](S) sdf1[3](S)
>>> sdh1[0](S)
>>>       16113893731 blocks super 1.2
>>>
>>> Instead of all coming back up, or still showing the unplugged drives
>>> missing, everything is a spare? I'm suitably disturbed.
>>>
>>> It seems to me that if the data on the drives still reflects the
>>> last-good data from the array (and since no writing was going on it
>>> should) then this is just a matter of some metadata getting messed up
>>> and it should be fixable. Can someone please walk me through the
>>> commands to do that?
>>>
>>> Mike
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

