On Tue, May 09 2017, Ram Ramesh wrote:

> Today, I noticed that my RAID6 md0 was assembled in a degraded state,
> with two drives marked as failed, after a pm-suspend and restart. Both
> of these drives are attached to the SAS9211-8I controller; the other
> drives are attached to the motherboard. I have not had this happen on a
> normal boot/reboot. Also, in this particular case a mythtv recording was
> in progress when I suspended, so md0 was in use as soon as the machine
> resumed.
>
> Upon inspection, it appears (I am not sure here) that mdadm assembled
> the array even before the drives were ready to be used. All I had to do
> was remove and re-add them to bring the array back to a "good" state. I
> am wondering if there is a way to tell mdadm to wait for all drives to
> be ready before assembling. Also, if there is something I can add to the
> resume scripts that will help, please let me know.
>
> Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
> mdadm - v3.2.5 - 18th May 2012
>
> The failed drives are an HGST NAS and a WD Gold with less than a year of
> use, so I doubt they are bad drives by any means.

This is a question that needs to be addressed by your distro. mdadm just
does what it is told to do by the init/udev/systemd scripts.

The preferred way for array startup to happen is that when udev discovers
a new device, "mdadm --incremental $DEV" is run, and mdadm includes the
device into an array as appropriate. mdadm will not normally activate the
array until all expected devices have appeared. After some timeout,
"mdadm -IRs" or "mdadm --run /dev/mdXX" can be run to start the array
even though it is degraded.

The udev-* scripts and systemd/* unit files provided with current
upstream mdadm do this, with a 30-second timeout. If a given distro
doesn't use these scripts, you need to take it up with them.

NeilBrown
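
For reference, a rough shell-level sketch of the flow described above.
The device names (/dev/sdx, /dev/md0) are placeholders, and the exact
udev/systemd wiring differs between distros and mdadm versions:

    # udev runs roughly this for each member device as it appears;
    # mdadm collects the members and does not start the array until
    # all expected devices are present:
    mdadm --incremental /dev/sdx

    # After a timeout (about 30 seconds with the upstream unit files),
    # either of these starts any array that is still only partially
    # assembled, i.e. runs it degraded:
    mdadm -IRs               # run all partially-assembled arrays
    mdadm --run /dev/md0     # or force-run one specific array

    # The manual recovery described in the original report amounts to:
    mdadm /dev/md0 --remove /dev/sdx
    mdadm /dev/md0 --re-add /dev/sdx   # quick if a write-intent bitmap is present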