On Tue, May 09 2017, Ram Ramesh wrote:

> Today, I noticed that my RAID6 md0 was assembled in a degraded state,
> with two drives marked as failed, after a pm-suspend and restart. Both
> of these drives are attached to the SAS9211-8I controller; the other
> drives are attached to the motherboard. I have not had this happen on a
> normal boot/reboot. Also, in this particular case a mythtv recording was
> in progress when I suspended, so md0 was in use as soon as the machine
> resumed.
>
> Upon inspection, it appears (I am not sure here) that mdadm assembled
> the array even before the drives were ready to be used. All I had to do
> was remove and re-add them to bring the array back to a "good" state. I
> am wondering if there is a way to tell mdadm to wait for all drives to
> be ready before assembling. Also, if there is something I can add to the
> resume scripts that will help, please let me know.
>
> Kernel: Linux zym 3.13.0-106-generic #153-Ubuntu SMP
> mdadm - v3.2.5 - 18th May 2012
>
> The failed drives are an HGST NAS and a WD Gold with less than a year of
> use, so I doubt they are bad drives by any means.

This is a question that needs to be addressed by your distro. mdadm just
does what it is told to do by the init/udev/systemd scripts.

The preferred way for array startup to happen is that when udev discovers
a new device, "mdadm --incremental $DEV" is run, and mdadm includes the
device into an array as appropriate. mdadm will not normally activate the
array until all expected devices have appeared. After some timeout,
"mdadm -IRs" or "mdadm --run /dev/mdXX" can be run to start the array
even though it is degraded.

The udev-* scripts and systemd/* unit files provided with current
upstream mdadm do this, with a 30-second timeout. If a given distro
doesn't use these scripts, you need to take it up with them.

NeilBrown
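
For reference, a rough shell-level sketch of the flow described above.
The device names (/dev/sdx, /dev/md0) are placeholders, and the exact
udev/systemd wiring differs between distros and mdadm versions:

    # udev runs roughly this for each member device as it appears;
    # mdadm collects the members and does not start the array until
    # all expected devices are present:
    mdadm --incremental /dev/sdx

    # After a timeout (about 30 seconds with the upstream unit files),
    # either of these starts any array that is still only partially
    # assembled, i.e. runs it degraded:
    mdadm -IRs               # run all partially-assembled arrays
    mdadm --run /dev/md0     # or force-run one specific array

    # The manual recovery described in the original report amounts to:
    mdadm /dev/md0 --remove /dev/sdx
    mdadm /dev/md0 --re-add /dev/sdx   # quick if a write-intent bitmap is present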