On 08/30/2018 10:32 AM, Niklas Hambüchen wrote:
Hi,
I'm taking this to the mailing list because I've failed to get a clear
answer to this so far.
There are multiple reports floating around the Internet where users
report that after an HD is removed, destroyed and/or replaced and the
system is then rebooted, raid5/raid1 configurations show up as raid0
in mdadm after the reboot.
1. https://unix.stackexchange.com/questions/257599/raid5-array-reassembles-as-raid0
2. https://superuser.com/questions/117824/how-to-get-an-inactive-raid-device-working-again/118277#118277
3. http://fibrevillage.com/storage/676-how-to-fix-linux-mdadm-inactive-array
There's lots of head-scratching and inconclusiveness in there; people
manually fix the situation and then move on without detailed investigation.
I have observed this now on one server too, and decided to test it on
my desktop where I have an mdadm RAID1 running:
Before the reboot, `mdadm --detail` tells me
Raid Level : raid1
and the usual output.
Then I power off, unplug the SATA cable of one of the disks, power on,
and get
Raid Level : raid0
instead.
`cat /proc/mdstat` shows me something like:
md0 : inactive sdc2[0](S)
9766302720 blocks super 1.2
in that case.
This is in contrast to what happens if I unplug an HD during operation
without a reboot, in which case I get the usual "degraded" [_U] output:
md0 : active (auto-read-only) raid1 sde[2]
2930135360 blocks super 1.2 [2/1] [_U]
(Versions used are Ubuntu 16.04, stock kernel 4.15.0-33-generic and
stock mdadm v3.3)
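For reference, the raw value behind mdadm's "Raid Level" line can be read
straight from the md device with a small probe like the sketch below
(/dev/md0 is a placeholder; on an inactive array the GET_ARRAY_INFO ioctl
may simply fail with ENODEV instead of reporting a level):

/*
 * Sketch only: print the raw level field returned by GET_ARRAY_INFO,
 * which is where mdadm --detail gets "Raid Level" from when the ioctl
 * succeeds.  /dev/md0 is a placeholder for the array device.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/major.h>
#include <linux/raid/md_u.h>

int main(void)
{
	mdu_array_info_t info;
	int fd = open("/dev/md0", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/md0");
		return 1;
	}
	if (ioctl(fd, GET_ARRAY_INFO, &info) < 0) {
		perror("GET_ARRAY_INFO");	/* inactive arrays may report ENODEV */
		close(fd);
		return 1;
	}
	/* 0 = raid0, 1 = raid1, 5 = raid5, ... */
	printf("level=%d raid_disks=%d active_disks=%d\n",
	       info.level, info.raid_disks, info.active_disks);
	close(fd);
	return 0;
}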
To my knowledge, the raid level being reported as raid0 is probably
caused by the code below in set_array_info:
memset(&inf, 0, sizeof(inf));
inf.major_version = info->array.major_version;
inf.minor_version = info->array.minor_version;
rv = md_set_array_info(mdfd, &inf);
And during the reboot stage mdadm only issues two ioctls
(SET_ARRAY_INFO and ADD_NEW_DISK) against the array.
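Roughly, I believe the assemble-time sequence looks like the sketch
below (not mdadm's actual code; the helper name, device path and
major/minor numbers are made up, and error handling is trimmed):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/major.h>
#include <linux/raid/md_u.h>

/* Sketch of the assemble-time ioctl sequence, not mdadm's real code. */
static int assemble_sketch(const char *md_path, int disk_major, int disk_minor)
{
	mdu_array_info_t inf;
	mdu_disk_info_t disk;
	int mdfd = open(md_path, O_RDWR);

	if (mdfd < 0)
		return -1;

	/* Step 1: SET_ARRAY_INFO with everything zeroed except the
	 * superblock version -- no level, no raid_disks. */
	memset(&inf, 0, sizeof(inf));
	inf.major_version = 1;
	inf.minor_version = 2;
	if (ioctl(mdfd, SET_ARRAY_INFO, &inf) < 0)
		goto fail;

	/* Step 2: ADD_NEW_DISK for each member device that is still present. */
	memset(&disk, 0, sizeof(disk));
	disk.major = disk_major;
	disk.minor = disk_minor;
	if (ioctl(mdfd, ADD_NEW_DISK, &disk) < 0)
		goto fail;

	/* Step 3 never happens in the failure case: no RUN_ARRAY ioctl,
	 * so do_md_run() is not reached, the superblocks are not analysed,
	 * and the array is left inactive with no personality (which mdadm
	 * --detail then appears to report as raid0). */
	close(mdfd);
	return 0;

fail:
	close(mdfd);
	return -1;
}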
Questions:
Is it expected that raid1 turns into raid0 in this way when an expected
device is not present during a reboot (e.g. because it was unplugged or
replaced)?
If yes, what is the idea behind that, and why doesn't it go into the
normal degraded mode instead?
Is it possible to achieve that, i.e. to come up degraded instead? I had
hoped to be able to continue booting into a degraded system if a disk
fails across a reboot (and then be notified of the degradation by mdadm
as usual), but that doesn't work if the array comes back as raid0 and
inactive after the reboot.
Finally, if these topics are already explained somewhere, where can I
read more about it?
Maybe we need to call do_md_run when assembling an array; this needs
more investigation. Something along these lines:
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 12e8fbee8fed..8516778ca650 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6757,6 +6757,7 @@ static int set_array_info(struct mddev *mddev, mdu_array_info_t *info)
 		/* ensure mddev_put doesn't delete this now that there
 		 * is some minimal configuration.
 		 */
 		mddev->ctime = ktime_get_real_seconds();
+		do_md_run(mddev);
 		return 0;
 	}
Thanks,
Guoqing