Re: raid1 becoming raid0 when device is removed before reboot

On 08/31/2018 05:18 PM, Guoqing Jiang wrote:


On 08/30/2018 10:32 AM, Niklas Hambüchen wrote:
Hi,

I'm taking this to the mailing list because I've failed to get a clear answer to this so far.

There are multiple reports floating around the Internet where users report that after an HD gets removed, destroyed and/or replaced and the system is then rebooted, raid5/raid1 configurations show up as raid0 in mdadm after the reboot.

1. https://unix.stackexchange.com/questions/257599/raid5-array-reassembles-as-raid0
2. https://superuser.com/questions/117824/how-to-get-an-inactive-raid-device-working-again/118277#118277
3. http://fibrevillage.com/storage/676-how-to-fix-linux-mdadm-inactive-array

There's lots of head-scratching and inconclusiveness in there; people manually fix the situation and then move on without detailed investigation.

I have observed this now on one server too, and decided to test it on my desktop where I have an mdadm RAID1 running:

Before the reboot, `mdadm --detail` tells me

     Raid Level : raid1

and the usual output.
Then I power off, unplug the SATA cable of one of the disks, power back on, and get

     Raid Level : raid0

instead.

`cat /proc/mdstat` shows me something like:

  md0 : inactive sdc2[0](S)
        9766302720 blocks super 1.2

in that case.

This is in contrast to what happens if I unplug an HD during operation without a reboot, in which case I get the usual "degraded" [_U] output:

  md0 : active (auto-read-only) raid1 sde[2]
        2930135360 blocks super 1.2 [2/1] [_U]

(Versions used are Ubuntu 16.04, stock kernel 4.15.0-33-generic and stock mdadm v3.3)


To my knowledge, the raid level being reported as raid0 is probably caused by the code below, where mdadm passes an almost empty mdu_array_info_t down to the kernel's set_array_info:

        memset(&inf, 0, sizeof(inf));
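        /* only the metadata version is filled in below; level, raid_disks, etc. stay zero */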
        inf.major_version = info->array.major_version;
        inf.minor_version = info->array.minor_version;
        rv = md_set_array_info(mdfd, &inf);

And during the reboot stage, mdadm only issues two ioctls (SET_ARRAY_INFO and ADD_NEW_DISK) against the array.
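
For illustration, a rough userspace sketch of that sequence (not actual mdadm code; the device paths, the v1.2 metadata version and the minimal error handling are simplified assumptions) could look like the following. The final RUN_ARRAY step is the one that never happens when a member device is missing at boot, which is why the array stays inactive:

/*
 * Rough sketch of the assembly ioctl sequence described above.
 * Not mdadm source: names and error handling are simplified.
 */
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>      /* major(), minor() */
#include <unistd.h>
#include <linux/major.h>        /* MD_MAJOR, used by the ioctl numbers */
#include <linux/raid/md_u.h>    /* SET_ARRAY_INFO, ADD_NEW_DISK, RUN_ARRAY */

static int assemble_sketch(const char *md_node, const char *member)
{
        int mdfd = open(md_node, O_RDWR);
        if (mdfd < 0)
                return -1;

        /* 1) SET_ARRAY_INFO with an almost empty mdu_array_info_t:
         *    only the metadata version is set, so the array is created
         *    inactive and its level field stays 0 (i.e. "raid0"). */
        mdu_array_info_t inf;
        memset(&inf, 0, sizeof(inf));
        inf.major_version = 1;
        inf.minor_version = 2;
        if (ioctl(mdfd, SET_ARRAY_INFO, &inf) < 0)
                goto out;

        /* 2) ADD_NEW_DISK for every member device that actually showed
         *    up; the kernel reads the superblock from each of them. */
        struct stat st;
        mdu_disk_info_t disk;
        if (stat(member, &st) < 0)
                goto out;
        memset(&disk, 0, sizeof(disk));
        disk.major = major(st.st_rdev);
        disk.minor = minor(st.st_rdev);
        if (ioctl(mdfd, ADD_NEW_DISK, &disk) < 0)
                goto out;

        /* 3) RUN_ARRAY actually starts the array (do_md_run in the
         *    kernel).  This is the step that is skipped during the
         *    reboot stage, leaving the array inactive. */
        if (ioctl(mdfd, RUN_ARRAY, NULL) < 0)
                goto out;

        close(mdfd);
        return 0;
out:
        close(mdfd);
        return -1;
}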


Questions:

1. Is it expected that raid1 turns into raid0 in this way when during a reboot an expected device is not present (e.g. because it is unplugged or was replaced)?
2. If yes, what is the idea behind that, and why doesn't it go into the normal degraded mode instead? Is it possible to achieve that? I had hoped that I would be able to continue booting into a degraded system if a disk fails during a reboot (and then be notified of the degradation by mdadm as usual), but this isn't the case if an array comes back as raid0 and inactive after reboot.
3. Finally, if these topics are already explained somewhere, where can I read more about it?

Maybe we need to call do_md_run when assembling an array; I need to investigate it.

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 12e8fbee8fed..8516778ca650 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6757,6 +6757,7 @@ static int set_array_info(struct mddev *mddev, mdu_array_info_t *info)
                 * is some minimal configuration.
                 */
                mddev->ctime         = ktime_get_real_seconds();
+               do_md_run(mddev);
                return 0;
        }

It doesn't work; the array can, however, be activated with "echo active > /sys/block/md0/md/array_state".
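
For reference, the programmatic equivalent of that sysfs write (a trivial sketch, assuming the array is md0) would be:

/* Trivial sketch: same effect as
 * "echo active > /sys/block/md0/md/array_state". */
#include <fcntl.h>
#include <unistd.h>

static int activate_md0(void)
{
        int fd = open("/sys/block/md0/md/array_state", O_WRONLY);
        if (fd < 0)
                return -1;
        ssize_t n = write(fd, "active", 6);
        close(fd);
        return n == 6 ? 0 : -1;
}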

Thanks,
Guoqing


