Re: [PATCH v2] md: don't let spare disk become the fresh disk in analyze_sbs()

> On Sep 10, 2019, at 12:51 PM, Yufen Yu <yuyufen@xxxxxxxxxx> wrote:
> 
> We have a test case as follows:
> 
>  mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
>  mdadm -S /dev/md1
>  mdadm -A /dev/md1 /dev/sd[b-c] --run --force
> 
>  mdadm --zero /dev/sda
>  mdadm /dev/md1 -a /dev/sda
> 
>  echo offline > /sys/block/sdc/device/state
>  echo offline > /sys/block/sdb/device/state
>  sleep 5
>  mdadm -S /dev/md1
> 
>  echo running > /sys/block/sdb/device/state
>  echo running > /sys/block/sdc/device/state
>  mdadm -A /dev/md1 /dev/sd[a-c] --run --force
> 
> When /dev/sda is re-added to the array, recovery starts. Once the
> other two disks in md1 are taken offline, the recovery is
> interrupted and superblock updates can no longer be written to the
> offline disks, while the spare disk (/dev/sda) continues to receive
> superblock updates.
> 
> After stopping the array and reassembling it, the array fails to
> run, with the following kernel messages:
> 
> [  172.986064] md: kicking non-fresh sdb from array!
> [  173.004210] md: kicking non-fresh sdc from array!
> [  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
> [  173.022406] md1: failed to create bitmap (-5)
> [  173.023466] md: md1 stopped.
> 
> Since both sdb and sdc have a smaller 'sb->events' value than sda,
> they are kicked from the array. However, the only remaining disk,
> sda, was in the 'spare' state before the stop and cannot be added to
> the conf->mirrors[] array. As a result, the array fails to assemble
> and run.
> 
> In fact, we can use the older disk sdb or sdc to assemble the array.
> That means we should not choose the 'spare' disk as the fresh disk in
> analyze_sbs().
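
For readers following along: the analyze_sbs() hunk itself is not quoted in this reply, so the following is only a user-space sketch of the selection rule the description argues for, not the patch's code. It picks the disk with the highest event count while skipping spares. struct disk_info, pick_fresh_disk() and the event counts in demo_scenario() are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few per-disk facts this sketch needs. */
struct disk_info {
	const char *name;
	uint64_t events;	/* superblock event counter */
	int is_spare;		/* nonzero if the disk held no active role */
};

/*
 * Return the index of the non-spare disk with the highest event count,
 * or -1 if every disk is a spare. Ties keep the earliest disk.
 */
int pick_fresh_disk(const struct disk_info *disks, size_t n)
{
	int fresh = -1;
	size_t i;

	assert(disks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (disks[i].is_spare)
			continue;	/* a spare must never become "fresh" */
		if (fresh < 0 || disks[i].events > disks[fresh].events)
			fresh = (int)i;
	}
	return fresh;
}

/* The test case above, with made-up event counts: sda is ahead but spare. */
int demo_scenario(void)
{
	const struct disk_info disks[] = {
		{ "sda", 20, 1 },
		{ "sdb", 10, 0 },
		{ "sdc", 10, 0 },
	};

	return pick_fresh_disk(disks, sizeof(disks) / sizeof(disks[0]));
}
```

With these assumed values the sketch selects sdb, so sdb and sdc would no longer be kicked as non-fresh relative to the spare.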
> 
> Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
> ---
> drivers/md/md.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 24638ccedce4..5a566750afc1 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -3571,18 +3571,56 @@ static struct md_rdev *md_import_device(dev_t newdev, int super_format, int supe
> 	return ERR_PTR(err);
> }
> 
> +static int disk_is_spare(struct mddev *mddev, struct md_rdev *rdev)
> +{
> +	int err;
> +
> +	err = super_types[mddev->major_version].
> +			load_super(rdev, NULL, mddev->minor_version);
> +	if (err < 0)
> +		return err;
> +
> +	if (mddev->major_version == 0) {
> +		mdp_super_t *sb;
> +		sb = page_address(rdev->sb_page);
> +
> +		if (sb->disks[rdev->desc_nr].state &
> +			((1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
> +			return 0;
> +
> +	} else if (mddev->major_version == 1){
> +		struct mdp_superblock_1 *sb;
> +		sb = page_address(rdev->sb_page);
> +
> +		if (rdev->desc_nr >= 0 &&
> +			rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
> +			(le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
> +			le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
> +			return 0;
> +	}
> +
> +	return 1;
> +}

We should add "disk_is_spare" to struct super_type. 
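
A rough user-space sketch of what that could look like, so that each superblock format supplies its own check and the caller no longer branches on major_version. This is an illustration under assumptions, not a proposed kernel change: struct fake_rdev and struct fake_super_type are hypothetical stand-ins for md_rdev and super_type, the superblock fields are flattened into plain integers already in CPU byte order, and the constants mirror what I believe md_p.h defines, so please check them against the header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror include/uapi/linux/raid/md_p.h; verify before use. */
#define DISK_ACTIVE	1	/* MD_DISK_ACTIVE bit number */
#define DISK_SYNC	2	/* MD_DISK_SYNC bit number */
#define ROLE_MAX	0xff00u	/* MD_DISK_ROLE_MAX */
#define ROLE_JOURNAL	0xfffdu	/* MD_DISK_ROLE_JOURNAL */

/* Hypothetical flattened view of the superblock fields the checks read. */
struct fake_rdev {
	int desc_nr;
	uint32_t state_v0;	/* 0.90: sb->disks[desc_nr].state */
	uint16_t role_v1;	/* 1.x: sb->dev_roles[desc_nr] */
	uint32_t max_dev_v1;	/* 1.x: sb->max_dev */
};

/* Hypothetical per-format ops table with the new callback. */
struct fake_super_type {
	int (*disk_is_spare)(const struct fake_rdev *rdev);
};

static int v0_disk_is_spare(const struct fake_rdev *rdev)
{
	/* In sync or active means the disk held a data role. */
	return !(rdev->state_v0 & ((1u << DISK_SYNC) | (1u << DISK_ACTIVE)));
}

static int v1_disk_is_spare(const struct fake_rdev *rdev)
{
	if (rdev->desc_nr >= 0 &&
	    (uint32_t)rdev->desc_nr < rdev->max_dev_v1 &&
	    (rdev->role_v1 < ROLE_MAX || rdev->role_v1 == ROLE_JOURNAL))
		return 0;
	return 1;
}

static const struct fake_super_type fake_super_types[] = {
	{ v0_disk_is_spare },	/* major_version 0 */
	{ v1_disk_is_spare },	/* major_version 1 */
};

/* The caller dispatches through the table instead of using if/else. */
int disk_is_spare(int major_version, const struct fake_rdev *rdev)
{
	assert(major_version >= 0 && major_version < 2);
	return fake_super_types[major_version].disk_is_spare(rdev);
}
```

The two checks are the same conditions as in the quoted hunk, only moved behind a function pointer. The sketch leaves out the load_super() call and its error return.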

Thanks,
Song


