Re: [PATCH v3] md: no longer compare spare disk superblock events in super_load

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Oct 15, 2019, at 8:48 PM, Yufen Yu <yuyufen@xxxxxxxxxx> wrote:
> 
>>>  drivers/md/md.c | 51 +++++++++++++++++++++++++++++--------------------
>>>  1 file changed, 30 insertions(+), 21 deletions(-)
>>> 
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index 1be7abeb24fd..1be1deca3e3a 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>>>  {
>>>         char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
>>>         mdp_super_t *sb;
>>> -       int ret;
>>> +       int ret = 0;
>>> 
>>>         /*
>>>          * Calculate the position of the superblock (512byte sectors),
>>> @@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>>>         if (ret)
>>>                 return ret;
>>> 
>>> -       ret = -EINVAL;
>>> -
>> I think ret is handled correctly in existing code. I would not recommend this
>> type of refactoring.
> 
> Yes, there is no problem for existing code. But, after adding follow test:
> 
> +               if (sb->disks[rdev->desc_nr].state & (
> +                       (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
> +                       if (ev1 > ev2)
> +                               ret = 1;
> 
> 
> If the current disk is spare or events is smaller, then ret will be set as default '-EINVAL',
> which is not expected. Just as super_1_load(), I refactor the return.
> 
> If you don't like it, I can modify the test as following without refactoring:
> 
> +               if (sb->disks[rdev->desc_nr].state & (
> +                       (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)) &&
> +                       (ev1 > ev2))
>                        ret = 1;
>                else
>                        ret = 0;

I like this better. 

> 
>> 
>>>         bdevname(rdev->bdev, b);
>>>         sb = page_address(rdev->sb_page);
>>> 
>>>         if (sb->md_magic != MD_SB_MAGIC) {
>>>                 pr_warn("md: invalid raid superblock magic on %s\n", b);
>>> -               goto abort;
>>> +               return -EINVAL;
>>>         }
>>> 
>>>         if (sb->major_version != 0 ||
>>> @@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>>>             sb->minor_version > 91) {
>>>                 pr_warn("Bad version number %d.%d on %s\n",
>>>                         sb->major_version, sb->minor_version, b);
>>> -               goto abort;
>>> +               return -EINVAL;
>>>         }
>>> 
>>>         if (sb->raid_disks <= 0)
>>> -               goto abort;
>>> +               return -EINVAL;
>>> 
>>>         if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) {
>>>                 pr_warn("md: invalid superblock checksum on %s\n", b);
>>> -               goto abort;
>>> +               return -EINVAL;
>>>         }
>>> 
>>>         rdev->preferred_minor = sb->md_minor;
>>> @@ -1156,19 +1154,24 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>>>                 if (!md_uuid_equal(refsb, sb)) {
>>>                         pr_warn("md: %s has different UUID to %s\n",
>>>                                 b, bdevname(refdev->bdev,b2));
>>> -                       goto abort;
>>> +                       return -EINVAL;
>>>                 }
>>>                 if (!md_sb_equal(refsb, sb)) {
>>>                         pr_warn("md: %s has same UUID but different superblock to %s\n",
>>>                                 b, bdevname(refdev->bdev, b2));
>>> -                       goto abort;
>>> +                       return -EINVAL;
>>>                 }
>>>                 ev1 = md_event(sb);
>>>                 ev2 = md_event(refsb);
>>> -               if (ev1 > ev2)
>>> -                       ret = 1;
>>> -               else
>>> -                       ret = 0;
>>> +
>>> +               /*
>>> +                * Insist on good event counter while assembling, except
>>> +                * for spares (which don't need an event count)
>>> +                */
>>> +               if (sb->disks[rdev->desc_nr].state & (
>>> +                       (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
>>> +                       if (ev1 > ev2)
>>> +                               ret = 1;
>> Just realized this:
>> 
>> If the first device being passed to load_super is a spare device, we still
>> have the same problem, no?
> 
> Good catch!
> 
> My local test with one spare disk and one normal disk shows that,
> no matter what order of two disks in 'mdadm -A', spare disk cannot
> be the first device in mddev->disks list. I think mdadm tool have
> ordered all disks by events. So, test case will be OK.
> 
> But, I think we need to resolve the problem completely.

Yes, let's resolve this in kernel. 

Thanks,
Song



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux