Re: [PATCH] md: no longer compare spare disk superblock events in super_load

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 12, 2019 at 12:04 AM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
>
> On Mon, Sep 2, 2019 at 8:04 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote:
> >
> > We have a test case as follow:
> >
> >   mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
> >   mdadm -S /dev/md1
> >   mdadm -A /dev/md1 /dev/sd[b-c] --run --force
> >
> >   mdadm --zero /dev/sda
> >   mdadm /dev/md1 -a /dev/sda
> >
> >   echo offline > /sys/block/sdc/device/state
> >   echo offline > /sys/block/sdb/device/state
> >   sleep 5
> >   mdadm -S /dev/md1
> >
> >   echo running > /sys/block/sdb/device/state
> >   echo running > /sys/block/sdc/device/state
> >   mdadm -A /dev/md1 /dev/sd[a-c] --run --force
> >
> > When we readd /dev/sda to the array, it started to do recovery.
> > After offline the other two disks in md1, the recovery have
> > been interrupted and superblock update info cannot be written
> > to the offline disks. While the spare disk (/dev/sda) can continue
> > to update superblock info.
> >
> > After stopping the array and assemble it, we found the array
> > run fail, with the follow kernel message:
> >
> > [  172.986064] md: kicking non-fresh sdb from array!
> > [  173.004210] md: kicking non-fresh sdc from array!
> > [  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
> > [  173.022406] md1: failed to create bitmap (-5)
> > [  173.023466] md: md1 stopped.
> >
> > Since both sdb and sdc have the value of 'sb->events' smaller than
> > that in sda, they have been kicked from the array. However, the only
> > remained disk sda is in 'spare' state before stop and it cannot be
> > added to conf->mirrors[] array. In the end, raid array assemble and run fail.
> >
> > In fact, we can use the older disk sdb or sdc to assemble the array.
> > That means we should not choose the 'spare' disk as the fresh disk in
> > analyze_sbs().
> >
> > To fix the problem, we do not compare superblock events when it is
> > a spare disk, as same as validate_super.
> >
> > Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
> > ---
> >  drivers/md/md.c | 27 +++++++++++++++++----------
> >  1 file changed, 17 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 24638ccedce4..350e1f152e97 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -1092,7 +1092,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
> >  {
> >         char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
> >         mdp_super_t *sb;
> > -       int ret;
> > +       int ret = 0;
> >
> >         /*
> >          * Calculate the position of the superblock (512byte sectors),
> > @@ -1160,10 +1160,13 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
> >                 }
> >                 ev1 = md_event(sb);
> >                 ev2 = md_event(refsb);
> > -               if (ev1 > ev2)
> > -                       ret = 1;
> > -               else
> > -                       ret = 0;
> > +
> > +               /* Insist on good event counter while assembling, except
> > +                * for spares (which don't need an event count) */
> > +               if (sb->disks[rdev->desc_nr].state & (
> > +                       (1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
>
> Can we just check 1<<MD_DISK_ACTIVE? I think MD_DISK_SYNC is not
> necessary.
>
> > +                       if (ev1 > ev2)
> > +                               ret = 1;
> >         }
> >         rdev->sectors = rdev->sb_start;
> >         /* Limit to 4TB as metadata cannot record more than that.
> > @@ -1513,7 +1516,7 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb)
> >  static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version)
> >  {
> >         struct mdp_superblock_1 *sb;
> > -       int ret;
> > +       int ret = 0;
> >         sector_t sb_start;
> >         sector_t sectors;
> >         char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
> > @@ -1665,10 +1668,14 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
> >                 ev1 = le64_to_cpu(sb->events);
> >                 ev2 = le64_to_cpu(refsb->events);
> >
> > -               if (ev1 > ev2)
> > -                       ret = 1;
> > -               else
> > -                       ret = 0;
> > +               /* Insist of good event counter while assembling, except for
> > +                * spares (which don't need an event count) */
> > +               if (rdev->desc_nr >= 0 &&
> > +                   rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
> > +                   (le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
> > +                    le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
>
> Can we use MD_DISK_ROLE_SPARE in this check?

Actually, this check is good. No need to fix. Sorry for the confusion.

Thanks,
Song



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux