Re: [PATCH] md: Fix bug where new drives added to an md array sometimes don't sync properly.

This patch has resolved the immediate issue I was having on 2.6.18 with RAID10. Prior to this change, after removing a device from the array (with mdadm --remove), physically pulling the drive, swapping it, and re-inserting it, the "Number" of the new device would be incremented past the highest number present in the array. Now it resumes its previous place.

Does this look like 'correct' output for a 14-drive array from which dev 8 was failed/removed and then "add"ed? I'm trying to determine why the device doesn't get pulled back into the active configuration and re-synced. Any comments?
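
(For reference, whether the re-added disk is actually taken for resync can
be checked like this; /dev/md0 below is just a placeholder for the real
array device:)

    cat /proc/mdstat          # a "recovery =" progress line appears once the spare is taken
    mdadm --detail /dev/md0   # the disk should move from "spare" to "spare rebuilding"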

Thanks!

/eli

For example, currently after device dm-8 is removed and then added back, it shows up like this:



    Number   Major   Minor   RaidDevice State
       0     253        0        0      active sync   /dev/dm-0
       1     253        1        1      active sync   /dev/dm-1
       2     253        2        2      active sync   /dev/dm-2
       3     253        3        3      active sync   /dev/dm-3
       4     253        4        4      active sync   /dev/dm-4
       5     253        5        5      active sync   /dev/dm-5
       6     253        6        6      active sync   /dev/dm-6
       7     253        7        7      active sync   /dev/dm-7
       8       0        0        8      removed
       9     253        9        9      active sync   /dev/dm-9
      10     253       10       10      active sync   /dev/dm-10
      11     253       11       11      active sync   /dev/dm-11
      12     253       12       12      active sync   /dev/dm-12
      13     253       13       13      active sync   /dev/dm-13

       8     253        8        -      spare   /dev/dm-8


Previously, however, it would come back with the "Number" as 14 rather than 8 as it should. Shortly thereafter things got all out of whack, in addition to just not working properly :) Now I just have to figure out how to get the re-introduced drive to participate in the array again like it should.
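
For completeness, the swap sequence described above is roughly the
following (again using /dev/md0 as a stand-in for the real array device):

    mdadm /dev/md0 --fail /dev/dm-8      # mark the outgoing disk faulty
    mdadm /dev/md0 --remove /dev/dm-8    # remove it from the array
    # ...physically swap the drive...
    mdadm /dev/md0 --add /dev/dm-8       # add the replacement ('--re-add' is the variant mentioned below)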

Eli Stair wrote:


I'm actually seeing similar behaviour on RAID10 (2.6.18): after
removing a drive from an array, re-adding it sometimes results in it
still being listed as a faulty spare and not being "taken" for resync.
In the same scenario, after swapping drives, doing a fail, remove,
then an 'add' doesn't work; only a re-add will even get the drive
listed by mdadm.


What are the failure modes/symptoms that this patch is resolving?

Is it possible this affects the RAID10 module/mode as well?  If not,
I'll start a new thread for that.  I'm testing this patch to see if it
does remedy the situation on RAID10, and will update after some
significant testing.


/eli

NeilBrown wrote:
 > There is a nasty bug in md in 2.6.18 affecting at least raid1.
 > This fixes it (and has already been sent to stable@xxxxxxxxxx).
 >
 > ### Comments for Changeset
 >
 > This fixes a bug introduced in 2.6.18.
 >
 > If a drive is added to a raid1 using older tools (mdadm-1.x or
 > raidtools) then it will be included in the array without any resync
 > happening.
 >
 > It has been submitted for 2.6.18.1.
 >
 >
 > Signed-off-by: Neil Brown <neilb@xxxxxxx>
 >
 > ### Diffstat output
 >  ./drivers/md/md.c |    1 +
 >  1 file changed, 1 insertion(+)
 >
 > diff .prev/drivers/md/md.c ./drivers/md/md.c
 > --- .prev/drivers/md/md.c       2006-09-29 11:51:39.000000000 +1000
 > +++ ./drivers/md/md.c   2006-10-05 16:40:51.000000000 +1000
 > @@ -3849,6 +3849,7 @@ static int hot_add_disk(mddev_t * mddev,
 >         }
 >         clear_bit(In_sync, &rdev->flags);
 >         rdev->desc_nr = -1;
 > +       rdev->saved_raid_disk = -1;
 >         err = bind_rdev_to_array(rdev, mddev);
 >         if (err)
 >                 goto abort_export;


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
