I have what I think is a closely related issue, but since this thread is
pretty long already, I will start with a brief description to see if this
should move to a new thread.

I am seeing a repeatable case where a drive is added to a raid set and then
almost immediately rejected.

- linux 3.2.15 from kernel.org
- mdadm 3.2.5, but I saw this with 3.2 as well
- 8 drives, raid 6

Sequence:

Remove the device
- md created and fully synced (/dev/md1 in my case)
- no IO
- pull a drive
- issue a dd read to /dev/md1 to get it to recognize the missing device
- mdadm /dev/md1 -r /dev/sdc2 (for example) to remove the device from the md

Add the device back in
- insert the drive
- mdadm --zero-superblock /dev/sdc2 (or sd<x>, whatever it came back in as)
- mdadm /dev/md1 -a /dev/sdc2

I see the following about 50% of the time:

[ 1813.355806] md: bind<sdc2>
[ 1813.367430] md: recovery of RAID array md1
[ 1813.371570] md: minimum _guaranteed_ speed: 80000 KB/sec/disk.
[ 1813.377481] md: using maximum available idle IO bandwidth (but not more than 800000 KB/sec) for recovery.
[ 1813.387039] md: using 128k window, over a total of 62467072k.
[ 1813.392789] md/raid:md1: Disk failure on sdc2, disabling device.
[ 1813.392791] md/raid:md1: Operation continuing on 7 devices.
[ 1813.404346] md: md1: recovery done.

I recently found that after this series of events, if I once again remove
the drive from the raid set (mdadm --remove) and do a read of the md device
via dd, I can then successfully add the device back to the md with the
--zero-superblock command followed by the mdadm --add command. That 'trick'
has worked the first three times I've tried it, so not a lot of data points,
but interesting nonetheless.

Finally, I applied the patch referenced earlier in this thread and it did
not affect the behavior I've described above.

This has happened on different systems and with different drives; i.e. it
is not a drive-failure issue.

Thoughts?

Thanks.
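
For reference, here is the same sequence consolidated in one place (device
names as in my setup above; the dd size is arbitrary, just enough to make md
notice the missing device, and the dmesg/mdstat lines are only how I watch
the result):

  # force md to notice the pulled drive, then remove it from the array
  dd if=/dev/md1 of=/dev/null bs=1M count=64
  mdadm /dev/md1 --remove /dev/sdc2

  # after re-inserting the drive (it may come back under a different name)
  mdadm --zero-superblock /dev/sdc2
  mdadm /dev/md1 --add /dev/sdc2

  # about half the time the device is bound, recovery starts, and the disk
  # is failed again almost immediately
  dmesg | tail -n 20
  cat /proc/mdstat
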
On Mon, Jul 2, 2012 at 1:57 AM, NeilBrown <neilb@xxxxxxx> wrote:
>
> On Wed, 27 Jun 2012 19:40:52 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> wrote:
>
> > Hi Neil,
> >
> > >> I would still think that there is value in recording in a superblock
> > >> that a drive is recovering.
> > >
> > > Probably. It is a bit unfortunate that if you stop an array that is
> > > recovering after a --re-add, you cannot simply 'assemble' it again and
> > > get it back to the same state.
> > > I'll think more on that.
> >
> > As I mentioned, I see the additional re-add as a minor thing, but I
> > agree it's better to fix it. The fact that we don't know that the
> > drive is being recovered bothers me more, because the user might look
> > at the superblock and assume the data on the drive is consistent to
> > some point in time (the time of the drive failure), while the actual
> > data, during a bitmap-based recovery, is unusable until the recovery
> > completes successfully. So the user might think it's okay to try to
> > run his app on this drive.
>
> Yes, please think about this.
>
> > > Meanwhile, this patch might address your other problem. It allows
> > > --re-add to work if a non-bitmap rebuild fails and the drive is then
> > > re-added.
> > >
> > > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > > index c601c4b..d31852e 100644
> > > --- a/drivers/md/md.c
> > > +++ b/drivers/md/md.c
> > > @@ -5784,7 +5784,7 @@ static int add_new_disk(struct mddev * mddev, mdu_disk_info_t *info)
> > >                 super_types[mddev->major_version].
> > >                         validate_super(mddev, rdev);
> > >                 if ((info->state & (1<<MD_DISK_SYNC)) &&
> > > -                    (!test_bit(In_sync, &rdev->flags) ||
> > > +                    (test_bit(Faulty, &rdev->flags) ||
> > >                      rdev->raid_disk != info->raid_disk)) {
> > >                         /* This was a hot-add request, but events
> > >                          * doesn't match, so reject it.
> >
> > I have tested a slightly different patch that you suggested earlier -
> > just removing the !test_bit(In_sync, &rdev->flags) check. I confirm
> > that it solves the problem.
> >
> > The Faulty bit check seems redundant to me, because:
> > - it can be set only by validate_super(), and only if that drive's
> >   role is 0xfffe in the sb->roles[] array
> > - a long time ago I asked you how it can happen that a drive thinks
> >   about *itself* that it is Faulty (has 0xfffe for its role in its own
> >   superblock), and you said this should never happen.
>
> Yes, you are right. I've removed the test on 'Faulty' - the test on the
> raid_disk number is sufficient.
>
> > Anyways, I also tested the patch you suggested, and it also works.
> > Thanks!
> >
> > Is there any chance to see this fix in ubuntu-precise?
>
> Not really up to me. It doesn't fix a crash or corruption, so I'm not sure
> it is -stable material ... though maybe it is if it fixes a regression.
>
> I suggest you ask the Ubuntu kernel people after it appears in 3.5-rc
> (hopefully later this week).
>
> Thanks,
> NeilBrown
>
> > Thanks again for your support,
> > Alex.
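
P.S. To make sure I understand the case the patch above is aimed at ("a
non-bitmap rebuild fails and the drive is then re-added"), I think it looks
roughly like this from userspace. This is only my reading of Neil's
description, using my device names; I have not run this exact sequence:

  # start a full (non-bitmap) rebuild of a previously removed member
  mdadm /dev/md1 --add /dev/sdc2

  # ... the rebuild fails part-way (e.g. the drive is kicked out again) ...

  # pull the failed member out and try to put it straight back
  mdadm /dev/md1 --remove /dev/sdc2
  mdadm /dev/md1 --re-add /dev/sdc2   # my understanding: rejected before
                                      # the patch, accepted with it

  cat /proc/mdstat                    # recovery should restart
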