On Mon, 06 Feb 2012 18:07:38 +0100 Asdo <asdo@xxxxxxxxxxxxx> wrote:

> On 02/02/12 23:58, Asdo wrote:
> >
> >>> Now it doesn't happen:
> >>> When I reinserted the disk, udev triggered the --incremental, to
> >>> reinsert the device, but mdadm refused to do anything because the old
> >>> slot was still occupied with a failed+detached device. I manually
> >>> removed the device from the raid then I ran --incremental, but mdadm
> >>> still refused to re-add the device to the RAID because the array was
> >>> running. I think that if it is a re-add, and especially if the
> >>> bitmap is
> >>> active, I can't think of a situation in which the user would *not* want
> >>> to do an incremental re-add even if the array is running.
> >> Hmmm.. that doesn't seem right. What version of mdadm are you running?
> >
> > 3.1.4
> >
> >> Maybe a newer one would get this right.
> > I need to try...
> > I think I need that.
>
> Hi Neil,
>
> Still some problems on mdadm 3.2.2 (from Ubuntu Precise) apparently:
>
> Problem #1:
>
> # mdadm -If /dev/sda4
> mdadm: incremental removal requires a kernel device name, not a file:
> /dev/sda4
>
> however this works:
>
> # mdadm -If sda4
> mdadm: set sda4 faulty in md3
> mdadm: hot removed sda4 from md3
>
> Is this by design?

Yes.

> Would your udev rule
> ACTION=="remove", RUN+="/sbin/mdadm -If $name"
> trigger the first or the second kind of invocation?

Yes.

>
>
> Problem #2:
>
> by reinserting sda, it became sdax, and the array is still running like
> this:
>
> md3 : active raid1 sdb4[2]
>       10485688 blocks super 1.0 [2/1] [_U]
>       bitmap: 0/160 pages [0KB], 32KB chunk
>
> please note the bitmap is active

True, but there is nothing in it (0 pages). That implies that no bits are
set. I guess that is possible if nothing has been written to the array since
the other device was removed.
>
> so now I'm trying auto hot-add:
>
> # mdadm -I /dev/sdax4
> mdadm: not adding /dev/sdax4 to active array (without --run) /dev/md3
>
> still the old problem I mentioned with 3.1.4.

I need to see -E and -X output on both drives to be able to see what is
happening here. Also the content of /etc/mdadm.conf might be relevant.
If you could supply that info I might be able to explain what is happening.

> Trying more ways: (even with the "--run" which is suggested)
>
> # mdadm --run -I /dev/sdax4
> mdadm: -I would set mdadm mode to "incremental", but it is already set
> to "misc".
>
> # mdadm -I --run /dev/sdax4
> mdadm: failed to add /dev/sdax4 to /dev/md3: Invalid argument.
>

Hmm... I'm able to reproduce something like this.

Following patch seems to fix it, but I need to check the code more
thoroughly to be sure.
Note that this will *not* fix the "not adding ... to active array" problem.

NeilBrown

diff --git a/Incremental.c b/Incremental.c
index 60175af..2be0d05 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -415,19 +415,19 @@ int Incremental(char *devname, int verbose, int runstop,
 				goto out_unlock;
 			}
 		}
-		info2.disk.major = major(stb.st_rdev);
-		info2.disk.minor = minor(stb.st_rdev);
+		info.disk.major = major(stb.st_rdev);
+		info.disk.minor = minor(stb.st_rdev);
 		/* add disk needs to know about containers */
 		if (st->ss->external)
 			sra->array.level = LEVEL_CONTAINER;
-		err = add_disk(mdfd, st, sra, &info2);
+		err = add_disk(mdfd, st, sra, &info);
 		if (err < 0 && errno == EBUSY) {
 			/* could be another device present with the same
 			 * disk.number. Find and reject any such
 			 */
 			find_reject(mdfd, st, sra, info.disk.number,
 				    info.events, verbose, chosen_name);
-			err = add_disk(mdfd, st, sra, &info2);
+			err = add_disk(mdfd, st, sra, &info);
 		}
 		if (err < 0) {
 			fprintf(stderr, Name ": failed to add %s to %s: %s.\n",