Re: Some md/mdadm bugs

On Mon, 06 Feb 2012 18:07:38 +0100 Asdo <asdo@xxxxxxxxxxxxx> wrote:

> On 02/02/12 23:58, Asdo wrote:
> >
> >>> Now it doesn't happen:
> >>> When I reinserted the disk, udev triggered the --incremental, to
> >>> reinsert the device, but mdadm refused to do anything because the old
> >>> slot was still occupied with a failed+detached device. I manually
> >>> removed the device from the raid then I ran --incremental, but mdadm
> >>> still refused to re-add the device to the RAID because the array was
> >>> running. I think that if it is a re-add, and especially if the bitmap is
> >>> active, I can't think of a situation in which the user would *not* want
> >>> to do an incremental re-add even if the array is running.
> >> Hmmm.. that doesn't seem right.  What version of mdadm are you running?
> >
> > 3.1.4
> >
> >> Maybe a newer one would get this right.
> > I need to try...
> > I think I need that.
> 
> Hi Neil,
> 
> Still some problems on mdadm 3.2.2 (from Ubuntu Precise) apparently:
> 
> Problem #1:
> 
> # mdadm -If /dev/sda4
> mdadm: incremental removal requires a kernel device name, not a file: /dev/sda4
> 
> however this works:
> 
> # mdadm -If sda4
> mdadm: set sda4 faulty in md3
> mdadm: hot removed sda4 from md3
> 
> Is this by design?

Yes.

>                     Would your udev rule
> ACTION=="remove", RUN+="/sbin/mdadm -If $name"
> trigger the first or the second kind of invocation?

Yes.
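
For reference, the two invocations in Problem #1 differ only in the form of
the name: a /dev path in the first, a bare kernel name in the second. The
sketch below converts one into the other. kernel_name is a hypothetical
helper written for this illustration, not part of mdadm, and it assumes the
last path component is already the kernel name (it does not resolve symlinks
such as those under /dev/disk/by-id).

```shell
# Hypothetical helper (not part of mdadm): reduce a device path to the
# bare kernel name that "mdadm -If" accepted in the report above.
kernel_name() {
    # Strip everything up to and including the last "/".
    printf '%s\n' "${1##*/}"
}

# Example: prints "sda4"
kernel_name /dev/sda4
```

With that, mdadm -If "$(kernel_name /dev/sda4)" has the same form as the
invocation that worked.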

> 
> 
> Problem #2:
> 
> by reinserting sda, it became sdax, and the array is still running like this:
> 
> md3 : active raid1 sdb4[2]
>        10485688 blocks super 1.0 [2/1] [_U]
>        bitmap: 0/160 pages [0KB], 32KB chunk
> 
> please note the bitmap is active

True, but there is nothing in it (0 pages).  That implies that no bits are
set.  I guess that is possible if nothing has been written to the array since
the other device was removed.
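
To make the "0 pages" reading concrete, here is a minimal sketch that pulls
the first number out of a /proc/mdstat bitmap line. bitmap_pages_used is a
hypothetical name chosen for this example, and the pattern assumes the line
layout shown in the report above. It only reports the page counter; mdadm -X
on a member device is the more direct way to look at the bitmap itself.

```shell
# Hypothetical helper: read mdstat-style text on stdin and print the
# number of bitmap pages in use (the "N" in "bitmap: N/M pages").
bitmap_pages_used() {
    sed -n 's|^ *bitmap: \([0-9][0-9]*\)/[0-9][0-9]* pages.*|\1|p'
}

# Example using the lines from the report; prints "0"
bitmap_pages_used <<'EOF'
md3 : active raid1 sdb4[2]
       10485688 blocks super 1.0 [2/1] [_U]
       bitmap: 0/160 pages [0KB], 32KB chunk
EOF
```

On a live system the same function could be fed from /proc/mdstat, e.g.
bitmap_pages_used < /proc/mdstat, which prints one number per array that has
a bitmap.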

> 
> so now I'm trying auto hot-add:
> 
> # mdadm  -I /dev/sdax4
> mdadm: not adding /dev/sdax4 to active array (without --run) /dev/md3
> 
> still the old problem I mentioned with 3.1.4.

I need to see -E and -X output on both drives to be able to see what is
happening here.  Also the content of /etc/mdadm.conf might be relevant.
If you could supply that info I might be able to explain what is happening.
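
A minimal sketch of collecting that information in one go follows. It
assumes the two members are /dev/sdb4 and /dev/sdax4 as in the report.
collect_md_info and the DRY_RUN switch are hypothetical names introduced
for this example. The second config path is the Debian/Ubuntu location,
checked in addition to the /etc/mdadm.conf mentioned above.

```shell
# Hypothetical helper: run "mdadm -E" (examine superblock) and
# "mdadm -X" (examine bitmap) on each device given, then show the
# mdadm configuration file if one is readable.
# With DRY_RUN=1 the commands are printed instead of executed.
collect_md_info() {
    for dev in "$@"; do
        for opt in -E -X; do
            if [ "${DRY_RUN:-0}" = 1 ]; then
                printf 'mdadm %s %s\n' "$opt" "$dev"
            else
                mdadm "$opt" "$dev"
            fi
        done
    done
    for conf in /etc/mdadm.conf /etc/mdadm/mdadm.conf; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'cat %s\n' "$conf"
        elif [ -r "$conf" ]; then
            cat "$conf"
        fi
    done
}

# Show what would be run for the two devices in the report.
DRY_RUN=1 collect_md_info /dev/sdb4 /dev/sdax4
```

Run as root without DRY_RUN to gather the real output, for example
collect_md_info /dev/sdb4 /dev/sdax4 > md-info.txt.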



> Trying more ways: (even with the "--run" which is suggested)
> 
> # mdadm --run -I /dev/sdax4
> mdadm: -I would set mdadm mode to "incremental", but it is already set to "misc".
> 
> # mdadm -I --run /dev/sdax4
> mdadm: failed to add /dev/sdax4 to /dev/md3: Invalid argument.
> 

Hmm... I'm able to reproduce something like this.

The following patch seems to fix it, but I need to check the code more
thoroughly to be sure.  Note that this will *not* fix the "not adding ... to
active array" problem.

NeilBrown


diff --git a/Incremental.c b/Incremental.c
index 60175af..2be0d05 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -415,19 +415,19 @@ int Incremental(char *devname, int verbose, int runstop,
 				goto out_unlock;
 			}
 		}
-		info2.disk.major = major(stb.st_rdev);
-		info2.disk.minor = minor(stb.st_rdev);
+		info.disk.major = major(stb.st_rdev);
+		info.disk.minor = minor(stb.st_rdev);
 		/* add disk needs to know about containers */
 		if (st->ss->external)
 			sra->array.level = LEVEL_CONTAINER;
-		err = add_disk(mdfd, st, sra, &info2);
+		err = add_disk(mdfd, st, sra, &info);
 		if (err < 0 && errno == EBUSY) {
 			/* could be another device present with the same
 			 * disk.number. Find and reject any such
 			 */
 			find_reject(mdfd, st, sra, info.disk.number,
 				    info.events, verbose, chosen_name);
-			err = add_disk(mdfd, st, sra, &info2);
+			err = add_disk(mdfd, st, sra, &info);
 		}
 		if (err < 0) {
 			fprintf(stderr, Name ": failed to add %s to %s: %s.\n",
