Re: Error in rebuild of two "layered" md devices in container

On Wed, 1 Aug 2012 19:52:51 +0200 Albert Pauw <albert.pauw@xxxxxxxxx> wrote:

> Hi Neil,
> 
> found another bug.
> 
> - Created a container with six disks
> - Created two md devices in it:
> 
> mdadm -CR /dev/md0 -l 6 -n 6 -z 50M
> mdadm -CR /dev/md1 -l 5 -n 6 -z 50M
> 
> The md devices are "layered" in the container across all disks.
> 
> They both get build and are online.
> 
> - Fail one disk, both md devices are affected
> - Remove disk
> - Clear superblock of removed disk
> - Add disk again (in essence, I just added a spare disk)
> 
> Now comes the error:
> 
> - md0 is rebuild
> - md1 is NOT rebuild

The reason for this is somewhat messy.
mdadm will currently only add a 'spare' device to an array which needs a
replacement device.
In DDF the whole device is either 'active' or 'spare'.  There isn't a concept
of 'partly active, partly spare'.
So when mdadm adds part of the disk to one array, the disk stops being spare
and starts being active.  When mdadm then looks for a spare to add to the
second array, there are no spare devices left.

I can hack around it by allowing any non-failed device to be considered as a
spare but I need to find a better solution.  That might take a while.  I've
made a note on my to-do list, but it is a rather long list.

Thanks,
NeilBrown

diff --git a/super-ddf.c b/super-ddf.c
index d006a04..11b98f7 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2616,7 +2616,7 @@ static int validate_geometry_ddf(struct supertype *st,
 	if (chunk && *chunk == UnSet)
 		*chunk = DEFAULT_CHUNK;
 
-
+	if (level == -1000000) level = LEVEL_CONTAINER;
 	if (level == LEVEL_CONTAINER) {
 		/* Must be a fresh device to add to a container */
 		return validate_geometry_ddf_container(st, level, layout,
@@ -3701,6 +3701,10 @@ static struct mdinfo *ddf_activate_spare(struct active_array *a,
 			} else if (ddf->phys->entries[dl->pdnum].type &
 				   __cpu_to_be16(DDF_Global_Spare)) {
 				is_global = 1;
+			} else if (!(ddf->phys->entries[dl->pdnum].state &
+				     __cpu_to_be16(DDF_Failed))) {
+				/* we can possibly use some of this */
+				is_global = 1;
 			}
 			if ( ! (is_dedicated ||
 				(is_global && global_ok))) {
