Re: Error in rebuild of two "layered" md devices in container

Hi Neil,

as a first test I can confirm that this fixes the problem with the layered md devices in a container.

So far so good on this.

Thanks,

regards,

Albert

On 08/15/2012 01:43 AM, NeilBrown wrote:
On Wed, 1 Aug 2012 19:52:51 +0200 Albert Pauw <albert.pauw@xxxxxxxxx> wrote:

Hi Neil,

found another bug.

- Created a container with six disks
- Created two md devices in it:

mdadm -CR /dev/md0 -l 6 -n 6 -z 50M
mdadm -CR /dev/md1 -l 5 -n 6 -z 50M

The md devices are "layered" in the container across all disks.

They both get built and are online.

- Fail one disk, both md devices are affected
- Remove disk
- Clear superblock of removed disk
- Add disk again (in essence, I just added a spare disk)

Now comes the error:

- md0 is rebuilt
- md1 is NOT rebuilt

The reason for this is somewhat messy.
mdadm will currently only add a 'spare' device to an array which needs a
replacement device.
In DDF the whole device is either 'active' or 'spare'.  There isn't a concept
of 'partly active, partly spare'.
So when mdadm adds part of the disk to one array, the disk stops being spare
and starts being active.  So when mdadm looks for a spare to add to the second
array, there are no spare devices.

I can hack around it by allowing any non-failed device to be considered as a
spare but I need to find a better solution.  That might take a while.  I've
made a note on my to-do list, but it is a rather long list.
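
To illustrate the behaviour described above, here is a minimal sketch in C.
It is a toy model, not mdadm code: the flag names, struct disk, and all
function names are hypothetical and only mimic the idea that a DDF disk is
either wholly spare or wholly active.  It models just the one re-added disk
and ignores how much free space is left on it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-disk state; real DDF metadata is richer. */
enum disk_flags {
    DISK_ACTIVE = 1 << 0,  /* some part of the disk belongs to an array */
    DISK_SPARE  = 1 << 1,  /* the whole disk is marked as a spare */
    DISK_FAILED = 1 << 2   /* the disk has failed */
};

struct disk {
    const char *name;
    unsigned flags;
};

/* Strict policy: only a disk still marked as spare qualifies. */
bool usable_strict(const struct disk *d)
{
    return (d->flags & DISK_SPARE) != 0;
}

/* Relaxed policy (the idea behind the hack): any non-failed disk qualifies. */
bool usable_relaxed(const struct disk *d)
{
    return (d->flags & DISK_FAILED) == 0;
}

/* Return the first disk accepted by 'usable', or NULL if there is none. */
struct disk *find_replacement(struct disk *disks, size_t n,
                              bool (*usable)(const struct disk *))
{
    for (size_t i = 0; i < n; i++)
        if (usable(&disks[i]))
            return &disks[i];
    return NULL;
}

/* Using any part of a disk flips the whole disk from spare to active. */
void claim(struct disk *d)
{
    d->flags &= ~(unsigned)DISK_SPARE;
    d->flags |= DISK_ACTIVE;
}

/* Two degraded arrays look for a replacement in turn; count the successes. */
int rebuild_two_arrays(bool (*usable)(const struct disk *))
{
    struct disk disks[] = { { "re-added-disk", DISK_SPARE } };
    int rebuilt = 0;

    for (int array = 0; array < 2; array++) {
        struct disk *d = find_replacement(disks, 1, usable);
        if (d) {
            claim(d);
            rebuilt++;
        }
    }
    return rebuilt;
}
```

In this model rebuild_two_arrays(usable_strict) returns 1, because the first
array claims the disk and the second finds no spare, while
rebuild_two_arrays(usable_relaxed) returns 2.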

Thanks,
NeilBrown

diff --git a/super-ddf.c b/super-ddf.c
index d006a04..11b98f7 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2616,7 +2616,7 @@ static int validate_geometry_ddf(struct supertype *st,
  	if (chunk && *chunk == UnSet)
  		*chunk = DEFAULT_CHUNK;
-
+	if (level == -1000000) level = LEVEL_CONTAINER;
  	if (level == LEVEL_CONTAINER) {
  		/* Must be a fresh device to add to a container */
  		return validate_geometry_ddf_container(st, level, layout,
@@ -3701,6 +3701,10 @@ static struct mdinfo *ddf_activate_spare(struct active_array *a,
  			} else if (ddf->phys->entries[dl->pdnum].type &
  				   __cpu_to_be16(DDF_Global_Spare)) {
  				is_global = 1;
+			} else if (!(ddf->phys->entries[dl->pdnum].state &
+				     __cpu_to_be16(DDF_Failed))) {
+				/* we can possibly use some of this */
+				is_global = 1;
  			}
  			if ( ! (is_dedicated ||
  				(is_global && global_ok))) {
