Hi Neil,
as a first test I can confirm that this fixes the problem with the
layered md devices in a container.
So far so good on this.
Thanks,
regards,
Albert
On 08/15/2012 01:43 AM, NeilBrown wrote:
On Wed, 1 Aug 2012 19:52:51 +0200 Albert Pauw <albert.pauw@xxxxxxxxx> wrote:
Hi Neil,
found another bug.
- Created a container with six disks
- Created two md devices in it:
mdadm -CR /dev/md0 -l 6 -n 6 -z 50M
mdadm -CR /dev/md1 -l 5 -n 6 -z 50M
The md devices are "layered" in the container across all disks.
They both get built and are online.
- Fail one disk, both md devices are affected
- Remove disk
- Clear superblock of removed disk
- Add disk again (in essence, I just added a spare disk)
Now comes the error:
- md0 is rebuilt
- md1 is NOT rebuilt
The reason for this is somewhat messy.
mdadm will currently only add a 'spare' device to an array which needs a
replacement device.
In DDF the whole device is either 'active' or 'spare'. There isn't a concept
of 'partly active, partly spare'.
So when mdadm adds part of the disk to one array it stops being spare and
starts being active. So when mdadm looks for a spare to add to the second
array, there are no spare devices.
I can hack around it by allowing any non-failed device to be considered as a
spare but I need to find a better solution. That might take a while. I've
made a note on my to-do list, but it is a rather long list.
Thanks,
NeilBrown
diff --git a/super-ddf.c b/super-ddf.c
index d006a04..11b98f7 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -2616,7 +2616,7 @@ static int validate_geometry_ddf(struct supertype *st,
if (chunk && *chunk == UnSet)
*chunk = DEFAULT_CHUNK;
-
+ if (level == -1000000) level = LEVEL_CONTAINER;
if (level == LEVEL_CONTAINER) {
/* Must be a fresh device to add to a container */
return validate_geometry_ddf_container(st, level, layout,
@@ -3701,6 +3701,10 @@ static struct mdinfo *ddf_activate_spare(struct active_array *a,
} else if (ddf->phys->entries[dl->pdnum].type &
__cpu_to_be16(DDF_Global_Spare)) {
is_global = 1;
+ } else if (!(ddf->phys->entries[dl->pdnum].state &
+ __cpu_to_be16(DDF_Failed))) {
+ /* we can possibly use some of this */
+ is_global = 1;
}
if ( ! (is_dedicated ||
(is_global && global_ok))) {
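
As an illustration of the spare-selection change in the second hunk, here is
a minimal standalone sketch. It is not mdadm code: the struct, the flag
values and the function names (phys_disk, PD_GLOBAL_SPARE, PD_ACTIVE,
PD_FAILED, candidate_old, candidate_new, claim_for_array) are hypothetical,
and it ignores dedicated spares and the global_ok check. It only models the
behaviour described above: once part of a disk is used by one array the whole
disk stops being a spare, so the second array finds nothing unless any
non-failed disk is accepted as a candidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the real DDF constants. */
enum { PD_GLOBAL_SPARE = 1 << 0, PD_ACTIVE = 1 << 1 };  /* type bits  */
enum { PD_FAILED = 1 << 0 };                            /* state bits */

struct phys_disk {
    uint16_t type;   /* role of the whole disk */
    uint16_t state;  /* health of the whole disk */
};

/* Old rule: only a disk still marked as a global spare is a candidate. */
static bool candidate_old(const struct phys_disk *d)
{
    return (d->type & PD_GLOBAL_SPARE) != 0;
}

/* Patched rule: any disk that has not failed is also a candidate. */
static bool candidate_new(const struct phys_disk *d)
{
    return candidate_old(d) || !(d->state & PD_FAILED);
}

/* Using part of a disk in one array flips the whole disk from spare to active. */
static void claim_for_array(struct phys_disk *d)
{
    d->type &= (uint16_t)~PD_GLOBAL_SPARE;
    d->type |= PD_ACTIVE;
}

/* Walk through the two-array case from the bug report. */
void check_scenario(void)
{
    struct phys_disk d = { PD_GLOBAL_SPARE, 0 };

    assert(candidate_old(&d));   /* first array can take the new disk */
    claim_for_array(&d);         /* part of it now belongs to that array */
    assert(!candidate_old(&d));  /* old rule: second array sees no spare */
    assert(candidate_new(&d));   /* patched rule: disk is still offered */
}
```

Calling check_scenario() from a small main() exercises the sequence. The
sketch does not check whether the disk actually has enough free space for the
second array, which the real code would still have to do.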