On Wed, Sep 7, 2011 at 6:21 AM, Dorau, Lukasz <lukasz.dorau@xxxxxxxxx> wrote:
> On Wed, Sep 07, 2011 4:38 AM Neil Brown <neilb@xxxxxxx> wrote:
>>
>> On Tue, 6 Sep 2011 14:34:42 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>>
>> > On Thu, Sep 1, 2011 at 6:18 AM, Lukasz Dorau <lukasz.dorau@xxxxxxxxx> wrote:
>> > > Description of the bug:
>> > > Interrupted reshape cannot be continued using incremental assembly.
>> > > Array becomes inactive.
>> > >
>> > > Cause of the bug:
>> > > Reshape tried to continue with insufficient number of disks
>> > > added by incremental assembly (tested using capacity expansion).
>> > >
>> > > Solution:
>> > > During reshape adding disks to array should be blocked until
>> > > minimum required number of disks is ready to be added.
>> >
>> > Can you provide a script test-case to reproduce the problem?
>>
>> I can:
>>
>>   mdadm -C /dev/md/imsm -e imsm -n 4 /dev/sd[abcd]
>>   mdadm -C /dev/md/r5 -n3 -l5 /dev/md/imsm -z 2000000
>>   mdadm --wait /dev/md/r5
>>   mdadm -G /dev/md/imsm -n4
>>   sleep 10
>>   mdadm -Ss
>>   mdadm -I /dev/sda
>>   mdadm -I /dev/sdb
>>   mdadm -I /dev/sdc
>>
>> array is started and reshape continues.
>>
>> The problem is that container_content reports that array.working_disks is 3
>> rather than 4.
>> 'working_disks' should be the number of disks in the array that were working
>> last time the array was assembled.

Hmm, this might just be cribbed from the initial DDF implementation. It
should be straightforward to reuse the count we use for container_enough,
but I'm not seeing where Incremental uses working_disks for external
arrays...

>> However the imsm code only counts devices that can currently be found.
>> I'm not familiar enough with the IMSM metadata to fix this.
>> However by looking at the metadata on just one device in an array it should
>> be possible to work out how many were working last time, and report that
>> count.
>>
>
> Neil, please consider the following script test-case (not 4 but 5 drives finally in the array):
>
> mdadm -C /dev/md/imsm -e imsm -n 5 /dev/sd[abcde]
> mdadm -C /dev/md/r5 -n3 -l5 /dev/md/imsm -z 2000000
> mdadm --wait /dev/md/r5
> mdadm -G /dev/md/imsm -n5
> sleep 10
> mdadm -Ss
> mdadm -I /dev/sda
> mdadm -I /dev/sdb
> mdadm -I /dev/sdc
> # array is not started and reshape does not continue!
> mdadm -I /dev/sdd
>
> and now array is started and reshape continues - the minimum required number of disks is added to array already.
>
> So the question is: when mdadm should start the array using incremental assembly?:

As soon as all drives are present, or when the minimum number is present
and --run is specified.

> 1) when minimum required number of disks is added and (degraded) array can be started or
> 2) when all disks that were working last time the array was assembled are added.

This is what ->container_enough attempts to identify, and it looks like
you are running into the fact that it does not take into account
migration. imsm_count_failed() is returning the wrong value, and it has
the comment:

    /* FIXME add support for online capacity expansion and
     * raid-level-migration */

The routine in getinfo_super_imsm should also be looking at map0,
currently it is looking at map1 to determine the number of device
members.

> If the second is true, there is another question: when to decide to give up waiting for non-present disks that can be (e.g.) removed meanwhile by user?

Not really mdadm's problem. That's primarily up to the udev policy.
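
For what it's worth, the start policy described above (start once all
drives are present, or once the minimum number is present and --run is
given) can be sketched roughly as follows. This is only an illustration:
struct array_state, should_start() and their fields are hypothetical
names, not mdadm code, and it assumes raid_disks already reflects the
member count once the migration is taken into account.

```c
#include <stdbool.h>

/* Hypothetical summary of an array during incremental assembly.
 * Illustrative only; these are not mdadm's real structures. */
struct array_state {
	int raid_disks;   /* members the array should have, assumed to
			   * include any added by an in-progress reshape */
	int max_degraded; /* members the level can lose and still run
			   * (1 for RAID5) */
	int present;      /* members found so far */
};

enum start_decision { WAIT, START, START_DEGRADED };

/* Decide whether to start the array after another member shows up.
 * 'run' stands in for the user having passed --run. */
static enum start_decision should_start(const struct array_state *s, bool run)
{
	/* Every expected member is here: start normally. */
	if (s->present >= s->raid_disks)
		return START;

	/* Enough members for a degraded start, but only on request. */
	if (run && s->present >= s->raid_disks - s->max_degraded)
		return START_DEGRADED;

	/* Otherwise keep waiting for more members. */
	return WAIT;
}
```

With the 5-drive RAID5 case above (max_degraded = 1), this waits at 3
drives, starts degraded at 4 drives only if --run was given, and starts
normally at 5.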