Re: [PATCH] FIX: Cannot continue reshape if incremental assembly is used

On Wed, Sep 7, 2011 at 6:21 AM, Dorau, Lukasz <lukasz.dorau@xxxxxxxxx> wrote:
> On Wed, Sep 07, 2011 4:38 AM Neil Brown <neilb@xxxxxxx> wrote:
>>
>> On Tue, 6 Sep 2011 14:34:42 -0700 Dan Williams <dan.j.williams@xxxxxxxxx>
>> wrote:
>>
>> > On Thu, Sep 1, 2011 at 6:18 AM, Lukasz Dorau <lukasz.dorau@xxxxxxxxx>
>> wrote:
>> > > Description of the bug:
>> > > Interrupted reshape cannot be continued using incremental assembly.
>> > > Array becomes inactive.
>> > >
>> > > Cause of the bug:
>> > > Reshape tried to continue with insufficient number of disks
>> > > added by incremental assembly (tested using capacity expansion).
>> > >
>> > > Solution:
>> > > During reshape adding disks to array should be blocked until
>> > > minimum required number of disks is ready to be added.
>> >
>> > Can you provide a script test-case to reproduce the problem?
>>
>> I can:
>>
>> mdadm -C /dev/md/imsm -e imsm -n 4 /dev/sd[abcd]
>> mdadm -C /dev/md/r5 -n3 -l5 /dev/md/imsm -z 2000000
>> mdadm --wait /dev/md/r5
>> mdadm -G /dev/md/imsm -n4
>> sleep 10
>> mdadm -Ss
>> mdadm -I /dev/sda
>> mdadm -I /dev/sdb
>> mdadm -I /dev/sdc
>>
>> array is started and reshape continues.
>>
>> The problem is that container_content reports that array.working_disks is 3
>> rather than 4.
>> 'working_disks' should be the number of disks in the array that were working
>> last time the array was assembled.

Hmm, this might just be cribbed from the initial DDF implementation.
It should be straightforward to reuse the count we use for
container_enough, but I'm not seeing where Incremental uses
working_disks for external arrays...

>> However the imsm code only counts devices that can currently be found.
>> I'm not familiar enough with the IMSM metadata to fix this.
>> However by looking at the metadata on just one device in an array it should
>> be possible to work out how many were working last time, and report that
>> count.
>>
>
> Neil, please consider the following script test-case (the array ends up with 5 drives, not 4):
>
> mdadm -C /dev/md/imsm -e imsm -n 5 /dev/sd[abcde]
> mdadm -C /dev/md/r5 -n3 -l5 /dev/md/imsm -z 2000000
> mdadm --wait /dev/md/r5
> mdadm -G /dev/md/imsm -n5
> sleep 10
> mdadm -Ss
> mdadm -I /dev/sda
> mdadm -I /dev/sdb
> mdadm -I /dev/sdc
> # array is not started and reshape does not continue!
> mdadm -I /dev/sdd
>
> and now the array is started and the reshape continues, because the minimum required number of disks has been added to the array.
>
> So the question is: when should mdadm start the array during incremental assembly?

As soon as all drives are present, or when the minimum number is
present and --run is specified.
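As a minimal sketch of that rule, assuming a hypothetical summary struct (none of these names are mdadm's real structures or functions), the start decision could look like this:

```c
#include <stdbool.h>

/* Hypothetical summary of a container during incremental assembly.
 * These fields are illustrative only; they are not mdadm's real types. */
struct assembly_state {
    int present;        /* member disks added so far via "mdadm -I" */
    int expected;       /* disks that were working when the array last ran */
    int min_required;   /* fewest disks with which the array can run */
    bool run_requested; /* true when --run was given */
};

/* Start as soon as every expected disk is present, or when at least the
 * minimum number is present and the user asked for --run. */
static bool should_start(const struct assembly_state *s)
{
    if (s->present >= s->expected)
        return true;
    return s->run_requested && s->present >= s->min_required;
}
```

With 3 of 4 expected disks present and no --run, this sketch keeps waiting; adding the fourth disk, or passing --run, lets it start.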

> 1) when the minimum required number of disks has been added and the (degraded) array can be started, or
> 2) when all disks that were working the last time the array was assembled have been added.

This is what ->container_enough attempts to identify, and it looks
like you are running into the fact that it does not take into account
migration.  imsm_count_failed() is returning the wrong value, and it
has the comment:

        /* FIXME add support for online capacity expansion and
         * raid-level-migration
         */

The routine in getinfo_super_imsm should also be looking at map0;
currently it is looking at map1 to determine the number of device
members.
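To illustrate why the choice of map matters, here is a rough sketch. It assumes that during a capacity expansion map0 records the member count the array is migrating to and map1 records the count before the migration. The struct and function names are hypothetical and are not the real IMSM structures or the real ->container_enough hook:

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one volume's metadata during a
 * migration. ASSUMPTION: map0 holds the post-migration member count and
 * map1 the pre-migration count. */
struct volume_view {
    int map0_members; /* member count recorded in map0 */
    int map1_members; /* member count recorded in map1 */
};

/* Number of members to wait for, taken from map0 as suggested above. */
static int expected_members(const struct volume_view *v)
{
    return v->map0_members;
}

/* Sketch of a "container enough" check: true once every member that
 * map0 describes has been found. */
static bool container_enough_sketch(const struct volume_view *v, int found)
{
    return found >= expected_members(v);
}
```

For an interrupted 3-to-4 disk expansion (map0_members = 4, map1_members = 3 under the assumption above), counting from map1 would treat three disks as the full set, while counting from map0 waits for the fourth.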

> If the second is true, there is another question: when should mdadm give up waiting for missing disks that may (e.g.) have been removed by the user in the meantime?

Not really mdadm's problem.  That's primarily up to the udev policy.