On Sun, May 2, 2010 at 10:58 PM, Neil Brown <neilb@xxxxxxx> wrote:
> On Thu, 29 Apr 2010 14:55:23 -0700
> Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> I am not grokking the separate POLICY line, especially for defining
>> the spare-migration border because that is already what DOMAIN is
>> specifying.
>
> Is it? This is what I'm not yet 100% convinced about.
> We seem to be saying:
>  - A DOMAIN is a set of devices that are handled the same way for
>    hotplug
>  - A DOMAIN is a set of devices that define a boundary on spare
>    migration

The definition I have been carrying around is slightly more nuanced.
The DOMAIN defines a maximal boundary, but there might be
metadata-specific modifiers that further restrict the possible actions.
For example, a DOMAIN with path=ddf would handle all hotplug events on
"ddf" ports the same way, with the caveat that the ddf handler would
know about controller-spanning rules in the multi-controller case.
Otherwise, if you define path=<pci-device-path+partitions> then wysiwyg,
i.e. no arrays assembling across these boundaries.

>
> and I'm not sure those sets are necessarily isomorphic - though I agree that
> they will often be the same.
>
> Does each DOMAIN line define a separate migration boundary so that devices
> cannot migrate 'across domains'??
> If we were to require that, I would probably want multiple 'path=' words
> allowed for a single domain so we can create a union.

Yes, we should do that regardless, because it would otherwise be hard to
write a single glob that covers disparate controllers.

>>
>> Here is what I think we need to allow for simple honoring of platform
>> constraints but without needing to expose all the nuances of those
>> constraints in config-file syntax... yet.
>>
>> 1/ Allow path= to take a metadata name; this allows the handler to
>> identify its known controller ports, alleviating the user from needing
>> to track which ports are allowed, especially as it may change over
>> time. If someone really wants to see which ports a metadata handler
>> cares about, we could have a DOMAIN line dumped by --detail-platform
>> --brief -e imsm. However, for simplicity I would rather just dump:
>>
>> DOMAIN path=imsm action=spare-same-port spare-migration=imsm
>>
>
> So "path=imsm" means "all devices which are attached to a controller which
> seems to understand IMSM natively".
> What if a system had two such controllers - one on-board and one on a
> plug-in card? This might not be possible for IMSM but would be for DDF.
> I presume the default would be that the controllers are separate domains -
> would you agree?

The controllers may restrict spare migration, but I would still see this
as one ddf DOMAIN where the paths and spare-migration constraints are
internally determined by the handler, while the hotplug policy is global
for the "ddf-DOMAIN".

> So the above DOMAIN line would potentially create multiple
> 'domains', at least for spare-migration.

Yes.

>
>> 2/ I think we should always block configurations that cross domain
>> boundaries. One can always append more path= lines to override this.
>
> I think we all agree on this. Require --force to create an array, or add
> devices to an array, where that would cross an established spare-group...
> The details are still a bit vague for me but the principle is good.
>
>>
>> 3/ The metadata handler may want to restrict/control where spares are
>> placed in a domain. To enable interaction with CIM we are looking to
>> add a storage-pool id to the metadata.
>> The primary usage of this will be to essentially encode a spare-group
>> number in the metadata. This seems to require a spare-migration= option
>> to the DOMAIN line. By default it is 'all', but it can be set to a
>> metadata-name to let the handler apply its internal migration policy.
>
> I'm not following you. Are you talking about subsets of a domain?
> Subdomains?
> Do the storage-pools follow hardware port locations, or dynamic
> configuration of individual devices (hence being recorded in metadata)?

Dynamic configuration, but I would still call this the imsm-DOMAIN with
metadata-specific spare-migration boundaries.

>
> This is how I think spare migration should work:
> Spare migration is controlled entirely by the 'spare-group' attribute.
> A spare-group is an attribute of a device. A device may have multiple
> spare-group attributes (it might be in multiple groups).
> There are two ways a device can be assigned a spare-group:
> 1/ If an array is tagged with a spare-group= in mdadm.conf then any
>    device in that array gets that spare-group attribute.
> 2/ If a DOMAIN is tagged with a spare-group attribute then any device
>    in that domain gets that spare-group attribute.
>
> When mdadm --monitor needs to find a hot spare for an array or container
> which is degraded, it collects a list of spare-group attributes
> for all devices in the array, then finds any device (of suitable size)
> that has a spare-group attribute matching any of those.
> Possibly a weighting should prefer spare-groups that are more prevalent
> in the array, so that if you add a foreign device in an emergency, mdadm
> won't feel too free to add other foreign devices (but is still allowed
> to).
>
> You seem to be suggesting that the spare-group tag could also be
> specified by the metadata. I think I'm happy with that.

Yeah, metadata-implied spare-groups that sub-divide the domain.

>
> A DOMAIN line without an explicit spare-group= tag might imply an
> implicit spare-group= tag where the spare-group name is some generated
> string that is unique to that DOMAIN line.
> So all devices in a DOMAIN line are effectively interchangeable, but it
> is easy to stretch the migration barrier around multiple domains by
> giving them all a matching spare-group tag.
>
> When you create an array, every pair of devices must share a
> spare-group, or else one of them must not be in a spare-group. Is that
> right?

...once you allow for $metadata-DOMAINs, I am having trouble
conceptualizing the use case for allowing spares to migrate across the
explicit union of path= boundaries, unless you are trying to codify what
the metadata handlers would be doing internally. In that case I would
expect to replace a single spare-group= identifier with multiple,
mutually exclusive spare-path= lines that subdivide a DOMAIN into
spare-migration sub-domains sharing the same hotplug policy.

...or am I still misunderstanding your spare-group= vs DOMAIN
distinction?

--
Dan
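
For concreteness, here is a sketch of how the DOMAIN lines discussed
above might look in mdadm.conf. It follows the syntax floated in this
thread rather than any released mdadm: the path= globs are placeholder
values, and spare-path= is only the sub-domain idea proposed above, not
an existing keyword.

# Union of two controllers forming one hotplug domain: multiple path=
# words avoid needing a single glob that spans disparate controllers.
DOMAIN path=pci-0000:00:1f.2-* path=pci-0000:02:00.0-* action=spare-same-port

# Metadata-managed domain: the imsm handler identifies its own ports
# and applies its internal spare-migration policy.
DOMAIN path=imsm action=spare-same-port spare-migration=imsm

# Proposed subdivision of one domain into mutually exclusive
# spare-migration sub-domains that share the same hotplug policy
# (spare-path= is hypothetical; indented lines continue the DOMAIN line).
DOMAIN path=pci-0000:00:1f.2-* action=spare-same-port
    spare-path=pci-0000:00:1f.2-ata-[12]*
    spare-path=pci-0000:00:1f.2-ata-[34]*

In each form the hotplug policy stays uniform across the whole domain
while spare migration can be constrained more tightly, which is the
distinction drawn in the discussion above.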