Re: Auto Rebuild on hot-plug

On 03/26/2010 08:37 PM, Dan Williams wrote:
> On Thu, Mar 25, 2010 at 8:04 AM, Labun, Marcin <Marcin.Labun@xxxxxxxxx> wrote:
>> I think that metadata keyword can be used to identify scope of devices to which the DOMAIN line applies.
>> For instance we could have:
>> DOMAIN path=glob-pattern metadata=imsm hotplug=mode1  spare-group=name1
>> DOMAIN path=glob-pattern metadata=0.90 hotplug=mode2  spare-group=name2
>>
>> Keywords:
>> Path, metadata and spare-group shall define to which arrays the hotplug definition (or other definition of action) applies. User could define any subset of it.
>> For instance to define that all imsm arrays shall use hotplug mode2 user shall define:
>> DOMAIN metadata=imsm hotplug=mode2
>>
>> In above example user need not define spare-group in his/her configuration file for each array.
>>
>> I also assume that each metadata handler can additionally set its own rules for accepting the spare in the container. Rules can be derived from platform dependencies or metadata. Notice that the user can disable platform-specific constraints by defining the IMSM_NO_PLATFORM environment variable.
>>
> 
> For the 'platform' case we could automate some decisions, but I think
> I would rather extend the --detail-platform option to dump the
> recommended/compatible DOMAIN entries for the platform, perhaps via
> the --brief modifier.  This mirrors what can be done with --examine
> --brief to generate an initial configuration file that can be modified
> to taste.

So, a few things that I think can be said about the DOMAIN line type
(I'm assuming for now that this is what we'll use, mainly because I'm
implementing it right now):

There is an assumed, default DOMAIN line that is the equivalent of:

DOMAIN path=* metadata=* action=incremental spare-group=<none>

This is what you get simply by normal udev incremental assembly rules
(notice I used action instead of hotplug, action makes more sense to me
as all the words we use to define hotplug mode are in fact actions to
take on hotplug).  We will treat this as a given.  Anything else
requires an explicit DOMAIN line in mdadm.conf.
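
To make the implicit default concrete, here is a rough Python sketch of
parsing a DOMAIN line so that any keyword left out falls back to the
default line above.  This is not mdadm code: the function name and the
dict layout are made up for illustration, and the DOMAIN syntax itself
is still only a proposal.

```python
# Hypothetical sketch, not mdadm code.  Keyword names follow the
# proposed DOMAIN line; None stands in for spare-group=<none>.
DEFAULTS = {
    "path": "*",
    "metadata": "*",
    "action": "incremental",
    "spare-group": None,
}

def parse_domain_line(line):
    """Parse 'DOMAIN key=value ...' into a dict, filling in the defaults."""
    words = line.split()
    if not words or words[0].upper() != "DOMAIN":
        raise ValueError("not a DOMAIN line: %r" % line)
    domain = dict(DEFAULTS)
    for word in words[1:]:
        key, sep, value = word.partition("=")
        if not sep or key not in DEFAULTS:
            raise ValueError("bad keyword: %r" % word)
        domain[key] = value
    return domain
```

With this, a bare "DOMAIN" line parses to exactly the assumed default,
and an explicit line only overrides the keywords it names.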

The second thing I'm having a hard time with is the spare-group.  To be
honest, if I follow what I think I should, and make it a hard
requirement that any action other than none and incremental must use a
non-global path glob (i.e., path= MUST be present and cannot be *), then
spare-group loses all meaning.  I say this because if a disk matches
the path glob, it is in a specific spare group already (the one that
this DOMAIN represents), and likewise if arrays are on disks in this
DOMAIN, they are automatically part of the same spare-group.  In other
words, I think spare-group becomes entirely redundant once we have a
DOMAIN keyword.
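
The hard requirement on the path glob could be checked roughly like
this.  It is a Python sketch with a made-up function name, not mdadm
code, and it assumes the only "passive" actions are none and
incremental as described above:

```python
# Hypothetical check, not mdadm code: any action other than
# none/incremental must come with an explicit, non-global path glob.
PASSIVE_ACTIONS = {"none", "incremental"}

def check_domain(path, action):
    """Return None if the (path, action) pair is acceptable,
    otherwise a string describing the problem."""
    if action in PASSIVE_ACTIONS:
        return None
    if path is None or path == "*":
        return "action=%s requires an explicit, non-global path= glob" % action
    return None
```

A config parser could call this once per DOMAIN line and refuse the
line (or fall back to incremental) when it returns a message.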

I'm also having a hard time justifying the existence of the metadata
keyword.  The reason is that the metadata is already determined for us
by the path glob.  Specifically, if we assume that an array's members
can not cross domain boundaries (a reasonable requirement in my opinion,
we can't make an array where we can guarantee to the user that hot
plugging a replacement disk will do what they expect if some of the
array's members are inside the domain and some are outside the domain),
then we should only ever need the metadata keyword if we are mixing
metadata types within this domain.  Well, we can always narrow down the
domain if we are doing something like the first three sata disks on an
Intel Matrix RAID controller as imsm and the last three as jbod with
version 1.x metadata by putting the first half in one domain and the
second half in another.  And this would be the right thing to do versus
trying to cover both in one domain.  That means that only if we ever
mixed imsm/ddf and md native raid types on a single disk would we be
unable to narrow down the domain properly, and I'm not sure we care to
support this.  So, that leaves us back to not really needing the
metadata keyword as the disks present in the path spec glob should be
uniform in the metadata type and we should be able to simply use the
right metadata from that.
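
As a sketch of what "members can not cross domain boundaries" means in
practice, the following Python maps device paths to domains by glob and
flags an array whose members land in more than one domain.  The domain
names and the by-path style device names are invented for illustration,
and using Python's fnmatch for the glob is my assumption, not a
statement about how mdadm matches paths:

```python
from fnmatch import fnmatchcase

def domain_of(device_path, domains):
    """Return the name of the first domain whose glob matches, else None.

    'domains' is a list of (name, path_glob) pairs in config order.
    """
    for name, glob in domains:
        if fnmatchcase(device_path, glob):
            return name
    return None

def array_crosses_domains(member_paths, domains):
    """True if the array's members do not all fall in the same domain."""
    return len({domain_of(p, domains) for p in member_paths}) > 1

# Illustrative split of one controller: first three ports in one
# domain, last three in another (names and paths are hypothetical).
EXAMPLE_DOMAINS = [
    ("imsm_ports", "pci-0000:00:1f.2-scsi-[0-2]:*"),
    ("native_ports", "pci-0000:00:1f.2-scsi-[3-5]:*"),
]
```

Splitting the controller this way keeps each domain uniform in
metadata type, which is the point made above about not needing the
metadata keyword.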

>>>   hotplug modes are:
>>>     none  - ignore any hotplugged device
>>>     incr  - normal incremental assembly (the default).  If the device has
>>>          metadata that matches an array, try to add it to the array
>>>     replace - If above fails and a device was recently removed from this
>>>          same path, add this device to the same array(s) that the old
>>>          device was part of
>>>     include - If the above fails and the device has no recognisable
>>>          metadata, add it to any array/container that uses devices in
>>>          this domain, partitioning first if necessary.
>>>     force - as above but ignore any pre-existing metadata
>>>
>>>
>>>   I'm not sure that all those are needed, or are the best names.  Names
>>>   like
>>>     ignore, reattach, rebuild, rebuild_spare
>>>   have also been suggested.
>>
>> Please consider:
>>      spare_add - add any spare device that matches the metadata container/volume in case of native metadata regardless of array state, so later such a spare can be used in rebuild process.
> 
> This is the same as 'incr' above.  If the device has metadata and
> hotplug is enabled, auto-incorporate the device.

So my preferred and suggested words for the action item are as follows.
(Note: there are two classes of actions: things we do when presented
with a disk and we have a degraded array, and things we do when
presented with a disk and all arrays in the domain are fully up to date.
The latter implies this is a new disk in the domain rather than a
replacement for a faulty disk, which in turn implies the domain wasn't
previously full.  It might be worth having two keywords in the DOMAIN
line to separate these two cases, but I'm going to argue a bit later
that we really don't care about the second case, so maybe not.)

none
incremental - what we have now, and the default
readd - if incremental didn't work but the device is supposed to be part
     of the array, then attempt the --re-add option of mdadm, this would
     allow a sysadmin to unplug and replug a device from an array if it
     got kicked for some reason and the system would attempt to reinsert
     it into the array with minimal rebuild, but it would not attempt to
     use any device that was hot plugged that didn't previously belong
     to the array
safe_use - if the new drive is currently bare and we have a degraded
     array, assume this drive is intended to repair the degraded array
     and use the device
force_use - as above but don't require the drive be empty
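
To show how these actions might stack, here is a rough Python sketch of
the decision for one hot-plugged disk.  It is not mdadm code.  It
assumes each action includes everything the ones before it do, and the
boolean inputs and returned strings are hypothetical stand-ins for
whatever the real checks and operations turn out to be:

```python
# Hypothetical sketch, not mdadm code.  Actions are ordered from least
# to most aggressive; each level is assumed to include the ones below.
ACTIONS = ["none", "incremental", "readd", "safe_use", "force_use"]

def decide(action, incremental_ok, was_member, is_bare, degraded):
    """Pick what to do with a hot-plugged disk in this domain.

    incremental_ok: normal incremental assembly would accept the disk
    was_member:     disk previously belonged to an array in the domain
    is_bare:        disk carries no metadata or partitions
    degraded:       the domain has a degraded array that needs a disk
    """
    level = ACTIONS.index(action)
    if level == 0:
        return "ignore"
    if incremental_ok:
        return "incremental"
    if level >= 2 and was_member:
        return "re-add"
    # safe_use needs a bare disk; force_use takes any disk.  Either
    # way, only act when there is a degraded array to repair.
    if degraded and ((level >= 3 and is_bare) or level >= 4):
        return "use"
    return "ignore"
```

For example, decide("safe_use", False, False, True, True) gives "use",
while the same call with is_bare=False gives "ignore", and force_use
with no degraded array also gives "ignore".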

All of the above actions are related to domains that are degraded.  But
what to do if the array isn't degraded?  We could add the device as a
spare, but if the array isn't degraded, adding a new hot spare doesn't
really *do* anything.  No rebuild will start, nothing immediate happens,
it just goes in and sits there.  And now that we have all these fancy
grow options, it's not entirely clear that a user would want that
anyway.  So, I would argue that if the array isn't degraded, then there
is no sense of emergency in our actions, and there exist multiple
options for what to do with the device: some include being a hot spare
while others include using the device to grow the array, and the right
answer is not at all clear.  Even if the user had previously configured
us to treat the device as a spare, they may change their mind and want
to grow things.  Given that there's no immediate need to do anything as
there aren't any degraded arrays, I say let the user do whatever they
want and don't try to do anything automatically.  It seems likely to me
that the user's wants in this area will change from time to time based
on circumstances, and having them update the config file prior to
inserting the device is clunkier than just telling them to do whatever
they want themselves after inserting the device.


>> Can we assume for all external metadata that spares added to any container can potentially be moved between all containers with the same metadata?
> 
> Yes, that can be the default action, and the spare-group keyword can
> be specified to override.

Or as I mentioned earlier, two domains with different path globs get
you this without having to use the spare-group keyword.  For instance,
you can put the sata ports on one domain path and the sas ports on
another domain path, as the BIOS won't allow containers to cross that
boundary, and that is sufficient to make us handle hot plugged drives
properly when both are in use.  I really don't see the use of the
spare-group keyword; the path glob should be sufficient.


-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: CFBFF194
	      http://people.redhat.com/dledford

Infiniband specific RPMs available at
	      http://people.redhat.com/dledford/Infiniband
