On Mon, Mar 29, 2010 at 4:30 PM, Doug Ledford <dledford@xxxxxxxxxx> wrote:
> On 03/29/2010 05:36 PM, Dan Williams wrote:
>> I agree once you have a DOMAIN you implicitly have a spare-group. So
>> DOMAIN would supersede the existing spare-group identifier in the
>> ARRAY line and cause mdadm --monitor to auto-migrate spares between
>> 0.90 and 1.x metadata arrays in the same DOMAIN. For the imsm case
>> the expectation is that spares migrate between containers regardless
>> of the DOMAIN line as that is what the implementation expects.
>
> Give me some clearer explanation here because I think you and I are
> using terms differently and so I want to make sure I have things right.
> My understanding of imsm raid containers is that all the drives that
> belong to a single option rom, as long as they aren't listed as jbod in
> the option rom setup, belong to the same container.

I think the disconnect in the imsm case is that the container to
DOMAIN relationship is N:1, not 1:1. The mdadm notion of an
imsm-container correlates directly with a 'family' in the imsm
metadata. The rules of a family are:

1/ All family members must be a member of all defined volumes. For
example, with a 4-drive container you could not simultaneously have a
4-drive (sd[abcd]) raid10 and a 2-drive (sd[ab]) raid1 volume because
any volume would need to incorporate all 4 disks. Also, per the rules,
if you create two raid1 volumes sd[ab] and sd[cd] those would show up
as two containers.

2/ A spare drive does not belong to any particular family
('family_number' is undefined for a spare). The Windows driver will
automatically use a spare to fix any degraded family in the system.
In the mdadm/mdmon case, since we break families into containers, we
need a mechanism to migrate spare devices between containers because
they are equally valid hot spare candidates for any imsm container in
the system.

> That container is
> then split up into various chunks and that's where you get logical
> volumes.
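For illustration only, the two family rules described earlier in this
message can be modeled roughly as below. This is a minimal Python
sketch under stated assumptions, not mdadm or mdmon code: the Disk
class and the helpers group_by_family, volume_is_valid, and
plan_spare_moves are hypothetical names, and the 'family' field only
loosely mirrors the 'family_number' idea (None stands in for
"undefined").

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Disk:
    name: str
    # Hypothetical stand-in for 'family_number'; None models a spare.
    family: Optional[int] = None


def group_by_family(disks: List[Disk]) -> Tuple[Dict[int, List[str]], List[str]]:
    """Split disks into one container per family, plus a shared spare pool."""
    containers: Dict[int, List[str]] = {}
    spares: List[str] = []
    for disk in disks:
        if disk.family is None:
            spares.append(disk.name)
        else:
            containers.setdefault(disk.family, []).append(disk.name)
    return containers, spares


def volume_is_valid(container_members: List[str], volume_members: List[str]) -> bool:
    """Rule 1: a volume must span every member of its family."""
    return set(volume_members) == set(container_members)


def plan_spare_moves(containers: Dict[int, List[str]],
                     expected: Dict[int, int],
                     spares: List[str]) -> List[Tuple[str, int]]:
    """Rule 2: any spare may repair any degraded family.

    'expected' maps a family to the number of disks it was created with
    (an assumption of this sketch). Returns (spare, family) pairs.
    """
    pool = list(spares)
    moves: List[Tuple[str, int]] = []
    for family in sorted(containers):
        missing = expected[family] - len(containers[family])
        while missing > 0 and pool:
            moves.append((pool.pop(0), family))
            missing -= 1
    return moves


if __name__ == "__main__":
    disks = [Disk("sda", 1), Disk("sdb", 1), Disk("sdc", 2), Disk("sde")]
    containers, spares = group_by_family(disks)
    print(plan_spare_moves(containers, {1: 2, 2: 2}, spares))
```

The point of the sketch is only that the spare pool is shared across
all containers, which is why spares have to be able to move between
them.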
> I know there are odd rules for logical volumes inside a
> container, but I think those are mostly irrelevant to this discussion.
> So, when I think of a domain for imsm, I think of all the sata ports or
> sas ports under a single option rom. From that perspective, spares can
> *not* move between domains as a spare on a sas port can't be added to a
> sata option rom container array. I was under the impression that if you
> had, say, a 6 port sata controller option rom, you couldn't have the
> first three ports be one container and the next three ports be another
> container. Is that impression wrong?

Yes, we can have exactly this situation.

This raises the question: why not change the definition of an imsm
container to incorporate anything with imsm metadata? This definitely
would make spare management easier. The current definition was an
early design decision and had the nice side effect that it lined up
naturally with the failure and rebuild boundaries of a family. I could
give it more thought, but right now I believe there is a lot riding on
this 1:1 container-to-family relationship, and I would rather not go
there.

> However, that just means (to me anyway) that I would treat all of the
> sata ports as one domain with multiple container arrays in that domain
> just like we can have multiple native md arrays in a domain. If a disk
> dies and we hot plug a new one, then mdadm would look for the degraded
> container present in the domain and add the spare to it. It would then
> be up to mdmon to determine what logical volumes are currently degraded
> and slice up the new drive to work as spares for those degraded logical
> volumes. Does this sound correct to you, and can mdmon do that already
> or will this need to be added?

This sounds correct, and no, mdmon cannot do this today. The discussion
we (Marcin and I) had with Neil offlist was about extending
mdadm --monitor to handle spare migration for containers since it
already handles spare migration for native md arrays.
It will need some mdmon coordination since mdmon is the only agent that
can disambiguate a spare from a stale device at any given point in
time.

>> However this is where we get into questions of DOMAIN conflicting with
>> 'platform' expectations, under what conditions, if any, should DOMAIN
>> be allowed to conflict/override the platform constraint? Currently
>> there is an environment variable IMSM_NO_PLATFORM, do we also need a
>> configuration op
>
> I'm not sure I would ever allow breaking valid platform limitations. I
> think if you want to break platform limitations, then you need to use
> native md raid arrays and not imsm/ddf. It seems to me that if you
> allow the creation of an imsm/ddf array that the BIOS can't work with
> then you've potentially opened an entire can of worms we don't want to
> open about expectations that the BIOS will be able to work with things
> but can't. If you force native arrays as the only type that can break
> platform limitations, then you are at least perfectly clear with the
> user that the BIOS can't do what the user wants.

Agreed.

--
Dan