>> There is a certain amount of metadata that -must- be updated at
>> runtime, as you recognize.  Over and above what MD already cares
>> about, DDF and its cousins introduce more items along those lines:
>> event logs, bad sector logs, controller-level metadata... these are
>> some of the areas I think Justin/Scott are concerned about.
>
> I'm sure these things could be accommodated within DM.  Nothing in DM
> prevents having some sort of in-kernel metadata knowledge.  In fact,
> other DM modules already do - dm-snapshot and the above mentioned
> dm-mirror both need to do some amount of in-kernel status updating.
> But I see this as completely separate from in-kernel device discovery
> (which we seem to agree is the wrong direction).  And IMO, well
> designed metadata will make this "split" very obvious, so it's clear
> which parts of the metadata the kernel can use for status, and which
> parts are purely for identification (which the kernel thus ought to
> be able to ignore).

We don't have control over the meta-data formats being used by the
industry.  Coming up with a solution that only works for "Linux
Engineered Meta-data formats" removes any possibility of supporting
things like DDF, Adaptec ASR, and a host of other meta-data formats
that can be plugged into things like EMD.  In the two cases we are
supporting today with EMD, the records required for doing discovery
reside in the same sectors as those that need to be updated at runtime
from some "in-core" context.  There is no clean "split" to exploit.

> The main point I'm trying to get across here is that DM provides a
> simple yet extensible kernel framework for a variety of storage
> management tasks, including a lot more than just RAID.  I think it
> would be a huge benefit for the RAID drivers to make use of this
> framework to provide functionality beyond what is currently
> available.

DM is a transform layer that has the ability to pause I/O while that
transform is updated from userland.  That's all it provides.  As such,
it is perfectly suited to some types of logical volume management
applications, but that is as far as it goes.  It has no support for
"sync/resync/scrub" type operations, nor any generic support for
handling meta-data.  In all of the examples you have presented so far,
you have not explained how this part of the equation is handled.

Sure, adding a member to a RAID1 is trivial: pause the I/O, update the
transform, and let it go.  Unfortunately, that new member is not in
sync with the rest.  The transform must be aware of this and only
trust the new member for regions below the sync mark.  How is this
information communicated to the transform?  Who updates the sync mark?
Who copies the data to the new member while guaranteeing that an
in-flight write does not occur to the area being synced?  If you
intend to add all of this to DM, then it is no longer any "simpler" or
more extensible than EMD.
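
To make that concrete, here is a minimal sketch of the bookkeeping the
mirror transform itself ends up needing.  The names are hypothetical
(this is not dm-mirror or EMD code), and it is written as plain
userspace C purely to show the shape of the problem:

    /*
     * Sketch only: why a "dumb" transform is not enough once a
     * resyncing member exists.  Reads must not trust the new member
     * above the sync mark, and writes must not race the resync
     * copier.
     */
    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef uint64_t sector_t;

    struct member {
            int fd;                 /* backing device */
            int in_sync;            /* nonzero once fully resynced */
    };

    struct mirror {
            struct member    m[2];         /* m[0] original, m[1] new */
            sector_t         sync_mark;    /* resync progress */
            pthread_rwlock_t region_lock;  /* writes vs. resync copier */
    };

    /* A read may use the new member only below the sync mark. */
    static struct member *read_target(struct mirror *mr, sector_t s)
    {
            if (mr->m[1].in_sync || s < mr->sync_mark)
                    return &mr->m[1];  /* either is valid; real code
                                          would balance across both */
            return &mr->m[0];          /* only the original is valid */
    }

    /*
     * Writes take the lock shared; the resync copier takes it
     * exclusive around each region it copies, then advances and
     * persists the sync mark.  An in-flight write can therefore never
     * land underneath the copier.
     */
    static void mirror_write(struct mirror *mr, sector_t s,
                             const void *buf, size_t len)
    {
            pthread_rwlock_rdlock(&mr->region_lock);
            /* ... pwrite() the data to every member ... */
            pthread_rwlock_unlock(&mr->region_lock);
    }

None of that state lives in the table a user loads.  Someone in the
kernel has to own the sync mark, persist it, and coordinate the copier
with new writes (per-region in a real stack, not one global lock) -
exactly the machinery MD/EMD already has and DM does not.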
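
The meta-data handling has the same shape.  As a rough sketch (again
hypothetical names, loosely EMD-flavored, not actual EMD code), the
entry points a pluggable format module for something like DDF or ASR
ends up needing look about like this - note how small the discovery
piece is next to the runtime pieces that operate on the very same
sectors:

    /*
     * Hypothetical sketch of a pluggable meta-data module's entry
     * points -- not actual EMD code.  Discovery is one small hook;
     * the rest are runtime operations that parse and rewrite the same
     * on-disk records, so pushing discovery out to userland
     * duplicates that code rather than removing it.
     */
    #include <stdint.h>

    typedef uint64_t sector_t;
    struct array_conf;      /* in-core description of the array */
    struct event_record;    /* DDF-style event log entry */

    struct metadata_ops {
            const char *name;       /* "ddf", "asr", ... */

            /* discovery: recognize our records on a device (~10%) */
            int (*probe)(int fd);

            /* runtime: same sectors, updated from in-core context */
            int (*read_conf)(int fd, struct array_conf *c);
            int (*write_sync_mark)(int fd, sector_t mark);
            int (*log_event)(int fd, const struct event_record *e);
            int (*log_bad_sector)(int fd, sector_t s);
    };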
Don't take my arguments the wrong way.  I believe that DM is useful
for what it was designed for: LVM.  It does not, however, provide the
machinery required for it to replace a generic RAID stack.  Could you
merge a RAID stack into DM?  Sure.  It's only software.  But for it to
be robust, the same types of operations MD/EMD perform in kernel space
will have to be done there too.  The simplicity of DM is part of why
it is compelling.  My belief is that merging RAID into DM will
compromise this simplicity and divert DM from what it was designed to
do - provide LVM transforms.

As for RAID discovery, that is the trivial portion of RAID.  For an
extra 10% or less of code in a meta-data module, you get RAID
discovery.  You also get a single point of access to the meta-data,
and you avoid duplicated code and complex kernel/user interfaces.
There seems to be a consistent feeling that it is worth compromising
all of these benefits just to push this 10% of the meta-data handling
code out of the kernel (and inflate it 5 or 6X by duplicating code
already in the kernel).  Where are the benefits of this userland
approach?

--
Justin