On 05/01/2011 08:22 PM, NeilBrown wrote:
> However if there is another layer in between md and the filesystem -
> such as dm - then there can be a problem.
> There is no mechanism in the kernel for md to tell dm that things have
> changed, so dm never changes its configuration to match any change in
> the config of the md device.
>
> A filesystem always queries the config of the device as it prepares
> the request. As this is not an 'active' query (i.e. it just looks at
> variables, it doesn't call a function) there is no opportunity for dm
> to then query md.

Thanks for this followup, Neil.

Just to clarify, it sounds like any one of the following situations on its own is *not* problematic from the kernel's perspective:

 0) having a RAID array that is more often in a de-synced state than in a fully synced state

 1) mixing various types of disk in a single RAID array (e.g. SSD and spinning metal)

 2) mixing various disk access channels within a single RAID array (e.g. USB and SATA)

 3) putting other block device layers (e.g. loopback, dm-crypt, dm (via lvm or otherwise)) above md and below a filesystem

 4) hot-adding a device to an active RAID array from which filesystems are mounted

However, having any layers between md and the filesystem becomes problematic if the array is re-synced while the filesystem is online, because the intermediate layer can't communicate $SOMETHING (what, specifically?) from md to the kernel's filesystem code.

As a workaround, would the following sequence of actions (perhaps impossible for any given machine's operational state) allow a RAID re-sync without the errors jrollins reports and without requiring a reboot?

 a) unmount all filesystems which ultimately derive from the RAID array
 b) hot-add the device with mdadm
 c) re-mount the filesystems

Or would something else need to be done with lvm (or cryptsetup, or the loopback device) between steps b and c? (See the command sketch in the P.S. below.)

Coming at it from another angle: is there a way that an admin can ensure that the RAID array can be re-synced without unmounting the filesystems, other than limiting themselves to exactly the same models of hardware for all components in the storage chain?

Alternatively, is there a way to manually inform a given mounted filesystem that it should change $SOMETHING (what?), so that an aware admin could keep filesystems online by issuing this instruction before a RAID re-sync?

From a modular-kernel perspective: is this specifically a problem with md itself, or would it also arise with other block-device layering in the kernel? For example, suppose an admin has (without md) lvm over a bare disk, and a filesystem mounted from an LV. The admin then adds a second bare disk as a PV to the VG, and uses pvmove to transfer the physical extents of the active filesystem to the new disk while it is mounted (also sketched in the P.S. below). Assuming that the new disk doesn't have the same characteristics (which characteristics?), does the fact that LVM sits between the underlying disk and the filesystem cause the same problem? What if dm-crypt sits between the disk and lvm? Between lvm and the filesystem? What if the layering is disk-dm-md-fs instead of disk-md-dm-fs?

Sorry for all the questions without having much concrete to contribute at the moment. If these limitations are actually well-documented somewhere, I would be grateful for a pointer. As a systems administrator, I would be unhappy to be caught out by some as-yet-unknown constraints during a hardware failure; I'd like to at least know my constraints beforehand.

Regards,

--dkg
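
P.S. To make the workaround question above concrete, here is the sort of command sequence I have in mind, assuming a hypothetical lvm-over-dm-crypt-over-md layout with /dev/md0, a LUKS mapping named cryptmd, a volume group vg0, and a new member device /dev/sdc1. All of these names are made up for illustration, and this is a sketch of my question, not a recipe I know to be correct:

  # a) take the stack offline, top to bottom
  umount /mnt/data
  vgchange -an vg0                  # only if lvm sits above the array
  cryptsetup luksClose cryptmd      # only if dm-crypt sits above the array

  # b) hot-add the replacement device and let the re-sync finish
  mdadm --manage /dev/md0 --add /dev/sdc1
  mdadm --wait /dev/md0             # or watch /proc/mdstat

  # c) bring the stack back up, bottom to top, then re-mount
  cryptsetup luksOpen /dev/md0 cryptmd
  vgchange -ay vg0
  mount /dev/vg0/data /mnt/data

The vgchange/cryptsetup steps are exactly the part I'm unsure about: are they needed for dm to pick up the md device's new configuration, or is unmounting and re-mounting the filesystem alone enough?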
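
Similarly, the lvm-only pvmove scenario I'm asking about would look something like this (again with made-up names: volume group vg0, old PV /dev/sdb1, new PV /dev/sdd1), with the filesystem on the LV staying mounted the whole time:

  pvcreate /dev/sdd1            # prepare the new disk as a PV
  vgextend vg0 /dev/sdd1        # add it to the existing VG
  pvmove /dev/sdb1 /dev/sdd1    # migrate extents off the old disk, online

Does that online migration run into the same problem as the md re-sync case?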