Hi Zdenek, Peter, David, Ben,

I have been pondering the device-mapper udev rules a lot lately, and I believe I have found glitches in the logic of the device-mapper udev flags that I'd like to bring to your attention.

TL;DR: changes I propose:

 * use DM_DISABLE_OTHER_RULES_FLAG consistently as a flag meaning "upper layers should leave this device alone until told otherwise", saved between uevents in the udev db
 * use DM_SUSPENDED consistently as a flag meaning "upper layers should keep their hands off this device temporarily", not saved in the udev db
 * don't use any other flags in upper layers, including 13-dm-disk.rules

This implies:

 * stop setting DM_DISABLE_OTHER_RULES_FLAG from DM_SUSPENDED in 10-dm.rules
 * check DM_DISABLE_OTHER_RULES_FLAG instead of DM_NOSCAN in 13-dm-disk.rules
 * check DM_SUSPENDED in 69-dm-lvm.rules, 80-udisks2.rules
 * stop using DM_NOSCAN in 11-dm-mpath.rules and 66-kpartx.rules

The full story follows; I apologize for the lengthy post.

# DM_DISABLE_OTHER_RULES_FLAG

This flag serves multiple purposes:

1) It is set by LVM itself directly for certain types of logical volumes, such as thin pools, and passed through to the udev rules using DM_COOKIE.
2) It is set by 10-dm.rules if the device in question is found suspended (DM_SUSPENDED=1), or if DISK_RO is set.
3) For spurious events, 10-dm.rules restores it from the udev db.
4) It is set by 11-dm-lvm.rules if LVM's internal "noscan" flag (DM_SUBSYSTEM_UDEV_FLAG0) was set. This applies only to actual LVs, not to generic DM devices. On the first subsequent event that has DM_SUBSYSTEM_UDEV_FLAG0 cleared (usually the very next event), DM_DISABLE_OTHER_RULES_FLAG is restored from DM_UDEV_DISABLE_OTHER_RULES_FLAG_OLD.
5) It is used by the 11-dm-mpath rules in a similar way. MPATH_DEVICE_READY=0 implies DM_DISABLE_OTHER_RULES_FLAG=1, but not vice versa. My latest multipath patch series changes the treatment of DM_DISABLE_OTHER_RULES_FLAG by no longer saving the override in the db.
6) It is consumed by later rules as a sort of generic "don't touch this device" flag.

IMO the fact that the same flag is used for various different conditions is problematic, in particular because the value of the flag is saved in the udev db between uevents. The flag can only be cleared if a genuine libdm event arrives that has this flag cleared while the device is not suspended (and DISK_RO is not set; AFAICT, DISK_RO is never set for genuine libdm events).

Saving the flag is probably correct for cases in which the flag has been set via the cookie (I'm not LVM expert enough to tell with certainty). But saving it when it has been set from DM_SUSPENDED seems wrong: DM_SUSPENDED is tested anew for every uevent, but DM_UDEV_DISABLE_OTHER_RULES_FLAG is not cleared when the device is found not to be suspended. Therefore I think the two cases 1) and 2) above must be differentiated.

DISK_RO is different yet again; I have to say I am unsure under what circumstances this flag is set at all. It doesn't seem to be set for a read-only LVM LV, for example. The kernel sets it only when the disk's "ro" attribute is _toggled_, and no udev rule imports it from the db, so it is temporary at best.

# DM_NOSCAN

This flag is also used in different ways.

- In 11-dm-lvm.rules, it doesn't seem to be meant for later rules to consume. Rather, it is used to remember the fact that the previous uevent had the DM_SUBSYSTEM_UDEV_FLAG0 (aka LVM "noscan") flag set. It is only imported from the db if DM_SUBSYSTEM_UDEV_FLAG0 is clear, and is then cleared immediately. In later rules, the semantics of DM_NOSCAN are "DM_SUBSYSTEM_UDEV_FLAG0 was set in this event", and not "this device is not accessible, don't attempt IO". 11-dm-lvm.rules sets DM_DISABLE_OTHER_RULES_FLAG to indicate non-accessibility.
- The modification of DM_DISABLE_OTHER_RULES_FLAG seems wrong if DM_DISABLE_OTHER_RULES_FLAG is set from DM_SUSPENDED, but if we drop that behavior as described above, I think we can leave it at that.
- But that would mean that later rules should actually evaluate DM_DISABLE_OTHER_RULES_FLAG and not DM_NOSCAN, unless they want the precise semantics described above (which doesn't make much sense for layers further up the stack, IMO).
- 11-dm-mpath.rules gives DM_NOSCAN different semantics. It basically makes DM_NOSCAN=1 equivalent to MPATH_DEVICE_READY=0; moreover, MPATH_DEVICE_READY=0 implies DM_DISABLE_OTHER_RULES_FLAG=1.
- The consumers of DM_NOSCAN assume "don't attempt IO" semantics.

# Usage on upper layers

The only condition that is relevant to layers above LVM/multipath is whether they are allowed to attempt I/O on the device and read metadata to derive further properties, create symlinks, activate devices, etc. We currently have 3 properties with vaguely these semantics:

- DM_DISABLE_OTHER_RULES_FLAG (consumed in 66-kpartx.rules, 69-dm-lvm.rules, 80-udisks2.rules, 99-systemd.rules)
- DM_NOSCAN (consumed in 13-dm-disk.rules, 66-kpartx.rules)
- DM_SUSPENDED (consumed in 13-dm-disk.rules, 66-kpartx.rules, 99-systemd.rules)

These properties differ in the way they are remembered between uevents, but to upper layers they have a similar meaning. However, they are not equivalent.

AFAICS, DM_NOSCAN=1 implies DM_DISABLE_OTHER_RULES_FLAG=1. I can't think of a situation where it would be beneficial to attempt IO with DM_DISABLE_OTHER_RULES_FLAG=1 and DM_NOSCAN!=1. So basically upper layers should simply ignore DM_NOSCAN. That also means that 11-dm-mpath.rules doesn't need to touch DM_NOSCAN any more.

DM_SUSPENDED is different, because it is a temporary condition that is re-evaluated on every uevent. Also, it may have different implications; see e.g. 99-systemd.rules.

Upper layers should use the condition DM_DISABLE_OTHER_RULES_FLAG=1 || DM_SUSPENDED=1 to check if IO is impossible on a device. That means that we should not set DM_DISABLE_OTHER_RULES_FLAG=1 from DM_SUSPENDED=1; we should treat these two as independent conditions.
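
To make the proposed upper-layer condition concrete, here is a minimal sketch that models it on a plain dict of udev properties. It is an illustration only, not code from any of the rules files: the function name io_allowed and the dict representation are hypothetical, and the property names are spelled the way they are abbreviated in this post.

```python
# Sketch only: models the proposed check
#   DM_DISABLE_OTHER_RULES_FLAG=1 || DM_SUSPENDED=1  =>  no I/O
# on a dict of udev properties. io_allowed() and the dict form are
# hypothetical; real consumers would express this in udev rules.

def io_allowed(props):
    """Return True if an upper layer may attempt I/O on the device."""
    # Persistent condition: "leave this device alone until told otherwise".
    disabled = props.get("DM_DISABLE_OTHER_RULES_FLAG") == "1"
    # Temporary condition, re-evaluated on every uevent.
    suspended = props.get("DM_SUSPENDED") == "1"
    # DM_NOSCAN is deliberately not consulted under this proposal.
    return not (disabled or suspended)


if __name__ == "__main__":
    for props in (
        {},
        {"DM_SUSPENDED": "1"},
        {"DM_DISABLE_OTHER_RULES_FLAG": "1"},
        {"DM_DISABLE_OTHER_RULES_FLAG": "1", "DM_NOSCAN": "1"},
    ):
        print(props, "->", io_allowed(props))
```

The point of keeping the two tests separate is that each flag keeps its own lifetime: one is saved in the udev db, the other is not, and neither is derived from the other.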

I'd find it somewhat cleaner if we didn't mess with DM_DISABLE_OTHER_RULES_FLAG in 11-dm-lvm.rules and 11-dm-mpath.rules at all. This way the flag could preserve exactly the meaning it had been given by the sender of the last cookie. But that would mean adding new properties and making appropriate changes in upper layers, which we don't control. I am not sure if we want to do that.

I observe that in the lvm2 C code, DM_DISABLE_OTHER_RULES_FLAG is equivalent to DM_DISABLE_DISK_RULES_FLAG, so the latter flag may give us the cookie value, if we agree to keep this behavior in the future.

Regards
Martin