Hi,

On 4/17/20 9:42 AM, Michael Stapelberg wrote:
> Hey,
>
> I’m starting to use LVM (+LUKS) on a computer of mine, but ran into
> trouble getting it to work.
>
> The issue I’m running into is that systemd boot hangs until the
> default unit timeout elapses. This is because the cryptroot device is
> not found, which in turn is because udev doesn’t create the symlinks
> (e.g. in /dev/disk/by-uuid). udevadm info shows:
>
> # udevadm info -p /sys/block/dm-0
> P: /devices/virtual/block/dm-0
> N: dm-0
> L: 0
> E: DEVPATH=/devices/virtual/block/dm-0
> E: DEVNAME=/dev/dm-0
> E: DEVTYPE=disk
> E: MAJOR=254
> E: MINOR=0
> E: SUBSYSTEM=block
> E: USEC_INITIALIZED=6522555
> E: DM_UDEV_DISABLE_SUBSYSTEM_RULES_FLAG=1
> E: DM_UDEV_DISABLE_DISK_RULES_FLAG=1
> E: DM_UDEV_DISABLE_OTHER_RULES_FLAG=1
> E: SYSTEMD_READY=0
> E: TAGS=:systemd:
>
> I pinpointed this result to udev rule
> https://sourceware.org/git/?p=lvm2.git;a=blob;f=udev/10-dm.rules.in;hb=ecae76c713bd4fa6c9d8f2a2c990625e4f38b504#l87,
> i.e.:
> ENV{DM_UDEV_RULES_VSN}!="1", ENV{DM_UDEV_PRIMARY_SOURCE_FLAG}!="1",
> GOTO="dm_disable"
>
> I assume I’m running into this rule because I’m using a custom initrd
> which does not run systemd nor udev. Instead, my initrd is directly
> calling vgchange -ay and vgmknodes.
>
> I understand that this is not a common setup, but booting without
> systemd/udev in the initrd should be supported, no?
>

You hit the painful spot here! Unfortunately, we don't support this case
with the existing rules. It's not that we wouldn't like to see this case
supported, but the issue is in recognizing the uevents. To explain why
this makes sense in a way, I need to be a little bit wordy here, sorry
for that in advance...
Device-mapper device activation consists of three steps for which
different uevents are generated:

  - DM device creation (ADD uevent)
  - DM table load (no uevent)
  - DM device resume, which also activates the mapping as described by
    the table (CHANGE uevent)

Right after the first step (with the ADD uevent), the device is not
usable yet, obviously, because it has no table loaded yet. So we need to
make sure that no udev rule causes this device to be accessed at this
point in time. One of the elementary udev rules is a call to "blkid",
which scans the device and extracts metadata information; the
/dev/disk/by-* content is created from it and other udev rules can act
further based on it. That's why we need to postpone this device access
within udev rule processing until we're sure the device is ready, that
is, after the CHANGE uevent when the table is made active.

On the other hand, we have coldplugging (calling "udevadm trigger
--action=add"). At boot, coldplugging is used to make up for all the
devices that were activated before udevd is started from the root fs
(to make udevd aware of the devices which were handled inside the
initrd). These "coldplug uevents" are in essence indistinguishable from
other ADD uevents - there's no mark or flag saying this uevent is coming
from the coldplug. And that is exactly the problematic part - we don't
know whether this is the coldplug's ADD uevent AFTER we did the proper
activation sequence or a spurious ADD uevent that comes before the
device is properly activated. We simply don't know.

To alleviate this problem, when a DM device is being activated, that is,
when libdevmapper in userspace calls the create + table load + device
resume sequence, it also provides DM_UDEV_PRIMARY_SOURCE_FLAG=1, which
is attached to the "resume device" call (...this flag then appears in
the uevent that the "resume device" call causes inside the kernel). Once
we have a uevent with this flag set, the flag is stored in the udev
database.
When we're processing any subsequent uevent, we then know we have
already passed this activation sequence correctly. This also applies to
processing any "coldplug uevents" - we simply look at the udev database
content and if it has that flag set (that's exactly the
IMPORT{db}=DM_UDEV_PRIMARY_SOURCE_FLAG call that you can also see in
10-dm.rules), we know we can just rerun udev rules for such uevents as
the device has already gone through the activation sequence properly.

Now, if we have an initrd completely without udev and then switch over
to the root fs where we have udevd running, we get into the problem you
are hitting here:

  - the device is activated in the initrd without udev (so we have no
    udev db record about this device)
  - switching over to the root fs
  - running udevd
  - running coldplug (udevadm trigger --action=add)
  - udev rules reacting to coldplug uevents
  - 10-dm.rules trying to import the DM_UDEV_PRIMARY_SOURCE_FLAG, but
    since there was no udevd to record this information inside the
    initrd, we conclude the device has not yet passed the activation
    sequence correctly and this is just a spurious uevent, hence
    ignoring it - and that's exactly what you see.

You can also simulate this problem by executing:

  - udevadm info --cleanup-db
  - udevadm trigger --action=add

...which gets you into exactly the same situation (do that only on a
test system :) ).

However...

When it comes to improving uevent recognition, there's a kernel patch I
did back in 2017 which adds SYNTH_UUID (and other possible SYNTH_*
variables) to synthetic/coldplug uevents:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f36776fafbaa0094390dd4e7e3e29805e0b82730

There are also userspace patches for systemd/udevd (which still need
some polishing before the systemd guys take them):

  https://github.com/systemd/systemd/pull/13881

With this in, we could be in a better position to fix udev rules too.
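
To make the flag logic described above easier to follow, here is a
minimal Python sketch of the decision. It is a toy model only, not the
real udev or 10-dm.rules implementation: the function process_uevent,
the dict used as a stand-in for the udev database, and the "process" /
"ignore" return values are all hypothetical names introduced for
illustration. It models just the DM_UDEV_PRIMARY_SOURCE_FLAG check and
ignores the DM_UDEV_RULES_VSN condition and everything else the real
rules do.

```python
# Toy model of the DM_UDEV_PRIMARY_SOURCE_FLAG decision (assumption:
# heavily simplified, not the actual udev rule engine).

PRIMARY_FLAG = "DM_UDEV_PRIMARY_SOURCE_FLAG"


def process_uevent(udev_db, devname, env):
    """Decide whether rules may touch the device for one uevent.

    udev_db -- dict standing in for the udev database (hypothetical)
    devname -- kernel device name, e.g. "dm-0"
    env     -- variables carried by the uevent itself
    Returns "process" or "ignore".
    """
    record = udev_db.get(devname, {})
    # Stand-in for IMPORT{db}: fall back to a flag stored by an
    # earlier uevent when this uevent does not carry it.
    flag = env.get(PRIMARY_FLAG) or record.get(PRIMARY_FLAG)
    if flag != "1":
        # Stand-in for GOTO="dm_disable": the table may not be active.
        return "ignore"
    # Remember that the activation sequence has completed.
    udev_db.setdefault(devname, {})[PRIMARY_FLAG] = "1"
    return "process"


# Case 1: udevd was running during activation.
db = {}
print(process_uevent(db, "dm-0", {}))                   # kernel ADD
print(process_uevent(db, "dm-0", {PRIMARY_FLAG: "1"}))  # CHANGE on resume
print(process_uevent(db, "dm-0", {}))                   # coldplug ADD

# Case 2: device activated in an initrd without udev, so the
# database is empty when the coldplug ADD arrives.
print(process_uevent({}, "dm-0", {}))                   # coldplug ADD
```

In the first case the coldplug ADD is accepted because the flag was
recorded when the CHANGE uevent was processed; in the second case there
is no record to import from, so the coldplug ADD is treated like a
spurious early ADD and ignored.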
> I’m not sure where DM_UDEV_PRIMARY_SOURCE_FLAG is supposed to be set,
> or why it isn’t set in my scenario. Do you have any ideas regarding
> what I could check?
>

As described above, it's set by libdevmapper: libdevmapper passes it
through the DM ioctl to the kernel, the kernel generates the uevent with
this flag, and udevd receives the uevent with the flag set. Any
subsequent uevents reimport this flag from the existing udev database
records.

> Thanks in advance,
> Best regards,
> Michael
>
> PS: As a workaround, I’m just commenting out that rule. Does that have
> any negative consequences?
>

Yes, there's a race because of the 3-step sequence to activate a DM
device. By commenting out that rule, you make it possible to access a DM
device whose table is not yet loaded and made active (hence an unusable
device). If you're lucky, by the time the ADD event is processed, the
"load table + resume" part may have already executed, because it takes
some time for udevd to react to uevents, but that is not always the
case. If you're not lucky, you can get non-deterministic behavior (the
blkid scan will fail, various other records in udev may be set
incorrectly based on that, etc.).

-- 
Peter