On Thu, 2024-02-08 at 19:30 -0500, Benjamin Marzinski wrote: > On Tue, Feb 06, 2024 at 11:50:46AM +0100, Martin Wilck wrote: > > > > Unfortunately, testing of my patch series has shown that it's an > > improvement, but it isn't sufficient for completely avoiding races > > between multipathd and udev. DM_SUSPENDED=1 can be avoided, but > > it's > > still possible that the device gets suspended after the udev rules > > test > > for supended state and before they run kpartx, blkid, pvscan, or > > other > > similar commands. > > > > I am quite clueless about this right now, and would be grateful for > > ideas. Re-implementing LOCK_EX locking would be a possibility for > > systemd >= 250, as noted in the cover letter of the series. But for > > systemd <= 249, I don't see good options. We can't use LOCK_EX, > > because > > when we release the lock, we have no means to know whether or not a > > race with udev occurred (iow, whether udev tried to access the > > device > > while we held the lock, failed, and dropped the event). Thus we'd > > need > > to assume that this was the case, and force a reload after each > > reload, > > which is obvious nonsense. We also have no means to know whether > > the > > full set of rules has ever been run by udev for the device at hand, > > because we don't know the set of rules that follow after 11-dm- > > mpath.rules. > > multipathd already refuses to reload newly created devices before it > sees the creation uevents to try to avoid this. Right, thanks for reminding me about that. > I assume that the > problem you are seeing is on the coldplug after the pivot root, where > the devices already exist, or am I confused here. No, that's exactly the situation in which the problem is observed. > I wonder if we can do > something similar where we would ideally delay all device reloads > until > after the coldplug. The problem is figuring out when it's finished. The current state of affairs is that patch 4 in my series has avoided the issue on coldplug events, greatly reducing the frequency of errors on the test sysstem. There's a variant that needs an additional fix: 1 map set up in initrd 2 creation uevent event is observed by multipathd 3 switch root 4 path addition, map reload 5 change event 6 path addition, another map reload (5-6 can repeat) 7 DM_SUSPENDED is observed by udev while handling the change event, and saved as DM_UDEV_DISABLE_OTHER_RULES_FLAG 8 Coldplug event, DM_UDEV_DISABLE_OTHER_RULES_FLAG is restored from udev db => pvscan is skipped I've created another patch to fix this situation, and am now waiting for for test feedback. I'll post about this in more detail later. In short, I believe 10-dm.rules should differentiate DM_UDEV_DISABLE_OTHER_RULES_FLAG originating from lvm itself and DM_SUSPENDED. Regards, Martin