Re: [PATCH 1/6] 11-dm-mpath.rules: don't import properties that are already set

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 06, 2024 at 11:50:46AM +0100, Martin Wilck wrote:
> On Mon, 2024-02-05 at 15:44 -0500, Benjamin Marzinski wrote:
> > On Mon, Feb 05, 2024 at 01:46:33PM +0100, Martin Wilck wrote:
> > > DM_UDEV_DISABLE_OTHER_RULES_FLAG and DM_NOSCAN may be already set
> > > from previous rules, e.g. if the device is suspended. Make sure
> > > we don't overwrite them.
> > > 
> > > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY are only
> > > used in this file, and not used in the scan_import code path.
> > > 
> > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> > > ---
> > >  multipath/11-dm-mpath.rules | 15 +++++++++++----
> > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-
> > > mpath.rules
> > > index c339f52..2c4d006 100644
> > > --- a/multipath/11-dm-mpath.rules
> > > +++ b/multipath/11-dm-mpath.rules
> > > @@ -2,12 +2,19 @@ ACTION!="add|change", GOTO="mpath_end"
> > >  ENV{DM_UDEV_RULES_VSN}!="?*", GOTO="mpath_end"
> > >  ENV{DM_UUID}!="mpath-?*", GOTO="mpath_end"
> > >  
> > > -IMPORT{db}="DM_DISABLE_OTHER_RULES_FLAG_OLD"
> > > -IMPORT{db}="MPATH_DEVICE_READY"
> > > -
> > >  # If this uevent didn't come from dm, don't try to update the
> > >  # device state
> > > -ENV{DM_COOKIE}!="?*", ENV{DM_ACTION}!="PATH_*",
> > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG",
> > > IMPORT{db}="DM_NOSCAN", GOTO="scan_import"
> > > +ENV{DM_COOKIE}=="?*", GOTO="check_ready"
> > > +ENV{DM_ACTION}=="PATH_*", GOTO="check_ready"
> > > +
> > > +ENV{DM_UDEV_DISABLE_OTHER_RULES_FLAG}!="?*",
> > > IMPORT{db}="DM_UDEV_DISABLE_OTHER_RULES_FLAG"
> > > +ENV{DM_NOSCAN}!="?*", IMPORT{db}="DM_NOSCAN"
> > > +GOTO="scan_import"
> > > +
> > 
> > If we do this, don't we forget the values for
> > DM_DISABLE_OTHER_RULES_FLAG_OLD and MPATH_DEVICE_READY whenever we
> > get a
> > non-dm uevent? If we skip importing them for a uevent, they're
> > dropped
> > from the database, which means that on the next dm-originated uevent
> > we
> > won't be able to get the old values. right?
> 
> Right, I didn't consider that. It makes sense for MPATH_DEVICE_READY.
> 
> However, while at it, I wonder about our rationale for saving
> DM_UDEV_DISABLE_OTHER_RULES_FLAG in DM_DISABLE_OTHER_RULES_FLAG_OLD.
> 
> In 10-dm.rules, DM_DISABLE_OTHER_RULES_FLAG changes its value only
>  - in DM_UDEV_PRIMARY_SOURCE_FLAG=1 events, according to the value of
> DM_SUSPENDED
>  - for DISK_RO=1 events (switches the flag on) 
> 
> (11-dm-lvm.rules has some additional logic that doesn't matter for
> multipath).
> 
> For all other events, the flag is restored from the udev db in 10-
> dm.rules. Ignoring DISK_RO, it always has the value that DM_SUSPENDED
> had in the last DM_UDEV_PRIMARY_SOURCE_FLAG=1 (aka map reload) event.
> 
> The logic in 11-dm-mpath.rules adds a check for unusable arrays
> on top, setting DM_DISABLE_OTHER_RULES_FLAG if MPATH_DEVICE_READY
> switches from 1 to 0. In this case we save the previous value in
> DM_DISABLE_OTHER_RULES_FLAG_OLD, and restore it from there in a later
> event if MPATH_DEVICE_READY switches back from 0 to 1.
> 
> IMO the following can happen:
> 
>  1. an event arrives while DM_SUSPENDED=1, causing 
> DM_DISABLE_OTHER_RULES_FLAG=1 to be set
>  2. 11-dm-mpath.rules sees no paths and saves
> DM_DISABLE_OTHER_RULES_FLAG=1 to DM_DISABLE_OTHER_RULES_FLAG_OLD
>  3. an event arrives after the device has resumed, DM_SUSPENDED and
> DM_DISABLE_OTHER_RULES_FLAG are cleared in 10-dm.rules
>  4. 11-dm-mpath.rules sees MPATH_DEVICE_READY=1 and restores
> DM_DISABLE_OTHER_RULES_FLAG, setting it to 1
> 
> ... and this would be wrong, no?
> 
> It seems to me that we should not save DM_DISABLE_OTHER_RULES_FLAG_OLD
> between uevents. We must set DM_DISABLE_OTHER_RULES_FLAG=1 to avoid
> upper layer probing if there are no paths, but I suppose we should
> restore the previous value after udev rule execution, e.g. in 99-dm-
> mpath.rules:
> 
> ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=="?*", \
>    ENV{DM_DISABLE_OTHER_RULES_FLAG}=$env{DM_DISABLE_OTHER_RULES_FLAG_OLD}, \
>    ENV{DM_DISABLE_OTHER_RULES_FLAG_OLD}=""
> 
> Am I confusing stuff here?

Nope. That makes a lot of sense.
 
> Unfortunately, testing of my patch series has shown that it's an
> improvement, but it isn't sufficient for completely avoiding races
> between multipathd and udev. DM_SUSPENDED=1 can be avoided, but it's
> still possible that the device gets suspended after the udev rules test
> for supended state and before they run kpartx, blkid, pvscan, or other
> similar commands.
> 
> I am quite clueless about this right now, and would be grateful for
> ideas. Re-implementing LOCK_EX locking would be a possibility for
> systemd >= 250, as noted in the cover letter of the series. But for
> systemd <= 249, I don't see good options. We can't use LOCK_EX, because
> when we release the lock, we have no means to know whether or not a
> race with udev occurred (iow, whether udev tried to access the device
> while we held the lock, failed, and dropped the event). Thus we'd need
> to assume that this was the case, and force a reload after each reload,
> which is obvious nonsense. We also have no means to know whether the
> full set of rules has ever been run by udev for the device at hand,
> because we don't know the set of rules that follow after 11-dm-
> mpath.rules.

multipathd already refuses to reload newly created devices before it
sees the creation uevents to try to avoid this.  I assume that the
problem you are seeing is on the coldplug after the pivot root, where
the devices already exist, or am I confused here.  I wonder if we can do
something similar where we would ideally delay all device reloads until
after the coldplug.  The problem is figuring out when it's finished.

-Ben

> Thanks,
> Martin





[Index of Archives]     [Gluster Users]     [Kernel Development]     [Linux Clusters]     [Device Mapper]     [Security]     [Bugtraq]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]

  Powered by Linux