On Tue, Dec 13, 2022 at 08:46:28AM +0100, Martin Wilck wrote: > On Tue, 2022-12-06 at 18:37 -0600, Benjamin Marzinski wrote: > > On Sat, Dec 03, 2022 at 12:43:38AM +0100, mwilck@xxxxxxxx wrote: > > > From: Martin Wilck <mwilck@xxxxxxxx> > > > > > > To check whether we will be able to add a given device can be part > > > of a multipath map, we have two tests in check_path_valid(): > > > released_to_systemd() and the O_EXCL test. The former isn't helpful > > > if "multipath -u" is called for the first time for a given device, > > > and the latter is only used in the "find_multipaths smart" case, > > > because > > > actively opening the device with O_EXCL, even for a very short > > > time, is prone > > > to races with other processes. > > > > > > It turns out that this may cause issues in some scenarios. We saw > > > problems in > > > once case where "find_multipaths greedy" was used with a single > > > non-multipahted root disk and a very large number of multipath > > > LUNs. > > > The root disk would first be classified as multipath device. > > > multipathd > > > would try to create a map, fail (because the disk was mounted) and > > > trigger another uevent. But because of the very large number of > > > multipath > > > devices, this event was queued up behind thousands of other events, > > > and > > > the root device timed out eventually. > > > > > > While a simple workaround for the given problem would be proper > > > blacklisting > > > or using a different find_multipaths mode, I am proposing a > > > different > > > solution here. An additional test is added in is_path_valid() which > > > checks whether the given device is currently in use by 1. sysfs > > > holders, > > > 2. mounts (from /proc/self/mountinfo) or 3. swaps (from > > > /proc/swaps). 2. > > > and 3. are similar to systemd's device detection after switching > > > root. > > > This must not only be done for the device itself, but also for all > > > its > > > partitions. For mountinfo and swaps, libmount is utilized. > > > > > > With this patch, "multipath -u" will make devices with mounted or > > > otherwise > > > used partitions available to systemd early, without waiting for > > > multipathd > > > to fail setting up the map and re-triggering an uevent. This should > > > avoid > > > the issue described above even without blacklisting. The downside > > > of it > > > is a longer runtime of "multipath -u" for almost all devices, in > > > particular > > > for real multipath devices. The runtime required for the new checks > > > was in the > > > order of 0.1ms-1ms in my tests. Moreover, there is a certain risk > > > that devices may > > > wrongly classified as non-multipath because of transient mounts or > > > holders > > > created by other processes. > > > > > > To make this code compile on older distributions, we need some > > > additional > > > checks in create-config.mk. > > > > > Two small issues below. > > > > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> > > > --- > > > .github/workflows/build-and-unittest.yaml | 2 +- > > > create-config.mk | 11 +- > > > libmpathutil/libmpathutil.version | 6 + > > > libmpathutil/util.c | 12 + > > > libmpathutil/util.h | 2 + > > > libmultipath/Makefile | 2 +- > > > libmultipath/alias.c | 11 - > > > libmultipath/valid.c | 267 > > > ++++++++++++++++++++++ > > > tests/Makefile | 2 +- > > > tests/valid.c | 48 ++++ > > > 10 files changed, 348 insertions(+), 15 deletions(-) > > > > > > > > > +static int read_partitions(const char *syspath, vector parts) > > > +{ > > > + struct scandir_result sr = { .n = 0 }; > > > + char path[PATH_MAX], *last; > > > + char *prop; > > > + int i; > > > + > > > + strlcpy(path, syspath, sizeof(path)); > > > + sr.n = scandir(path, &sr.di, subdir_filter, NULL); > > > + if (sr.n == -1) > > > + return -errno; > > > + > > > + pthread_cleanup_push_cast(free_scandir_result, &sr); > > > + > > > + /* parts[0] is the whole disk */ > > > + if (vector_alloc_slot(parts) && > > > + (prop = strdup(strrchr(path, '/') + 1)) != NULL) > > > + vector_set_slot(parts, prop); > > > > We should add the whole disk like we do for the partitions below, > > where > > we strdup() the name first, and free it if we can't allocate a slot. > > Otherwise, if the strdup failed, we could leave an empty slot in the > > vector, which would cause vector_foreach_slot() to stop early > > Sure. > > > > > int > > > is_path_valid(const char *name, struct config *conf, struct path > > > *pp, > > > bool check_multipathd) > > > { > > > int r; > > > int fd; > > > + const char *prop; > > > > > > if (!pp || !name || !conf) > > > return PATH_IS_ERROR; > > > @@ -80,6 +338,10 @@ is_path_valid(const char *name, struct config > > > *conf, struct path *pp, > > > if (!pp->udev) > > > return PATH_IS_ERROR; > > > > > > + prop = udev_device_get_property_value(pp->udev, "DEVTYPE"); > > > + if (prop == NULL || strcmp(prop, "disk")) > > > + return PATH_IS_NOT_VALID; > > > + > > > r = pathinfo(pp, conf, DI_SYSFS | DI_WWID | DI_BLACKLIST); > > > if (r == PATHINFO_SKIPPED) > > > return PATH_IS_NOT_VALID; > > > @@ -96,6 +358,11 @@ is_path_valid(const char *name, struct config > > > *conf, struct path *pp, > > > return PATH_IS_ERROR; > > > } > > > > > > + if ((conf->find_multipaths == FIND_MULTIPATHS_GREEDY || > > > + conf->find_multipaths == FIND_MULTIPATHS_SMART) && > > > + is_device_in_use(pp->udev) > 0) > > > + return PATH_IS_NOT_VALID; > > > + > > > > Even if multipath is set to "smart", the common case will still be > > that > > the device is in the wwids file, so we know that it's valid. I don't > > see > > a reason to do this check in that case. I was thinking that for the > > "smart" case, this check should go at the end of the function, before > > we > > return PATH_IS_MAYBE_VALID. Or do you have a reason why we should do > > it > > here that I'm missing? > > I'd argue the other way around: if the device is in use, it doesn't > matter any more whether or not it's listed in the WWIDs file. This > situation can only occur if something went wrong beforehand (most > probably, an inconsistent WWIDs file in the initrd, or some other > component misbehaving). If we tag such a device VALID although it's > mounted, we'll almost certainly fall into emergency mode, regardless of > the find_multipaths mode we're in. The reason is that multipathd will > fail to map the mounted device, and systemd will consider the path > device unusable, leading to the device being inaccessible either way. Sure. We can leave it like this. I don't have strong opinions one way or the other. -Ben > I agree that the device being in use while listed in the WWIDs file is > an error and should probably be logged more prominently. > > Martin -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel