On Thu, 2018-04-12 at 13:36 -0500, Benjamin Marzinski wrote: > On Wed, Apr 04, 2018 at 06:16:23PM +0200, Martin Wilck wrote: > > In "find_multipaths smart" mode, use time stamps under > > /dev/shm/multipath/find_multipaths to track waiting for multipath > > siblings. When a path is first encountered and is "maybe" > > multipath, create a > > file under /dev/shm, set its modification time to the expiry time > > of the > > timer, and set the FIND_MULTIPATHS_WAIT_UNTIL variable. On later > > calls, also set > > FIND_MULTIPATHS_WAIT_UNTIL to the expiry time (but don't change the > > time > > stamp) if it's not expired yet, or 0 if it is expired. Set > > FIND_MULTIPATHS_WAIT_UNTIL even if enough evidence becomes > > available to decide > > if the path needs to be multipathed - this enables the udev rules > > to detect > > that this is a device a timer has been started for, and stop it. By > > using > > /dev/shm, we share information about "smart" timers between initrd > > and root > > file system, and thus only calculate the timeout once. > > > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx> > > --- > > libmultipath/file.c | 2 +- > > libmultipath/file.h | 1 + > > multipath/main.c | 133 > > ++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 135 insertions(+), 1 deletion(-) > > > > + > > +/** > > + * find_multipaths_check_timeout(wwid, tmo) > > + * Helper for "find_multipaths smart" > > + * > > + * @param[in] pp: path to check / record > > + * @param[in] tmo: configured timeout for this WWID, or value <= 0 > > for checking > > + * @param[out] until: timestamp until we must wait, > > CLOCK_REALTIME, if return > > + * value is FIND_MULTIPATHS_WAITING > > + * @returns: FIND_MULTIPATHS_WAIT_DONE, if waiting has finished > > + * @returns: FIND_MULTIPATHS_ERROR, if internal error occured > > + * @returns: FIND_MULTIPATHS_NEVER, if tmo is 0 and we didn't wait > > for this > > + * device > > + * @returns: FIND_MULTIPATHS_WAITING, if timeout hasn't expired > > + */ > > +static int find_multipaths_check_timeout(const struct path *pp, > > long tmo, > > + struct timespec *until) > > +{ > > + char path[PATH_MAX]; > > + struct timespec now, ftimes[2], tdiff; > > + struct stat st; > > + long fd; > > + int r, err, retries = 0; > > + > > + clock_gettime(CLOCK_REALTIME, &now); > > + > > I'm worried about using pp->dev_t here with no method of removing > these > files. What happens when a path device, say 8:32 is removed and a > completely new device comes in reusing the same dev_t? Hm, what else should we use? devnode name is even worse, and most other options are a can of worms... In the worst case, the new device wouldn't be waited for (or not long enough), because the marker existed before it was detected. I could simply add a rule that removes the marker in case of a "remove" uevent, ok? > > > + if (snprintf(path, sizeof(path), "%s/%s", shm_find_mp_dir, > > pp->dev_t) > > + >= sizeof(path)) { > > + condlog(1, "%s: path name overflow", __func__); > > shouldn't this be: > return FIND_MULTIPATHS_ERROR; Bah, thank for catching it. > > static int print_cmd_valid(int k, const vector pathvec, > > struct config *conf) > > { > > static const int vals[] = { 1, 0, 2 }; > > + int wait = FIND_MULTIPATHS_NEVER; > > + struct timespec until; > > + struct path *pp; > > > > if (k < 0 || k >= sizeof(vals)) > > return 1; > > > > + if (k == 2) { > > + /* > > + * Caller ensures that pathvec[0] is the path to > > + * examine. > > + */ > > + pp = VECTOR_SLOT(pathvec, 0); > > + select_find_multipaths_timeout(conf, pp); > > + wait = find_multipaths_check_timeout( > > + pp, pp->find_multipaths_timeout, &until); > > + if (wait != FIND_MULTIPATHS_WAITING) > > + k = 1; > > + } else if (pathvec != NULL) { > > + pp = VECTOR_SLOT(pathvec, 0); > > + wait = find_multipaths_check_timeout(pp, 0, > > &until); > > + } > > + if (wait == FIND_MULTIPATHS_WAITING) > > + printf("FIND_MULTIPATHS_WAIT_UNTIL=\"%ld.%06ld\"\n > > ", > > + until.tv_sec, until.tv_nsec/1000); > > + else if (wait == FIND_MULTIPATHS_WAIT_DONE) > > + printf("FIND_MULTIPATHS_WAIT_UNTIL=\"0\"\n"); > > If we get an error trying to check the timeout, should we just keep > FIND_MULTIPATHS_WAIT_UNTIL the same? Or would it be better to set it > to > 0, and fail the smart claiming? The latter is what we do. We set k=1 if find_multipaths_check_timeout() returns anything but FIND_MULTIPATHS_WAITING, resulting in DM_MULTIPATH_DEVICE_PATH="0". Regards Martin -- Dr. Martin Wilck <mwilck@xxxxxxxx>, Tel. +49 (0)911 74053 2107 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel