On Thu, Sep 23, 2021 at 01:51:56AM -0700, Lucas De Marchi wrote:
> On Mon, Aug 09, 2021 at 10:16:02PM -0700, Luis Chamberlain wrote:
> The story was not kind like that. It wasn't removed "in favor for a 10
> second sleep" in the sense that the sleep would replace the wait.
>
> It was actually for "this wait logic in the kernel is complex and
> buggy, let's try to remove it". So we decided to deprecate it and add
> a sleep rmmod to see if anyone complained. 1 year later of no complaints
> we removed it from kernel. This was all after noticing we had never
> implemented the wait logic in modprobe - it was only done in rmmod.

OK fixed the commit log thanks!

> > --- a/libkmod/libkmod-module.c
> > +++ b/libkmod/libkmod-module.c
> > @@ -30,6 +30,9 @@
> >  #include <stdlib.h>
> >  #include <string.h>
> >  #include <unistd.h>
> > +#include <poll.h>
> > +#include <time.h>
> > +#include <math.h>
> >  #include <sys/mman.h>
> >  #include <sys/stat.h>
> >  #include <sys/syscall.h>
> > @@ -802,6 +805,143 @@ KMOD_EXPORT int kmod_module_remove_module(struct kmod_module *mod,
> >  	return err;
> >  }
> >
> > +static int timespec_to_ms(struct timespec *t)
> > +{
> > +	return (t->tv_sec * 1000) + lround(t->tv_nsec / 1000000);
> > +}
> > +
> > +static int time_delta_ms(struct timespec *before, struct timespec *after)
> > +{
> > +	if (!before || !after)
> > +		return 0;
> > +	return timespec_to_ms(after) - timespec_to_ms(before);
> > +}
>
> we have a similar thing in util.[ch]

Alright, this OK?

diff --git a/shared/util.c b/shared/util.c
index b487b5f..b911e63 100644
--- a/shared/util.c
+++ b/shared/util.c
@@ -466,6 +466,19 @@ unsigned long long ts_usec(const struct timespec *ts)
 	       (unsigned long long) ts->tv_nsec / NSEC_PER_USEC;
 }
 
+unsigned long long ts_msec(const struct timespec *ts)
+{
+	return ts_usec(ts) / 1000; /* usec -> msec */
+}
+
+unsigned long long ts_delta_ms(const struct timespec *before,
+			       const struct timespec *after)
+{
+	if (!before || !after)
+		return 0;
+	return ts_msec(after) - ts_msec(before);
+}
+
 unsigned long long stat_mstamp(const struct stat *st)
 {
 #ifdef HAVE_STRUCT_STAT_ST_MTIM
diff --git a/shared/util.h b/shared/util.h
index c6a31df..f8c28e7 100644
--- a/shared/util.h
+++ b/shared/util.h
@@ -43,6 +43,9 @@ int mkdir_p(const char *path, int len, mode_t mode);
 int mkdir_parents(const char *path, mode_t mode);
 unsigned long long stat_mstamp(const struct stat *st);
 unsigned long long ts_usec(const struct timespec *ts);
+unsigned long long ts_msec(const struct timespec *ts);
+unsigned long long ts_delta_ms(const struct timespec *before,
+			       const struct timespec *after);
 
 /* endianess and alignments */
 /* ************************************************************************ */

> > +/**
> > + * kmod_module_remove_module_wait:
> > + * @mod: kmod module
> > + * @flags: flags to pass to Linux kernel when removing the module. The only valid flag is
> > + * KMOD_REMOVE_FORCE: force remove module regardless if it's still in
> > + * use by a kernel subsystem or other process;
> > + * KMOD_REMOVE_NOWAIT is always enforced, causing us to pass O_NONBLOCK to
> > + * delete_module(2). We do the waiting in userspace, if a wait was desired.
> > + *
> > + * Remove a module from Linux kernel patiently.
> > + *
> > + * Returns: 0 on success or < 0 on failure.
> > + */
> > +KMOD_EXPORT int kmod_module_remove_module_wait(struct kmod_module *mod,
> > +					       unsigned int flags,
> > +					       bool wait)
>
> why do you have kmod_get_refcnt_timeout/kmod_set_refcnt_timeout instead
> of just doing s/bool wait/unsigned int wait_msec/)?

Because it lets us do a smaller change on the respective tools:

tools/modprobe.c-		flags |= KMOD_REMOVE_FORCE;
tools/modprobe.c-
tools/modprobe.c:	err = kmod_module_remove_module_wait(mod, flags, do_remove_patient);
tools/modprobe.c-	if (err == -EEXIST) {
tools/modprobe.c-		if (!first_time)
--
tools/remove.c-		goto unref;
tools/remove.c-
tools/remove.c:	err = kmod_module_remove_module_wait(mod, 0, do_remove_patient);
tools/remove.c-	if (err < 0)
tools/remove.c-		goto unref;
--
tools/rmmod.c-		}
tools/rmmod.c-
tools/rmmod.c:		err = kmod_module_remove_module_wait(mod, flags,
tools/rmmod.c-						     do_remove_patient);
tools/rmmod.c-		if (err < 0) {

That is, the timeout is a property of the kmod context rather than a
per-call argument.

> > +	if ((refcnt <= 0) || (refcnt > 0 && !wait)) {
> > +		NOTICE(mod->ctx, "%s refcnt is %d\n", mod->name, (int) refcnt);
> > +		err_time = clock_gettime(CLOCK_MONOTONIC, &t2);
> > +		if (err_time != 0)
> > +			kmod_set_removal_timeout(mod->ctx, 0);
>
> I don't follow why kmod_module_get_refcnt_wait() is setting the removal
> timeout at all. This seems to be doing it behind users' back.

Because if clock_gettime() returns something other than 0 then your
clock is messed up and you should not be using a timeout, so yes, we
correct that then. We can scream loud, or use a default. I figured
not using one would be better in that case.

> The idea of using the refcnt fd was actually that then
> users could integrate it on their mainloops (probably using epoll). And
> then the same impl could be shared by kmod_module_remove_module_wait(),
> which would do a select().
>
> This seems more like a kmod_module_refcnt_wait_zero() using poll()
> + adjusting the timeout

Sorry, I don't follow.
And since I have one day before vacation, I suppose I won't get
to this until I get back. But I'd be happy if you massage it as you see
fit as you're used to the code base and I'm sure have a better idea of
what likely is best for the library.

> > +	ret = kmod_module_get_refcnt_wait(mod, do_remove_patient);
>
> for tool implementation, shouldn't we just ignore
> kmod_module_get_refcnt() and proceed to
> kmod_module_remove_module_wait()?

I'll let you decide. Otherwise this will have to wait until I get back
from vacation.

  Luis
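
For reference, the "poll() + adjusting the timeout" idea mentioned above
could look roughly like the sketch below. It is only an illustration under
stated assumptions: every sketch_* name is hypothetical and none of this is
kmod API. It takes a generic file descriptor and makes no claim about how
the module refcnt file behaves under poll(). The point it tries to show is
that, when poll() is interrupted, the remaining time is recomputed from
CLOCK_MONOTONIC so the total wait stays within the requested timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <time.h>

/* Hypothetical helper (not kmod's): timespec to whole milliseconds. */
static long long sketch_ts_msec(const struct timespec *ts)
{
	return (long long)ts->tv_sec * 1000LL + ts->tv_nsec / 1000000L;
}

/* Milliseconds left out of timeout_ms, given start and now; never negative. */
static int sketch_remaining_ms(int timeout_ms, const struct timespec *start,
			       const struct timespec *now)
{
	long long elapsed = sketch_ts_msec(now) - sketch_ts_msec(start);

	if (elapsed < 0)
		elapsed = 0;
	if (elapsed >= timeout_ms)
		return 0;
	return (int)(timeout_ms - elapsed);
}

/*
 * Wait up to timeout_ms for poll() to report an event on fd.
 * Returns 1 if poll() reported an event, 0 on timeout, -errno on error.
 */
static int sketch_wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct timespec start, now;
	int left = timeout_ms;

	assert(timeout_ms >= 0);

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -errno;

	for (;;) {
		int r = poll(&pfd, 1, left);

		if (r > 0)
			return 1;
		if (r == 0)
			return 0;
		if (errno != EINTR)
			return -errno;

		/* Interrupted by a signal: shrink the timeout and retry. */
		if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
			return -errno;
		left = sketch_remaining_ms(timeout_ms, &start, &now);
		if (left == 0)
			return 0;
	}
}
```

In this sketch a clock_gettime() failure is returned to the caller as an
error instead of silently changing a stored timeout; that is just one of
the options discussed above, not a statement about what the patch should do.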