Hi Michal,

On Wed, 28 Jun 2023 12:30:35 +0200, Michal Hocko wrote:
> On Mon 26-06-23 12:32:52, Jean Delvare wrote:
> > If module_put() triggers a refcount error, include the culprit
> > module name in the warning message, to ease further investigation of
> > the issue.
> >
> > Signed-off-by: Jean Delvare <jdelvare@xxxxxxx>
> > Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > ---
> >  kernel/module/main.c |    4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > --- linux-6.3.orig/kernel/module/main.c
> > +++ linux-6.3/kernel/module/main.c
> > @@ -850,7 +850,9 @@ void module_put(struct module *module)
> >          if (module) {
> >                  preempt_disable();
> >                  ret = atomic_dec_if_positive(&module->refcnt);
> > -                WARN_ON(ret < 0);       /* Failed to put refcount */
> > +                WARN(ret < 0,
> > +                     KERN_WARNING "Failed to put refcount for module %s\n",
> > +                     module->name);
>
> Would it make sense to also print the refcnt here? In our internal bug
> report it has turned out that this was an overflow (put missing) rather
> than an underflow (too many put calls). Seeing the value could give a
> clue about that. We had to configure panic_on_warn to capture a dump to
> learn more which is rather impractical.

Well, other calls to module_put() or try_module_get() could happen in
parallel, so by the time we print refcnt, its value could differ from
the one which triggered the WARN.

Additionally, catching an overflow in module_put() is counterintuitive;
it only works by accident, because the counter wraps around to negative
values. If we really want to reliably report overflows as such, then we
should add a dedicated WARN to try_module_get(). That doesn't look
trivial though.

With my proposed implementation, I don't think it's necessary to turn
on panic_on_warn to debug further. Once you know which module is the
culprit, enabling tracing for this specific module should give you all
the details you need to figure out what's going on.

-- 
Jean Delvare
SUSE L3 Support
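
To illustrate the point about overflow being caught only by accident, here is
a minimal userspace sketch. It is not kernel code: C11 atomics stand in for
atomic_t, and dec_if_positive(), put_after_underflow() and
put_after_overflow() are hypothetical helpers written for this example. The
sketch assumes dec_if_positive() mirrors the documented behaviour of
atomic_dec_if_positive(): decrement only if the result stays non-negative,
and return the old value minus one either way.

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for atomic_dec_if_positive(). */
static int dec_if_positive(atomic_int *v)
{
	int c = atomic_load(v);
	int dec;

	do {
		dec = c - 1;
		if (dec < 0)
			break;	/* would go negative: leave *v untouched */
		/* On failure, c is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(v, &c, dec));

	return dec;
}

/* Underflow: one put too many on a counter that is already 0. */
static int put_after_underflow(void)
{
	atomic_int refcnt;

	atomic_init(&refcnt, 0);
	return dec_if_positive(&refcnt);
}

/* Overflow: gets without matching puts wrap the counter past INT_MAX.
 * C11 atomic arithmetic on signed types wraps silently. Two gets are
 * used so that "c - 1" above stays representable. */
static int put_after_overflow(void)
{
	atomic_int refcnt;

	atomic_init(&refcnt, INT_MAX);
	atomic_fetch_add(&refcnt, 1);	/* wraps to INT_MIN */
	atomic_fetch_add(&refcnt, 1);	/* INT_MIN + 1 */
	return dec_if_positive(&refcnt);
}
```

Both helpers return a negative value, so a check of the form "ret < 0" fires
in either case and cannot, on its own, tell an underflow from an overflow.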