On 10/11/23 01:30, Sean Christopherson wrote:
E.g. drivers/block/loop.c has this gem
/* This is safe: open() is still holding a reference. */
module_put(THIS_MODULE);
in __loop_clr_fd(), which is invoked from a .release() function. So open() quite
clearly doesn't hold a reference, unless the comment is talking about the reference
that was obtained by the core file systems layer and won't be put until after
.release() completes. But then what on earth is the point of doing
module_get(THIS_MODULE) and module_put(THIS_MODULE)?
Here the module_put() is called not just from .release() in autoclear
mode, but also from the LOOP_CLR_FD ioctl.
So the idea here is that while a /dev/loopN device exists you must keep
the module alive, to ensure that the devices don't disappear from /dev.
So the point here is to keep the module alive after /dev/loop-control
has been closed; but if /dev/loopN is open it will keep the module alive
on its own, and this makes module_get/module_put safe in this particular
case.
In general, safety is guaranteed if module_put is called while the
module's reference count is still elevated by someone else, which could
be a file descriptor or some core subsystem. But then you're right that
in many case there seems to be no point in doing module_get/module_put.
In drivers/watchdog/softdog.c, softdog_stop() is called while the
module's reference count is still elevated by the core watchdog code,
which makes the code safe. But why does softdog.c need to have its own
reference? Any attempt to unregister the softdog module will go through
hrtimer_cancel(&softdog_ticktock) - which waits for the timer callback
to be complete, just like flush_work() in your patch.
This module_get/module_put _is_ unnecessary. It was introduced as a
bandaid in commit 5889f06bd31d ("watchdog: refuse to unload softdog
module when its timer is running", 2015-12-27). Back then the core code
wasn't careful to keep the module refcount elevated if the watchdog was
still running in watchdog_release. When commit ee142889e32f ("watchdog:
Introduce WDOG_HW_RUNNING flag", 2016-03-16) fixed the race for real,
softdog.c wasn't adjusted.
I agree that in many cases, however, the safety does not seem to be
there. I cannot find a place that ensures that snd-aoa-soundbus-i2sbus
is kept alive while i2sbus_private_free runs, for example (though that
code is a maze).
Your patch 2 looks good, but perhaps instead of setting the owner we
could stash the struct module* in a global, and try_get/put it from open
and release respectively? That is, .owner keeps the kvm module alive
and the kvm module keeps kvm-intel/kvm-amd alive. That would subsume
patches 1 and 3.
Paolo