On Wed, May 10, 2023 at 01:50:01PM -0700, Stephen Boyd wrote: > Quoting Dmitry Torokhov (2023-05-10 13:24:08) > > On Wed, May 10, 2023 at 11:51:31AM -0700, Stephen Boyd wrote: > > > Quoting Dmitry Torokhov (2023-05-05 17:06:07) > > > > On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote: > > > > > > > > > ... > > > > > Unfortunately, the hid google hammer driver hand rolls a devm function > > > > > to call hid_hw_stop() when the driver is unbound and implements an > > > > > hid_driver::remove() function. The driver core doesn't call the devm > > > > > release functions until _after_ the bus unbinds the driver, so the order > > > > > of operations is like this: > > > > > > > > Excellent analysis, but the problem is not limited to the hammer driver > > > > (potentially) and shalt be dealt with appropriately, at the HID bus > > > > level. > > > > > > Thanks. I thought of the bus level approach as well, but I was trying to > > > keep the fix isolated to the driver that had the problem. I'd like to > > > get the fix into the stable kernel, as this fixes a regression > > > introduced by commit d950db3f80a8 ("HID: google: switch to devm when > > > registering keyboard backlight LED") in v5.18. > > > > > > Is the bus level approach going to be acceptable as a stable backport? > > > > Sure, why not given the kind of stuff flowing into stable kernels. At > > least this would be fixing real issue that can be triggered with a real > > device. > > Hmm, ok. I was worried it would be too much "new code" vs. fixing > something. > > > > > > > This got me thinking that maybe both of these approaches are wrong. > > > Maybe the call to hid_close_report() should be removed from > > > hid_device_remove() instead. > > > > > > The device is being removed from the bus when hid_device_remove() is > > > called, but it hasn't been released yet. Other devices like the hidinput > > > device are referencing the hdev device because they set the hdev as > > > their parent. Basically, child devices are still bound to some sort of > > > driver or subsystem when the parent hdev is unbound from its driver, > > > leading to a state where the child drivers could still access the hdev > > > while it is being destroyed. If we remove the hid_close_report() call > > > from this function it will eventually be called by hid_device_release() > > > when the last reference to the device is dropped, i.e. when the child > > > devices all get destroyed. In the case of hid-google-hammer, that would > > > be when hid_hw_stop() is called from the devm release function by driver > > > core. > > > > > > The benefit of this approach is that we don't allocate a devres group > > > for all the hid devices when only two drivers need it. The possible > > > downside is that we keep the report around while the device exists but > > > has no driver bound to it. > > > > > > Here's a totally untested patch for that. > > > > > > ---8<---- > > > diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c > > > index 22623eb4f72f..93905e200cae 100644 > > > --- a/drivers/hid/hid-core.c > > > +++ b/drivers/hid/hid-core.c > > > @@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device) > > > hid_parser_reserved > > > }; > > > > > > - if (WARN_ON(device->status & HID_STAT_PARSED)) > > > - return -EBUSY; > > > + if (device->status & HID_STAT_PARSED) > > > + hid_close_report(device); > > > > > > start = device->dev_rdesc; > > > if (WARN_ON(!start)) > > > @@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev) > > > hdrv->remove(hdev); > > > else /* default remove */ > > > hid_hw_stop(hdev); > > > - hid_close_report(hdev); > > > hdev->driver = NULL; > > > } > > > > This will probably work, but it I consider this still being fragile as > > at some point we might want to add some more unwinding, and we'll run > > into this issue again. I would feel much safer if the order of release > > followed (inversely) order of allocations more closely. > > > > Sorry, I'm not following here. How is it fragile? Are you saying that if > we want to add devm calls into the bus layer itself the order of release > won't be inverse of allocation/creation? What I was trying to say is that later someone else might be tempted to add more traditional-style resources and non-devm-unwinding for them. Having an explicit devres groups gives exact point when driver-allocated resources are released, and makes patch authors take it into consideration. If everything is devm-controlled then we do not need a separate devres group. Thanks. -- Dmitry