Hi Greg, > > On Fri, Jan 12, 2024 at 09:28:04AM +0000, Xu Yang wrote: > > Hi Greg, > > > > > > > > On Fri, Jan 12, 2024 at 09:24:11AM +0100, Greg KH wrote: > > > > On Fri, Jan 12, 2024 at 04:01:08PM +0800, Xu Yang wrote: > > > > > +void usb_role_switch_get_modules(struct device *dev) > > > > > +{ > > > > > + while (dev) { > > > > > + if (dev->driver) > > > > > + WARN_ON(!try_module_get(dev->driver->owner)); > > > > > > > > You just crashed all systems that have panic-on-warn enabled, which is > > > > by far (i.e. in the billions) the huge majority of Linux systems in the > > > > world. > > > > > > > > If this is something that can fail, then properly handle the issue, > > > > don't just give up and say "let's fill the kernel log with a mess and > > > > reboot the box!". > > > > > > Ah, I see now you are just moving the code, but please, let's fix this > > > properly, don't persist in the wrong code here. > > > > This is a true module dependency issue and it only occurs when > > load/unload modules. The dependency of usb controller glue driver, > > usb controller driver and the user driver (such as tcpm) of usb role > > switch is not correctly established. > > Then the driver model is not being used properly, as no module > references should be needed here. Taking below diagram as example: ci_hdrc.0 register usb get tcpm_port (driver: ci_hdrc) ---------> role <---- (driver: tcpm) ^ ^ switch | ^ | | | | +1| | +1 | |+1 | +-------------------------------------+ | | | 4c200000.usb 1-0050 (driver: ci_hdrc_imx) (driver: tcpci) 1. Driver ci_hdrc_imx and tcpci are built as module at least. 2. When module ci_hdrc_imx is loaded, it will register ci_hdrc.0 device and try to get ci_hdrc module's reference. ci_hdrc will register usb-role-switch device. 3. When module tcpci is loaded, it will register tcpm port device and try to get tcpm module's reference. The tcpm module will get usb-role-switch which is registered by ci_hdrc. In current design, tcpm will also try to get ci_hdrc module's reference after get usb-role-switch. > > > This patch is used to adjust dependency of them, without it, two issues > > may happen: > > 1. "NULL pointer dereference" kernel dump will be shown > > For when? 4. Due to no modules depend on ci_hdrc_imx, ci_hdrc_imx can be manually unloaded. Then device ci_hdrc.0 will be removed by ci_hdrc_imx and device usb-role-switch is also unregistered. 5. Then, if I try to unload module tcpci, "NULL pointer dereference" will be shown due to below code: module_put(sw->dev.parent->driver->owner); parent->driver is NULL at this time. > > > 2. The reference count of usb controller module may never decrease to 0 > > The reference shouldn't have been increased as you want to be able to > unload the driver if a device is still present in the system. > > So I think there's a larger issue here, module references shouldn't be > incremented just if a driver binds to a device, right? Only if other > code is using the module, which is different. 6. If ci_hdrc is also built as module, the module reference will keep at value 1 at least due to it failed to subtract 1 at step 5. This issue is observed on chipidea and dwc3, the usb glue driver shouldn't be unloaded befer tcpci module. Thanks, Xu Yang > > thanks, > > greg k-h