On Fri, Jul 17, 2020 at 2:16 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jul 17, 2020 at 02:01:16PM +0200, Linus Walleij wrote:
> > While I am a big fan of the Android GKI initiative this needs to be aligned
> > with the Linux core maintainers, so let's ask Greg. I am also paging
> > John Stultz on this: he is close to this action.
> >
> > They both know the Android people very well.
> >
> > So there is a rationale like this going on: in order to achieve GKI goals
> > and have as much as possible of the Linux kernel stashed into loadable
> > kernel modules, it has been elevated to modus operandi amongst
> > the developers pushing this change that it is OK to pile up a load of
> > modules that cannot ever be unloaded.
>
> Why can't the module be unloaded? Is it just because they never
> implement the proper "remove all resources allocated" logic in a remove
> function, or something else?

For the core kernel parts, it's usually due to the lack of tracking of
who is using the resource provided by the driver, as the subsystems
tend to be written around x86's "everything is built-in" model.

For instance, a PCIe host bridge might rely on the IOMMU, a clock
controller, an interrupt controller, a pin controller and a reset
controller. The host bridge can still be probed at reduced
functionality if some of these are missing, or it can use deferred
probing when some others are missing at probe time.

If we want all of the drivers to be unloadable again, we need to do
one of two things:

a) track dependencies, so that removing one of the underlying devices
   causes everything that depends on it to be removed as well, or to be
   notified that it is going away so that it can stop using it. This is
   the model used in the network subsystem, where any ethernet driver
   can be unloaded and everything using the device gets torn down.
b) use reference counting on the device or (at the minimum)
   try_module_get()/module_put() calls for all such resources, so that
   as long as the PCIe host bridge is there, none of the devices it
   uses can go away while they are still in use.

Traditionally, we would have considered the PCIe host bridge to be a
fundamental part of the system, implying that everything it uses is
also fundamental, so there was no need to track usage at all, only to
ensure that probing is done in the right order.

> > As a minimum requirement I would expect this to be marked by
> >
> > struct device_driver {
> >    (...)
> >    /* This module absolutely cannot be unbound */
> >    .suppress_bind_attrs = true;
> > };
>
> No, that's not what bind/unbind is really for. That's a per-subsystem
> choice as to if you want to allow devices to be added/removed from
> drivers at runtime. It has nothing to do with module load/unload.

It's a one-way dependency: if we can't allow the device to be unbound,
then we also should not allow module unloading, because that forces an
unbind.

      Arnd
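
For illustration only, option b) above can be modelled in plain userspace C roughly as follows. This is a minimal sketch, not kernel code: every fake_* name is hypothetical and only imitates the general behaviour of try_module_get()/module_put(), on the assumption that a provider refuses to unload while a reference is held and refuses new references once unloading has started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_MAX_PROVIDERS 4

/* Hypothetical userspace stand-in for a provider module (not struct module). */
struct fake_module {
	const char *name;
	int refcount;		/* number of users currently pinning the module */
	bool going_away;	/* set once unloading has started */
};

/* Imitates try_module_get(): fails once the module is being unloaded. */
static bool fake_try_module_get(struct fake_module *m)
{
	if (m->going_away)
		return false;
	m->refcount++;
	return true;
}

/* Imitates module_put(): drops one reference; an underflow is a caller bug. */
static void fake_module_put(struct fake_module *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/* Unloading is refused while any user still holds a reference. */
static bool fake_module_unload(struct fake_module *m)
{
	if (m->refcount > 0)
		return false;
	m->going_away = true;
	return true;
}

/* Hypothetical consumer, e.g. a host bridge using several providers. */
struct fake_host_bridge {
	struct fake_module *providers[FAKE_MAX_PROVIDERS];
	size_t nr_providers;
};

/* Drop every reference taken at probe time, in reverse order. */
static void fake_bridge_remove(struct fake_host_bridge *hb)
{
	while (hb->nr_providers > 0)
		fake_module_put(hb->providers[--hb->nr_providers]);
}

/* Pin every provider; on failure, release whatever was already taken. */
static bool fake_bridge_probe(struct fake_host_bridge *hb,
			      struct fake_module **deps, size_t n)
{
	hb->nr_providers = 0;
	if (n > FAKE_MAX_PROVIDERS)
		return false;

	for (size_t i = 0; i < n; i++) {
		if (!fake_try_module_get(deps[i])) {
			fake_bridge_remove(hb);
			return false;
		}
		hb->providers[hb->nr_providers++] = deps[i];
	}
	return true;
}
```

In this model, a provider pinned by fake_bridge_probe() cannot be unloaded until fake_bridge_remove() has run, which is the ordering guarantee described in b). It does not attempt option a): nothing here notifies or tears down the consumer when a provider wants to go away.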