On Tue, 8 Mar 2005, Patrick Mochel wrote: > > How would you tell the core or a bus driver not to add/delete devices or > > bind/unbind drivers? The logical answer is by locking a semaphore or the > > equivalent. This would be tantamount to reintroducing the bus rwsem! > > No, the rwsem is a bad idea. That locks the entire bus and prevents > anything from happening to any device. That's not good. It's best to have > a per-device lock for that type of exclusion. And, you really do want to > exclude a device power transition from happening while a driver is > binding/releasing from a device. > > Ultimately, that would be a driver-based operation, and the lock would > exist in a driver-allocated, per-device structure. But, we could easily > add a semaphore to struct device that we take around ->probe(), > ->release() and power state transition calls. > > [ Actually, I recently added a spinlock to struct device for a similar > purpose, so it'll be trivial to change it to a semaphore, which we need > for this purpose, since ->probe() or ->suspend() could easily sleep on > memory allocations. ] Here's a proposal for a way to eliminate the rwsem. Still in its initial stages; it needs more thinking out but I believe it can be made to work. (After reading it, though, you may decide that you dislike it even more than the present system!) It's somewhat complicated, and unavoidably so. The main difficulty comes from the need to accurately traverse driver lists during device_add and device lists during driver_register and driver_unregister. First of all, as several people have suggested, there should be a semaphore in struct device. It will be held across all calls to driver callbacks: probe, remove, shutdown, suspend, resume, and anything else that might get added later. Thus drivers won't have to worry about two things happening to the same device at once. (Note that this will prevent a driver from unregistering its device during a resume callback, unless an entry point is added in which the caller is required to hold the semaphore already.) In fact, during device_add and device_del the parent's semaphore will be needed as well (reason given below). The rule for locking order is: whenever a parent and a child must both be locked, the parent's lock should be acquired first. While it would require the least change to existing drivers if the core acquired the parent's lock, there will certainly be occasions where the caller will already hold the parent's lock. (USB is like this already.) So device_add and device_del need three entry points each: the normal one where the caller holds no semaphores, one where the caller has locked the parent, and one where the caller has locked both the parent and the device. Or we could just require callers always to hold both semaphores. I have no idea how much work would be required to make this so. The real trick is to maintain the device and driver lists. dev->node can be set directly during driver_add and driver_del. The parent has to be locked for this operation, since adding dev->node involves modifying parent->children. By contrast, it might not be possible to set dev->bus_list and dev->driver_list right away. Here's the idea: Only one process is allowed to use those lists at a time. Maintenance of the lists is synchronized by means of a subsystem spinlock. If a process needs to update or traverse the lists while they are already in use, it will instead queue a request to be handled later. When a process finishes using the lists, it will check the queue for outstanding requests and fulfill them. So for example, during device_add if the lists are in use, a request will be queued to fill in dev->bus_list and to call device_attach. (The sysfs stuff and other manipulations can be handled immediately.) If the lists aren't in use then device_add will do everything normally, then handle any queued requests that accumulated while it was busy probing drivers and so on. The upshot is that the bus_list and driver_list entries won't always be up-to-date. (Nor will the list of registered drivers.) Not unless the request queue is empty. Since the driver-model core is the only user of those lists, this should be okay. The kinds of requests that will be queued are: Make sure that dev->bus_list is correct (set if dev is registered, unset if dev has been unregistered) and that dev->driver_list is correct (on the list headed by dev->driver, or not on any list if dev->driver is NULL). Probe dev against all registered drivers, if dev is still registered and hasn't yet been bound to a driver. Add drv to the list of registered drivers and probe all devices for drv, if drv is still registered. Note that driver_unregister must unbind the driver from all its devices, even those not yet on the driver_list. Hence it must mark the driver as no longer registered and drain the request queue before it can safely do all the unbindings and return. This implies that struct device_driver will have to include a way of telling whether the driver has been unregistered. This is a complex scheme. But it has to be; there's no way for device_attach, driver_attach, and driver_detach to iterate through the lists reliably if the lists are subject to concurrent change. If the lists are busy, these routines have only two choices. They can either queue a request for an update and return immediately, or else wait until the lists aren't busy -- which is exactly what the current rwsem scheme does. Comments? Alan Stern