On 9/28/2017 7:46 PM, Govindarajulu Varadarajan wrote: >> How about releasing the device_lock here on CPU0?> > > pci_device_add() is called by driver's pci probe function. device_lock(dev) > should be held before calling pci driver probe function. I see. The goal of the lock held here is to protect probe() operation from being disrupted. I also don't think we can change this. > >> or in other words keep device_lock as short as possible? > > The problem is not the duration device_lock is held. It is the order two locks > are aquired. We cannot control or implement a restriction that during > device_lock() is held, driver probe should not call pci function which aquires > pci_bus_sem. And in case of pci aer, aer handler needs to call driver err_handler() > for which we need to hold device_lock() before calling err_handler(). In order > to find all the devices on a pci bus, we should hold pci_bus_sem to do > pci_walk_bus(). I was reacting to this to see if there is a better way to do this. "Only fix I could think of is to lock &pci_bus_sem and try locking all device->mutex under that pci_bus. If it fails, unlock all device->mutex and &pci_bus_sem and try again." How about gracefully returning from report_error_detected() when we cannot obtain the device_lock() by replacing it with device_trylock()? aer_pci_walk_bus() can still poll like you did until it gets the lock. At least, we don't get to introduce a new API, new lock semantics and code refactoring. __pci_bus_trylock() looked very powerful and also dangerously flexible to introduce new bugs to me. For instance, you called it like this. + down_read(&pci_bus_sem); + locked = __pci_bus_trylock(bus, pci_device_trylock, + pci_device_unlock); pci_bus_trylock() would obtain device + cfg locks whereas pci_device_trylock() only obtains the device lock. Can it race against cfg lock? It depends on the caller. Very subtle difference. -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.