Hi Johannes, On 22 May 2014 16:41, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote: > On Thu, 2014-05-22 at 16:12 +0200, Michal Kazior wrote: > >> As a side effect there's no way to propagate >> registering errors to the pci subsystem but this >> probably isn't really necessary. > > You should probably unbind if it fails - iwlwifi does that. Hmm.. As it stands ath10k now deadlocks when firmware probing worker fails and calls device_release_driver(). This indirectly hits pci remove() callback which waits for the worker. Oops. So I've taken a look how other driver do it so I can come up with a solution - most of them seem use completions. After taking a closer look I came to a conclusion this is inherently racy too. Consider the following scenario: [syscall] insmod pci->probe() queue_work() return [worker] probe_fw() [syscall] rmmod dev_lock() pci->remove() wait_for_completion() [worker] complete_all() device_release_driver() dev_lock() {already held, yield} [syscall] free(internal structures) dev_unlock() return [worker] {woken up, but dev->driver == NULL so no callbacks} dev_unlock() return The driver code section may not be reachable anymore upon worker returning from the device_release_driver() call, right? Also since ath10k uses an internal worker it also means the work_struct would be already freed by the syscall flow (i.e. worker would run after driver has supposedly been cleaned up..). Even if ath10k was to use request_firmware_nowait(), which allocates a temporary work_struct, the unreachable code section still remains a problem. Or maybe this isn't really a problem and/or I'm missing something? Michał -- To unsubscribe from this list: send the line "unsubscribe linux-wireless" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html