On Sat, 2009-05-23 at 09:51 -0700, Arjan van de Ven wrote:
> On Sat, 23 May 2009 11:21:43 -0500
> > The reason scsi_add_device() is failing seems to be that
> > async_synchronize_full_domain() is a bit fragile in that it only
> > expects to be called once.  Call it again, like we do, to make sure
> > there aren't any outstanding scans and it hangs on the wait event.
>
> it's supposed to be ok to call as many times as you want.
> What is NOT allowed is calling it from async work itself, due to the
> obvious deadlock.

OK, this turns out to be a classic ABBA deadlock.
async_synchronize_domain() is one waiter and the scan mutex is the
other.

What's happening is that scsi_add_device() takes the scan mutex and then
waits for the async scan thread to complete.  Meanwhile the async thread
is dropping and reacquiring the mutex as it moves from scanning to
adding devices.  Result: Deadlock.

I think a reasonable fix is to take the scan mutex *after* waiting for
the async scans to complete.  An alternative might be to have a version
of scsi_scan_host_selected that doesn't take the mutex so we can hold it
entirely over async_scsi_scan_host(), but that gets a bit messy.

I'll drop the async scan conversion patch until we can get this all
sorted out.

James
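
For illustration only, the two lock orderings described above can be modelled in userspace. This is a rough Python sketch, not kernel code and not the proposed patch: run(), scan_mutex, scan_done and the log strings are hypothetical stand-ins for the scan mutex and for "the async domain has drained", and the wait uses a timeout so that the broken ordering reports failure instead of hanging forever.

```python
import threading


def run(hold_mutex_while_waiting, timeout=0.5):
    """Model the caller's lock ordering against an async scan thread.

    Returns (finished, log): finished says whether the wait for the
    scan completed before the timeout, log records what ran, in order.
    """
    scan_mutex = threading.Lock()   # stand-in for the scan mutex
    scan_done = threading.Event()   # stand-in for "async scans complete"
    log = []

    def async_scan():
        # The scan thread drops and reacquires the mutex between phases.
        with scan_mutex:
            log.append("scan")
        with scan_mutex:
            log.append("add devices")
        scan_done.set()

    worker = threading.Thread(target=async_scan)

    if hold_mutex_while_waiting:
        # Broken ordering: take the mutex, then wait for the scan.
        # The scan thread needs the mutex to make progress, so the
        # wait can only end by timing out.
        with scan_mutex:
            worker.start()
            finished = scan_done.wait(timeout)
    else:
        # Suggested ordering: wait for the scan first, then lock.
        worker.start()
        finished = scan_done.wait(timeout)
        with scan_mutex:
            log.append("scsi_add_device")

    worker.join()
    return finished, log
```

Calling run(True) models the current behaviour: the wait cannot succeed while the caller holds the mutex. Calling run(False) models taking the mutex only after the async scans have completed.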