Hi, On 08/12/11 00:16, Alan Stern wrote: > On Thu, 11 Aug 2011, James Bottomley wrote: >> However, much as I'd like to accept this rosy view, the original oops >> that started all of this in 2.6.38 was someone caught something with a >> reference to a SCSI queue after the device release function had been >> called. > > Not according to your commit log. You wrote that the reference was > taken after scsi_remove_device() had been called -- but the device > release function is scsi_device_dev_release_usercontext(). The commit log of 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b ("[SCSI] put stricter guards on queue dead checks") does not explain about the move of scsi_free_queue(). But according to the discussion below, it seems the move was motivated to solve the following self-deadlock: https://lkml.org/lkml/2011/4/12/9 [in the context of kblockd_workqueue] blk_delay_work __blk_run_queue scsi_request_fn put_device (puts final sdev refcount) scsi_device_dev_release execute_in_process_context(scsi_device_dev_release_usercontext) [execute immediately because it's in process context] scsi_device_dev_release_usercontext scsi_free_queue blk_cleanup_queue blk_sync_queue (wait for blk_delay_work to complete...) James, is my understanding correct? If so, isn't it possible to move the scsi_free_queue back to the original place and solve the deadlock instead by avoiding the wait in the same context? @@ -338,8 +339,8 @@ static void scsi_device_dev_release_user static void scsi_device_dev_release(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); - execute_in_process_context(scsi_device_dev_release_usercontext, - &sdp->ew); + INIT_WORK(&sdp->ew.work, scsi_device_dev_release_usercontext); + schedule_work(&sdp->ew.work); } static struct class sdev_class = { Thanks, -- Jun'ichi Nomura, NEC Corporation -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html