On Mon, 2012-11-26 at 10:17 +0100, Bart Van Assche wrote:
> On 11/26/12 05:44, David Dillow wrote:
> > Once we know we have an issue with the QP, there is no point trying to
> > send anything else down the pipe. This also allows us to consolidate
> > code in the SCSI EH path.
>
> After I posted the patch on which the above patch has been based I
> realized that testing the connection state at the start of
> srp_send_tsk_mgmt() is not sufficient to avoid QPN use-after-free. If a
> DREQ is received by the initiator after the above test has been
> performed and before the task management function has been sent it is
> still possible to send a task management function over a closed QP.

AFAICT, DREQ does not actually close the QP -- it only tells us that the
other side would like to. I think we don't actually close the connection
until we try to send on it again, though I'm not sure whether we see recv
failures for the queued work items.

Regardless, resource lifetime is an issue that needs solving.

> I'd like to address this in a different way - see also the thread called
> "SCSI LLDs, the SCSI error handler and host resource lifetime" on the
> linux-scsi mailing list (November 20,
> http://marc.info/?t=135342155500003&r=1).

I like the direction you propose there. It seems that scsi_remove_host()
at one point waited for the EH thread to exit -- or perhaps that was part
of the scsi_host_put() chain -- as there's the longstanding deferral to
the work queue for the SRP target removal. Of course, that's been there
for ~5 years now, and things have changed in the SCSI stack.

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office