Re: [PATCH 03/11] IB/srp: don't send anything on a bad QP

David Dillow <dillowda@xxxxxxxx> · Mon, 26 Nov 2012 22:31:24 -0500

On Mon, 2012-11-26 at 10:17 +0100, Bart Van Assche wrote:
> On 11/26/12 05:44, David Dillow wrote:
> > Once we know we have an issue with the QP, there is no point trying to
> > send anything else down the pipe. This also allows us to consolidate
> > code in the SCSI EH path.

> After I posted the patch on which the above patch has been based I 
> realized that testing the connection state at the start of 
> srp_send_tsk_mgmt() is not sufficient to avoid QPN use-after-free. If a 
> DREQ is received by the initiator after the above test has been 
> performed and before the task management function has been sent it is 
> still possible to send a task management function over a closed QP.

AFAICT, DREQ does not actually close the QP -- it only tells us that the
other side would like to. I think we don't actually close the connection
until we try to send on it again, though I'm not sure whether we see recv
failures for the queued work items.

Regardless, resource lifetime is an issue that needs solving.
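
To make the window Bart describes concrete, here is a rough userspace
sketch of a state test followed by a send. Everything in it is
hypothetical: struct conn, handle_disconnect(), post_send() and
send_tsk_mgmt_sketch() are stand-ins I made up for illustration, not the
ib_srp code, and the "disconnect" is simply whatever event ends up
invalidating the connection. It only shows the gap, not a fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the initiator's per-connection state. */
struct conn {
	bool connected;   /* cleared by the disconnect handler */
	int sent_ok;      /* requests posted while still connected */
	int sent_on_dead; /* requests posted after the disconnect */
};

/* Models whatever handler invalidates the connection. */
void handle_disconnect(struct conn *c)
{
	c->connected = false;
}

/* Models posting a request; records which state it was posted in. */
void post_send(struct conn *c)
{
	if (c->connected)
		c->sent_ok++;
	else
		c->sent_on_dead++;
}

/*
 * Check-then-send: the state is tested once, up front.
 * 'in_window' stands in for an event that fires between the test and
 * the send; pass NULL for "nothing happened".
 * Returns false if the up-front test rejected the request.
 */
bool send_tsk_mgmt_sketch(struct conn *c, void (*in_window)(struct conn *))
{
	assert(c != NULL);

	if (!c->connected)
		return false;  /* the test catches an already-dead connection */

	if (in_window)
		in_window(c);  /* ...but not one that dies right here */

	post_send(c);
	return true;
}
```

Calling send_tsk_mgmt_sketch(&c, handle_disconnect) on a connected c
passes the test and still bumps sent_on_dead, which is the case the
up-front test cannot cover on its own.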

> I'd like to address this in a different way - see also the thread called 
> "SCSI LLDs, the SCSI error handler and host resource lifetime" on the 
> linux-scsi mailing list (November 20, 
> http://marc.info/?t=135342155500003&r=1).

I like the direction you propose there. It seems that scsi_remove_host()
at one point waited for the EH thread to exit -- or perhaps that was part
of the scsi_host_put() chain -- as there's the longstanding deferral to
the work queue for the SRP target removal. Of course, that's been there
for ~5 years now, and things have changed in the SCSI stack.

-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
