Re: [LSF/MM TOPIC] SCSI Error Handling and HBA Recovery

Benjamin Block <bblock@xxxxxxxxxxxxx> · Thu, 24 Jan 2019 09:16:41 +0100

On Wed, Jan 23, 2019 at 04:46:17PM -0800, Bart Van Assche wrote:
> Several SCSI low-level drivers need to suspend .queuecommand() calls while
> HBA or transport layer recovery happens. The iSCSI and SRP initiator drivers
> use scsi_target_block() to block new .queuecommand() calls while recovery
> happens. scsi_target_block() prevents that the block layer core triggers new
> .queuecommand() calls but does not prevent that the SCSI error handler calls
> .queuecommand(). SCSI LLD authors have the choice of either hoping that
> .queuecommand() calls from the SCSI error handler won't happen while transport
> layer recovery is in progress or to add code in the .queuecommand() function
> that detects from which context that call comes and to delay such
> .queuecommand() calls. In the SRP initiator driver that code looks as follows:
> 
> 	const bool in_scsi_eh = !in_interrupt() && current == shost->ehandler;
> 
> 	/*
> 	 * The SCSI EH thread is the only context from which srp_queuecommand()
> 	 * can get invoked for blocked devices (SDEV_BLOCK /
> 	 * SDEV_CREATED_BLOCK). Avoid racing with srp_reconnect_rport() by
> 	 * locking the rport mutex if invoked from inside the SCSI EH.
> 	 */
> 	if (in_scsi_eh)
> 		mutex_lock(&rport->mutex);
> 
> In my opinion the SCSI core should make it easy for LLD authors to prevent that
> the error handler calls .queuecommand() while transport layer recovery is in
> progress. So considerable time ago I posted several patches that modify the SCSI
> error handler and that avoid that SCSI LLDs have to detect the context a
> .queuecommand() call comes from. None of these patches were accepted and no 
> alternative approach was proposed. Hence the proposal to discuss this topic in
> person during LSF/MM.
> 
> See also "[PATCH 1/2] RDMA/srp: Avoid calling mutex_lock() from inside
> scsi_queue_rq()" (https://www.spinics.net/lists/linux-rdma/msg73842.html).
> 

Having SCSI EH run while transport recovery is in progress for the same
context is a bit of a pain in general. I remember seeing situations
like this with zFCP once or twice (about 2 years ago), especially when
SCSI EH tries to unblock commands on the same context that is just going
through transport recovery.

For example, EH wants to send a TUR to an rport that is going through
recovery right now. That TUR fails, because we can't physically service
it at that moment, so EH escalates, possibly with bad timing, until it
forces us through adapter recovery, which then faults all other rports
as well.
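
To make the escalation a bit more concrete, here is a toy user-space
model in C. It is only a sketch under heavy assumptions: none of the
names (toy_rport, toy_send_tur, toy_run_eh, the EH_* levels) exist in
the kernel, each EH step is assumed to take exactly one "tick", and the
real EH is considerably more involved. The wait_for_transport flag is
hypothetical as well; it just shows what deferring EH until transport
recovery has finished would change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not kernel APIs. */
enum eh_level {
	EH_ABORT,		/* mildest step */
	EH_LUN_RESET,
	EH_TARGET_RESET,
	EH_BUS_RESET,
	EH_HOST_RESET		/* adapter recovery: faults every rport */
};

struct toy_rport {
	int recovery_ticks_left;	/* > 0 while transport recovery runs */
};

/* A TUR can only be serviced once the rport has left recovery. */
static bool toy_send_tur(const struct toy_rport *rport)
{
	return rport->recovery_ticks_left == 0;
}

/* One tick of time passes; transport recovery makes progress. */
static void toy_tick(struct toy_rport *rport)
{
	if (rport->recovery_ticks_left > 0)
		rport->recovery_ticks_left--;
}

/*
 * Run the toy EH and return the level at which it stopped.
 * If wait_for_transport is true, EH first waits for transport
 * recovery to finish instead of probing a port that cannot answer.
 */
static enum eh_level toy_run_eh(struct toy_rport *rport,
				bool wait_for_transport)
{
	enum eh_level level;

	assert(rport != NULL);

	if (wait_for_transport)
		while (rport->recovery_ticks_left > 0)
			toy_tick(rport);

	for (level = EH_ABORT; level < EH_HOST_RESET; level++) {
		toy_tick(rport);	/* each EH step takes one tick */
		if (toy_send_tur(rport))
			return level;	/* TUR answered, stop escalating */
	}
	return EH_HOST_RESET;		/* nothing helped: adapter recovery */
}
```

In this model, an rport with recovery_ticks_left = 10 and
wait_for_transport = false ends up in EH_HOST_RESET, although the
transport recovery would have fixed things on its own. With
wait_for_transport = true the same rport stops at EH_ABORT. With only 2
ticks left and no waiting, EH stops at EH_LUN_RESET, which is the "bad
timing" case: how far EH escalates depends purely on when recovery
happens to finish.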

Having some more coordination between SCSI EH and transport recovery
here would be good.

-- 
With Best Regards, Benjamin Block      /      Linux on IBM Z Kernel Development
IBM Systems & Technology Group   /  IBM Deutschland Research & Development GmbH
Vorsitz. AufsR.: Matthias Hartmann       /      Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294