Re: fc_remote_port_delete and returning SCSI commands from LLD

James Smart <James.Smart@xxxxxxxxxx> · Wed, 21 Oct 2009 12:33:25 -0400

Here's what I remember about this from the past:

- This was originally added when dealing with older kernels that didn't have 
the eh patch that bounced the timeout handler when the rport was blocked (see 
fc_timed_out).

  The eh patch avoided entering the eh thread upon i/o timeouts if the rport 
was blocked.

- As mentioned in my prior email, there's a window where things can be 
entered before the target blocked state protects you. What if you are in the 
eh_handler when it occurs? Unfortunately, the eh thread is very black and 
white on abort/reset/io status: it's either success or not. It doesn't 
validate the "not" cases, never looks at retry conditions, and just assumes 
hard failure, which was taking everyone down bad paths. This is a rat's nest 
to resolve right, and I think I mentioned it on the list a long time ago with 
Christoph. Thus the stall was added to plug the hole.
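
For anyone without the transport source at hand, the timeout bounce from the
first point can be modelled in plain userspace C roughly as below. This is
only a sketch: the tmo_* names are made up for illustration and are not the
kernel's fc_timed_out or its return codes. The one thing taken from the text
above is that a timeout on a blocked rport re-arms the timer instead of
entering the eh thread.

```c
#include <assert.h>

/* Hypothetical stand-ins for the rport state and the timeout verdict. */
enum tmo_port_state { TMO_PORT_ONLINE, TMO_PORT_BLOCKED };
enum tmo_action { TMO_RESET_TIMER, TMO_ENTER_EH };

/* Decide what an i/o timeout does for a port in the given state. */
enum tmo_action tmo_decide(enum tmo_port_state state)
{
	assert(state == TMO_PORT_ONLINE || state == TMO_PORT_BLOCKED);

	/* Blocked rport: bounce the timeout, stay out of the eh thread. */
	if (state == TMO_PORT_BLOCKED)
		return TMO_RESET_TIMER;

	/* Otherwise the timeout escalates to error handling as usual. */
	return TMO_ENTER_EH;
}
```

Note that this covers only the timeout path. It says nothing about the window
described in the second point, where the eh thread is already running when
the rport becomes blocked.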

-- james s

Christof Schmitt wrote:
On Tue, Oct 20, 2009 at 04:40:27PM +0200, Christof Schmitt wrote:
If the remote_port status is not BLOCKED, this will trigger the SCSI
midlayer error handling which cannot do much during the interruption
to the hardware and will mark the SCSI devices 'offline'. In order to
prevent this, the rule would be: First call fc_remote_port_delete to
set the remote port (or in the case of an HBA interruption all remote
ports) to BLOCKED, and only after this step call scsi_done to pass the
SCSI commands back to the upper layers.
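
The ordering rule above can be illustrated with a small userspace model. This
is a sketch under stated assumptions, not driver code: every model_* name is
hypothetical, model_remote_port_delete() stands in for
fc_remote_port_delete() and models only the state change, and "offline" is
reduced to a flag that is set when a failed command is returned while the
port is not blocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_port_state { MODEL_PORT_ONLINE, MODEL_PORT_BLOCKED };

struct model_rport {
	enum model_port_state state;
	bool devices_offline;	/* set if a failure reached error handling */
};

/* Stand-in for fc_remote_port_delete(): only the state change is modelled. */
void model_remote_port_delete(struct model_rport *rport)
{
	assert(rport != NULL);
	rport->state = MODEL_PORT_BLOCKED;
}

/* Stand-in for returning one failed command via scsi_done. */
void model_return_failed_cmd(struct model_rport *rport)
{
	assert(rport != NULL);
	/* Not blocked: error handling runs and the devices go offline. */
	if (rport->state != MODEL_PORT_BLOCKED)
		rport->devices_offline = true;
}

/* The rule: block the port first, then hand the commands back. */
void model_handle_interruption(struct model_rport *rport, size_t ncmds)
{
	size_t i;

	model_remote_port_delete(rport);
	for (i = 0; i < ncmds; i++)
		model_return_failed_cmd(rport);
}
```

In this model, calling model_return_failed_cmd() on a port that is still
online sets devices_offline, while going through model_handle_interruption()
leaves it clear.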

I just stumbled across a loop that blocks the SCSI error handling
thread:

	spin_lock_irqsave(shost->host_lock, flags);
	while (rport->port_state == FC_PORTSTATE_BLOCKED) {
		spin_unlock_irqrestore(shost->host_lock, flags);
		msleep(1000);
		spin_lock_irqsave(shost->host_lock, flags);
	}
	spin_unlock_irqrestore(shost->host_lock, flags);

This seems to be popular among FC drivers. Is this the preferred way
to synchronize the FC transport class state changes with the SCSI
midlayer error recovery?

Christof
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
