On 06/12/2012 17:04, Bart Van Assche wrote:
> On 12/06/12 15:27, Or Gerlitz wrote:
>> The core problem here seems to be that scsi_remove_host simply never
>> ends.
>
> Hello Or,
>
> The later patches in the srp-ha patch series avoided such behavior by
> checking whether the connection between SRP initiator and target is
> unique, and by removing duplicate SCSI hosts for which the transport
> layer failed. Unfortunately these patches are still under review.
>
> Unless someone can come up with a better solution I will post a patch
> one of the next days that makes ib_srp again fail all commands after
> host removal started. That will avoid spending a long time doing error
> recovery.
>
> Also, you might have noticed that Hannes Reinecke reported a few days
> ago that the SCSI error handler may need a lot of time for other
> transport types - this behavior is not SRP specific.
I'm not sure what exactly you are referring to by duplicate SCSI hosts in
this context, or why we would have them. Again, at the time we took the
stack trace snapshot from the system, none of the SCSI EH threads were
active, so I'm also not sure about your comment regarding spending a long
time in the error recovery flow; the flow we ran into seems to simply wait
forever.
Or.