On 05/11/15 11:31, Christoph Hellwig wrote:
On Mon, May 11, 2015 at 10:54:30AM +0200, Bart Van Assche wrote:
Hello Christoph,
There are multiple events that can cause the SRP initiator driver to
initiate a reconnect:
1. The SCSI core invoking eh_host_reset_handler().
2. An error reported by the IB HCA or by the IB core, e.g. an RDMA
transmit timeout or a transport layer disconnect reported by the
IB/CM.
Right, I missed the srp_reconnect_work case. But even with that, I
think what I wrote above still stands: in that case
srp_reconnect_work would simply trigger the "abort all commands and
reconnect" operation directly.
The main point I was trying to make is that instead of having a sequence
of:
1) block new queuecommand instances
2) flush out pending queuecommand instances
3) do part of the disconnect
4) fail all in-flight commands
5) reconnect
we should aim for:
1) block new queuecommand instances
2) fail all in-flight commands
3) disconnect and reconnect
to avoid the need to keep track of pending queuecommand instances,
and instead re-use the existing infrastructure for failing all
in-flight commands, which we need to do anyway.
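
The shorter three-step sequence can be sketched as a small user-space
model. This is only an illustration under stated assumptions, not
ib_srp code: struct toy_host, the toy_* functions and MAX_CMDS are all
hypothetical names, and the model is single-threaded, so it does not
cover a queuecommand call that is still running on another CPU while
recovery starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CMDS 8	/* hypothetical number of command slots */

enum cmd_state { CMD_FREE, CMD_IN_FLIGHT, CMD_FAILED };

/* Hypothetical stand-in for per-host state; not a kernel structure. */
struct toy_host {
	bool blocked;			/* new submissions are rejected */
	bool connected;			/* transport connection is up */
	enum cmd_state cmds[MAX_CMDS];	/* one slot per command tag */
};

/* Model of queuecommand: returns 0 if accepted, -1 if rejected. */
int toy_queuecommand(struct toy_host *h, size_t tag)
{
	if (h->blocked || !h->connected || tag >= MAX_CMDS)
		return -1;
	h->cmds[tag] = CMD_IN_FLIGHT;
	return 0;
}

/* Step 1: block new queuecommand instances. */
static void toy_block(struct toy_host *h)
{
	h->blocked = true;
}

/* Step 2: fail every in-flight command; returns how many were failed. */
static size_t toy_fail_all_inflight(struct toy_host *h)
{
	size_t i, failed = 0;

	/* Only valid once no new commands can be submitted. */
	assert(h->blocked);
	for (i = 0; i < MAX_CMDS; i++) {
		if (h->cmds[i] == CMD_IN_FLIGHT) {
			h->cmds[i] = CMD_FAILED;
			failed++;
		}
	}
	return failed;
}

/* Step 3: disconnect and reconnect, then accept submissions again. */
static void toy_reconnect(struct toy_host *h)
{
	h->connected = false;	/* tear down the old connection */
	h->connected = true;	/* assume the reconnect succeeds */
	h->blocked = false;
}

/* The three steps in order; returns the number of failed commands. */
size_t toy_recover(struct toy_host *h)
{
	size_t failed;

	toy_block(h);
	failed = toy_fail_all_inflight(h);
	toy_reconnect(h);
	return failed;
}
```

In this model no list of pending queuecommand instances is kept:
recovery walks the command slots once, which is the role the existing
"fail all in-flight commands" infrastructure would play.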
Hello Christoph,
What I'm wondering about is whether the above approach makes it
possible to trigger path failover before (2 * SCSI timeout) has
expired. Starting SCSI error handling immediately after the block
layer has reported the first SCSI timeout is only safe if all ongoing
SCSI commands are canceled in some way. Is this what the function
blk_abort_request() is intended for? As far as I can see, invoking that
function or any function with a similar purpose is only safe after the
queuecommand() callback function has finished. However,
blk_mq_run_hw_queue() invokes mq_ops->queue_rq() without holding any
lock. So it's not clear to me how to safely cancel ongoing blk-mq
requests without waiting until these have timed out. I hope this
means that I have overlooked something?
Bart.