On 07/03/13 19:27, David Dillow wrote:
On Wed, 2013-07-03 at 18:00 +0200, Bart Van Assche wrote:
The combination of dev_loss_tmo off and reconnect_delay > 0 worked fine
in my tests. An I/O failure was detected shortly after the cable to the
target was pulled. I/O resumed shortly after the cable to the target was
reinserted.
Perhaps I don't understand your answer -- I'm asking about dev_loss_tmo
< 0, and fast_io_fail_tmo >= 0. The other transports do not allow this
scenario, and I'm asking if it makes sense for SRP to allow it.
But now that you mention reconnect_delay, what is the meaning of that
when it is negative? That's not in the documentation. And should it be
considered in srp_tmo_valid() -- are there values of reconnect_delay
that cause problems?
None of the combinations that can be configured from user space can
get the kernel into trouble. If reconnect_delay <= 0, that means that
the time-based reconnect mechanism is disabled.
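
To illustrate the rule just stated, here is a minimal sketch in C. The
helper name reconnect_timer_enabled() is hypothetical and is not taken from
the patch; it only encodes the statement that reconnect_delay <= 0 turns
the time-based reconnect mechanism off.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch under discussion.
 * Returns true only if periodic reconnect attempts should be scheduled;
 * a reconnect_delay of zero or less means time-based reconnecting is
 * disabled.
 */
static bool reconnect_timer_enabled(int reconnect_delay)
{
	return reconnect_delay > 0;
}
```

With this reading, 0 and every negative value behave identically: no
reconnect timer is armed.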
I'm starting to get a bit concerned about this patch -- can you, Vu, and
Sebastian comment on the testing you have done?
All combinations of reconnect_delay, fast_io_fail_tmo and dev_loss_tmo
that result in different behavior have been tested.
Also, FC caps dev_loss_tmo at SCSI_DEVICE_BLOCK_MAX_TIMEOUT if
fast_io_fail_tmo is off; I agree with your reasoning about leaving it
unlimited if fast fail is on, but does that still hold if it is off?
I think setting dev_loss_tmo to a large value only makes sense if the
value of reconnect_delay is not too large. Setting both to a large value
would result in slow recovery after a transport layer failure has been
corrected.
So you agree it should be capped? I can't tell from your response.
Not all combinations of reconnect_delay / fast_io_fail_tmo /
dev_loss_tmo result in useful behavior. It is up to the user to choose a
meaningful combination.
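
For reference, the FC-style cap raised earlier in this thread could be
sketched as follows. This is only an illustration of that one rule, not the
srp_tmo_valid() from the patch: the function name tmo_cap_check() is
hypothetical, negative values are assumed to mean "off", and the constant
is redefined locally as a stand-in for the kernel definition (assumed here
to be 600 seconds).

```c
#include <errno.h>

/* Local stand-in for the kernel constant; value assumed, in seconds. */
#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT 600

/*
 * Hypothetical check, for illustration only. Negative values mean the
 * corresponding timeout is off. If fast_io_fail_tmo is off, dev_loss_tmo
 * is capped at SCSI_DEVICE_BLOCK_MAX_TIMEOUT; if fast_io_fail_tmo is on,
 * dev_loss_tmo is left unlimited.
 */
static int tmo_cap_check(int fast_io_fail_tmo, int dev_loss_tmo)
{
	if (fast_io_fail_tmo < 0 &&
	    dev_loss_tmo > SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
		return -EINVAL;
	return 0;
}
```

Under these assumptions, tmo_cap_check(-1, 601) is rejected with -EINVAL,
while tmo_cap_check(5, 100000) and tmo_cap_check(-1, -1) are accepted.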
Bart.