Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/14/13 19:59, Vu Pham wrote:
On 06/13/13 21:43, Vu Pham wrote:
+/**
+ * srp_tmo_valid() - check timeout combination validity
+ *
+ * If no fast I/O fail timeout has been configured then the device
loss timeout
+ * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail
timeout has
+ * been configured then it must be below the device loss timeout.
+ */
+int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
+{
+    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
+        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+        || (0 <= fast_io_fail_tmo &&
+            (dev_loss_tmo < 0 ||
+             (fast_io_fail_tmo < dev_loss_tmo &&
+              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
+}
+EXPORT_SYMBOL_GPL(srp_tmo_valid);
fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
negative value
dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
negative value

OK, will update the documentation such that it correctly refers to
"off" instead of a negative value and I will also mention that
dev_loss_tmo can now be disabled.

It's not only the documentation but also the code logic, you cannot turn
dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa
with the return statement above.

Does this mean that you think it would be useful to disable both the fast_io_fail and the dev_loss mechanisms, and hence rely on the user to remove remote ports that have disappeared and on the SCSI command timeout to detect path failures ? I'll start testing this to see whether that combination does not trigger any adverse behavior.

If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
to do extra block with scsi_block_requests()

Please keep in mind that srp_reconnect_rport() can be called from two
different contexts: that function can not only be called from inside
the SRP transport layer but also from inside the SCSI error handler
(see also the srp_reset_device() modifications in a later patch in
this series). If this function is invoked from the context of the SCSI
error handler the chance is high that the SCSI device will have
another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
this function.
Yes, srp_reconnect_rport() can be called from two contexts; however, it
deals with same rport & rport's state.
I'm thinking something like this:

    if (rport->state != SRP_RPORT_BLOCKED) {
     scsi_block_requests(shost);

Sorry but I'm afraid that that approach would still allow the user to unblock one or more SCSI devices via sysfs during the i->f->reconnect(rport) call, something we do not want.

I think that we can use only the pair
scsi_block_requests()/scsi_unblock_requests() unless the advantage of
multipathd recognizing the SDEV_BLOCK is noticeable.

I think the advantage of multipathd recognizing the SDEV_BLOCK state before the fast_io_fail_tmo timer has expired is important. Multipathd does not queue I/O to paths that are in the SDEV_BLOCK state so setting that state helps I/O to fail over more quickly, especially for large values of fast_io_fail_tmo.

Hope this helps,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux