On 06/15/2013 11:52 AM, Bart Van Assche wrote:
> On 06/14/13 19:59, Vu Pham wrote:
>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>> +/**
>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>> + *
>>>>> + * If no fast I/O fail timeout has been configured then the device loss timeout
>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O fail timeout has
>>>>> + * been configured then it must be below the device loss timeout.
>>>>> + */
>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>> +{
>>>>> +	return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>> +		dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>> +		|| (0 <= fast_io_fail_tmo &&
>>>>> +		    (dev_loss_tmo < 0 ||
>>>>> +		     (fast_io_fail_tmo < dev_loss_tmo &&
>>>>> +		      dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>>> negative value
>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>>> negative value
>>>
>>> OK, will update the documentation such that it correctly refers to
>>> "off" instead of a negative value and I will also mention that
>>> dev_loss_tmo can now be disabled.
>>>
>> It's not only the documentation but also the code logic, you cannot turn
>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice versa
>> with the return statement above.
>
> Does this mean that you think it would be useful to disable both the
> fast_io_fail and the dev_loss mechanisms, and hence rely on the user
> to remove remote ports that have disappeared and on the SCSI command
> timeout to detect path failures ? I'll start testing this to see
> whether that combination does not trigger any adverse behavior.
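
To make the point about the return statement concrete, here is a minimal user-space sketch of the quoted check. It is only an illustration under stated assumptions: the values chosen for SCSI_DEVICE_BLOCK_MAX_TIMEOUT and HZ are stand-ins for whatever the kernel headers and configuration provide, a negative value is taken to mean "off", and srp_tmo_valid_both_off() is a hypothetical variant that is not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Assumed stand-ins; the real values come from kernel headers and config. */
#define SCSI_DEVICE_BLOCK_MAX_TIMEOUT 600
#define HZ 1000

/* Same logic as the quoted patch; a negative timeout is assumed to mean "off". */
int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
{
	int ok = (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
		  dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
		|| (0 <= fast_io_fail_tmo &&
		    (dev_loss_tmo < 0 ||
		     (fast_io_fail_tmo < dev_loss_tmo &&
		      dev_loss_tmo < LONG_MAX / HZ)));

	return ok ? 0 : -EINVAL;
}

/* Hypothetical variant (not in the patch) that also accepts "both off". */
int srp_tmo_valid_both_off(int fast_io_fail_tmo, int dev_loss_tmo)
{
	if (fast_io_fail_tmo < 0 && dev_loss_tmo < 0)
		return 0;
	return srp_tmo_valid(fast_io_fail_tmo, dev_loss_tmo);
}

/* Walks through the combinations discussed in this thread. */
void srp_tmo_examples(void)
{
	/* Only fast_io_fail off: dev_loss_tmo must be within 1..MAX. */
	assert(srp_tmo_valid(-1, 600) == 0);
	assert(srp_tmo_valid(-1, 601) == -EINVAL);
	/* Only dev_loss off: accepted for any non-negative fast_io_fail_tmo. */
	assert(srp_tmo_valid(5, -1) == 0);
	/* Both on: fast_io_fail_tmo must be strictly below dev_loss_tmo. */
	assert(srp_tmo_valid(5, 30) == 0);
	assert(srp_tmo_valid(30, 5) == -EINVAL);
	/* Both off: rejected by the quoted expression. */
	assert(srp_tmo_valid(-1, -1) == -EINVAL);
	/* The hypothetical variant accepts it. */
	assert(srp_tmo_valid_both_off(-1, -1) == 0);
}
```

Calling srp_tmo_examples() from a small main() exercises each case. Whether accepting the both-off combination is actually safe is exactly the open question above, so the variant only shows where the logic would change.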
>
>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we need
>>>> to do extra block with scsi_block_requests()
>>>
>>> Please keep in mind that srp_reconnect_rport() can be called from two
>>> different contexts: that function can not only be called from inside
>>> the SRP transport layer but also from inside the SCSI error handler
>>> (see also the srp_reset_device() modifications in a later patch in
>>> this series). If this function is invoked from the context of the SCSI
>>> error handler the chance is high that the SCSI device will have
>>> another state than SDEV_BLOCK. Hence the scsi_block_requests() call in
>>> this function.
>> Yes, srp_reconnect_rport() can be called from two contexts; however, it
>> deals with same rport & rport's state.
>> I'm thinking something like this:
>>
>> if (rport->state != SRP_RPORT_BLOCKED) {
>>	scsi_block_requests(shost);
>
> Sorry but I'm afraid that that approach would still allow the user
> to unblock one or more SCSI devices via sysfs during the
> i->f->reconnect(rport) call, something we do not want.
>
>> I think that we can use only the pair
>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>> multipathd recognizing the SDEV_BLOCK is noticeable.
>
> I think the advantage of multipathd recognizing the SDEV_BLOCK state
> before the fast_io_fail_tmo timer has expired is important.
> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
> state so setting that state helps I/O to fail over more quickly,
> especially for large values of fast_io_fail_tmo.
>
Sadly it doesn't work that way.

SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
path, but there still will be I/O queued on that path.
For these multipath _has_ to wait for I/O completion.
And as it turns out, in most cases the application itself will wait
for completion of these I/Os before it continues sending more I/O.
So in effect multipath would queue new I/O to other paths, but won't
_receive_ new I/O as the upper layers are still waiting for
completion of the queued I/O.

The only way to get fast failover with multipathing is to set
fast_io_fail to a _LOW_ value (e.g. 5 seconds), as this will terminate
the outstanding I/Os.

Large values of fast_io_fail will almost guarantee sluggish I/O
failover.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
hare@xxxxxxx                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)