Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling

Hannes Reinecke <hare@xxxxxxx> · Mon, 17 Jun 2013 08:18:40 +0200

On 06/15/2013 11:52 AM, Bart Van Assche wrote:
> On 06/14/13 19:59, Vu Pham wrote:
>>> On 06/13/13 21:43, Vu Pham wrote:
>>>>> +/**
>>>>> + * srp_tmo_valid() - check timeout combination validity
>>>>> + *
>>>>> + * If no fast I/O fail timeout has been configured then the
>>>>> device
>>>>> loss timeout
>>>>> + * must be below SCSI_DEVICE_BLOCK_MAX_TIMEOUT. If a fast I/O
>>>>> fail
>>>>> timeout has
>>>>> + * been configured then it must be below the device loss timeout.
>>>>> + */
>>>>> +int srp_tmo_valid(int fast_io_fail_tmo, int dev_loss_tmo)
>>>>> +{
>>>>> +    return (fast_io_fail_tmo < 0 && 1 <= dev_loss_tmo &&
>>>>> +        dev_loss_tmo <= SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
>>>>> +        || (0 <= fast_io_fail_tmo &&
>>>>> +            (dev_loss_tmo < 0 ||
>>>>> +             (fast_io_fail_tmo < dev_loss_tmo &&
>>>>> +              dev_loss_tmo < LONG_MAX / HZ))) ? 0 : -EINVAL;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(srp_tmo_valid);
>>>> fast_io_fail_tmo is off, one cannot turn off dev_loss_tmo with
>>>> negative value
>>>> dev_loss_tmo is off, one cannot turn off fast_io_fail_tmo with
>>>> negative value
>>>
>>> OK, will update the documentation such that it correctly refers to
>>> "off" instead of a negative value and I will also mention that
>>> dev_loss_tmo can now be disabled.
>>>
>> It's not only the documentation but also the code logic, you
>> cannot turn
>> dev_loss_tmo off if fast_io_fail_tmo already turned off and vice
>> versa
>> with the return statement above.
> 
> Does this mean that you think it would be useful to disable both the
> fast_io_fail and the dev_loss mechanisms, and hence rely on the user
> to remove remote ports that have disappeared and on the SCSI command
> timeout to detect path failures ? I'll start testing this to see
> whether that combination does not trigger any adverse behavior.
> 
>>>> If rport's state is already SRP_RPORT_BLOCKED, I don't think we
>>>> need
>>>> to do extra block with scsi_block_requests()
>>>
>>> Please keep in mind that srp_reconnect_rport() can be called from
>>> two
>>> different contexts: that function can not only be called from inside
>>> the SRP transport layer but also from inside the SCSI error handler
>>> (see also the srp_reset_device() modifications in a later patch in
>>> this series). If this function is invoked from the context of the
>>> SCSI
>>> error handler the chance is high that the SCSI device will have
>>> another state than SDEV_BLOCK. Hence the scsi_block_requests()
>>> call in
>>> this function.
>> Yes, srp_reconnect_rport() can be called from two contexts;
>> however, it
>> deals with same rport & rport's state.
>> I'm thinking something like this:
>>
>>     if (rport->state != SRP_RPORT_BLOCKED) {
>>      scsi_block_requests(shost);
> 
> Sorry but I'm afraid that that approach would still allow the user
> to unblock one or more SCSI devices via sysfs during the
> i->f->reconnect(rport) call, something we do not want.
> 
>> I think that we can use only the pair
>> scsi_block_requests()/scsi_unblock_requests() unless the advantage of
>> multipathd recognizing the SDEV_BLOCK is noticeable.
> 
> I think the advantage of multipathd recognizing the SDEV_BLOCK state
> before the fast_io_fail_tmo timer has expired is important.
> Multipathd does not queue I/O to paths that are in the SDEV_BLOCK
> state so setting that state helps I/O to fail over more quickly,
> especially for large values of fast_io_fail_tmo.
> 
Sadly it doesn't work that way.

SDEV_BLOCK will instruct multipath to not queue _new_ I/Os to the
path, but there still will be I/O queued on that path.
For these multipath _has_ to wait for I/O completion.
And as it turns out, in most cases the application itself will wait
for completion on these I/O before continue sending more I/O.
So in effect multipath would queue new I/O to other paths, but won't
_receive_ new I/O as the upper layers are still waiting for
completion of the queued I/O.

The only way to excite fast failover with multipathing is to set
fast_io_fail to a _LOW_ value (eg 5 seconds), as this will terminate
the outstanding I/Os.

Large values of fast_io_fail will almost guarantee sluggish I/O
failover.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html