Re: [PATCH 1/2] nbd: enable replace socket if only one connection is configured

Mike Christie <mchristi@xxxxxxxxxx> · Tue, 3 Mar 2020 15:48:47 -0600

On 03/03/2020 03:12 PM, Josef Bacik wrote:
> On 2/28/20 1:40 AM, Hou Pu wrote:
>> Nbd server with multiple connections could be upgraded since
>> 560bc4b (nbd: handle dead connections). But if only one conncection
>> is configured, after we take down nbd server, all inflight IO
>> would finally timeout and return error. We could requeue them
>> like what we do with multiple connections and wait for new socket
>> in submit path.
>>
>> Signed-off-by: Hou Pu <houpu@xxxxxxxxxxxxx>
>> ---
>>   drivers/block/nbd.c | 17 +++++++++--------
>>   1 file changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 78181908f0df..83070714888b 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -395,16 +395,19 @@ static enum blk_eh_timer_return
>> nbd_xmit_timeout(struct request *req,
>>       }
>>       config = nbd->config;
>>   -    if (config->num_connections > 1) {
>> +    if (config->num_connections > 1 ||
>> +        (config->num_connections == 1 && nbd->tag_set.timeout)) {
> 
> This is every connection, do you mean to couple this with
> dead_conn_timeout? Thanks,
> 

In

commit 2da22da573481cc4837e246d0eee4d518b3f715e
Author: Mike Christie <mchristi@xxxxxxxxxx>
Date:   Tue Aug 13 11:39:52 2019 -0500

    nbd: fix zero cmd timeout handling v2

we can set tag_set.timeout=0 again.

So if timeout != 0 and num_connections = 1, we requeue here and let
nbd_handle_cmd->wait_for_reconnect decide to wait or fail the command if
dead_conn_timeout is not set.

If timeout = 0, then we give it more time because it might have just
been a slow server or connection. I think this behavior is wrong for the
case Hou is fixing. See comment in the next patch.