On 8/25/20 7:51 PM, Hou Pu wrote: > > > On 2020/8/26 1:29 上午, Josef Bacik wrote: >> On 8/25/20 4:27 AM, Hou Pu wrote: >>> >>> >>> On 2020/8/24 10:02 PM, Josef Bacik wrote: >>>> On 8/23/20 11:23 PM, Hou Pu wrote: >>>>> >>>>> >>>>> On 2020/8/21 9:57 PM, Josef Bacik wrote: >>>>>> On 8/21/20 3:21 AM, Hou Pu wrote: >>>>>>> >>>>>>> >>>>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote: >>>>>>>> On 8/10/20 8:00 AM, Hou Pu wrote: >>>>>>>>> If we configured io timeout of nbd0 to 100s. Later after we >>>>>>>>> finished using it, we configured nbd0 again and set the io >>>>>>>>> timeout to 0. We expect it would timeout after 30 seconds >>>>>>>>> and keep retry. But in fact we could not change the timeout >>>>>>>>> when we set it to 0. the timeout is still the original 100s. >>>>>>>>> >>>>>>>>> So change the timeout to default 30s when we set it to zero. >>>>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero >>>>>>>>> cmd timeout handling v2"). >>>>>>>>> >>>>>>>>> It becomes more important if we were reconfigure a nbd device >>>>>>>>> and the io timeout it set to zero. Because it could take 30s >>>>>>>>> to detect the new socket and thus io could be completed more >>>>>>>>> quickly compared to 100s. >>>>>>>>> >>>>>>>>> Signed-off-by: Hou Pu <houpu@xxxxxxxxxxxxx> >>>>>>>>> --- >>>>>>>>> drivers/block/nbd.c | 2 ++ >>>>>>>>> 1 file changed, 2 insertions(+) >>>>>>>>> >>>>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c >>>>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644 >>>>>>>>> --- a/drivers/block/nbd.c >>>>>>>>> +++ b/drivers/block/nbd.c >>>>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct >>>>>>>>> nbd_device *nbd, u64 timeout) >>>>>>>>> nbd->tag_set.timeout = timeout * HZ; >>>>>>>>> if (timeout) >>>>>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ); >>>>>>>>> + else >>>>>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ); >>>>>>>>> } >>>>>>>>> /* Must be called with config_lock held */ >>>>>>>>> >>>>>>>> >>>>>>>> What about the tag_set.timeout? Thanks, >>>>>>> >>>>>>> I think user space could set io timeout to 0, thus we set >>>>>>> tag_set.timeout = 0 here and also we should tell the block layer >>>>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0 >>>>>>> imply 30s io timeout and retrying after timeout. >>>>>>> >>>>>>> (Sorry, I am not sure if I understand your question here. Could >>>>>>> you explain a little more if needed?) >>>>>>> >>>>>> >>>>>> I misunderstood what I was using the tagset timeout for. We don't >>>>>> want this here, if we're dropping a config for an nbd device and we >>>>>> want to reset it to defaults then we need to add this to >>>>>> nbd_config_put(). Thanks, >>>>> >>>>> AFAIK If we killed a nbd server, then restarted it and reconfigured >>>>> the nbd socket, I think we might not reconfigure IO timeout to 0 since >>>>> nbd_config_put() is not called in such case. So could we still >>>>> restore default timeout here. Or am I missing something? >>>>> >>>> >>>> If you kill the NBD server then the config is going to be dropped and >>>> need to be reconfigured, so nbd_config_put() will definitely be >>>> called. The only case it wouldn't be is if you are using the netlink >>>> interface, in which case the device is going to keep all of its >>>> original settings. Are you not seeing the final nbd_config_put() >>>> being done when you kill the nbd server? That seems like a bug if >>>> not, and that should be fixed, and then this timeout thing going in >>>> there will fix your issue. Thanks, >>> >>> I was using the netlink interface. So I could use the reconnect >>> feature to update the nbd server without impacting the user of >>> nbd device. >>> >>> I did not see the final nbd_config_put() when I killed the nbd server. >>> After I killed the nbd server, the recv_work() put 1 config_ref. >>> Another ref count is still held by nbd_genl_connect(). I thought it >>> was as expected. >>> >>> Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should >>> not be zero by: >>> if (!refcount_inc_not_zero(&nbd->config_refs)) { >>> dev_err(nbd_to_dev(nbd), >>> "not configured, cannot reconfigure\n"); >>> nbd_put(nbd); >>> return -EINVAL; >>> } >>> So AFAIK this behavior is as expected. >> >> Ahh ok I see what you're getting at. Ok I agree, you can add >> >> Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx> > > Thanks for your review, Applied, thanks. -- Jens Axboe