Re: [PATCH 1/2] nvme: pci: simplify timeout handling

Ming Lei <tom.leiming@xxxxxxxxx> · Sun, 29 Apr 2018 09:36:43 +0800

On Sun, Apr 29, 2018 at 6:27 AM, Ming Lei <tom.leiming@xxxxxxxxx> wrote:
> On Sun, Apr 29, 2018 at 5:57 AM, Ming Lei <tom.leiming@xxxxxxxxx> wrote:
>> On Sat, Apr 28, 2018 at 10:00 PM, jianchao.wang
>> <jianchao.w.wang@xxxxxxxxxx> wrote:
>>> Hi Ming
>>>
>>> On 04/27/2018 10:57 PM, Ming Lei wrote:
>>>> I may not understand your point. Once blk_sync_queue() returns, the
>>>> timer itself is deactivated; meanwhile, the synced .nvme_timeout() only
>>>> returns EH_NOT_HANDLED before the deactivation.
>>>>
>>>> That means this timer won't expire any more, so could you explain
>>>> a bit why a timeout can come again after blk_sync_queue() returns?
>>>
>>> Please consider the following case:
>>>
>>> blk_sync_queue
>>>   -> del_timer_sync
>>>                           blk_mq_timeout_work
>>>                             -> blk_mq_check_expired // return the timeout value
>>>                             -> blk_mq_terminate_expired
>>>                               -> .timeout //return EH_NOT_HANDLED
>>>                             -> mod_timer // setup the timer again based on the result of blk_mq_check_expired
>>>   -> cancel_work_sync
>>> So after blk_sync_queue() returns, the timer may be armed again, and then the timeout work may run again.
>>
>> OK, I was trying to avoid using blk_abort_request(), but it looks like we
>> may have to depend on it or something similar.
>>
>> BTW, that means blk_sync_queue() has been broken, including its use
>> in blk_cleanup_queue().
>>
>> Another approach is to introduce a percpu_ref,
>> 'q->timeout_usage_counter', for syncing the timeout. That seems a bit
>> overkill too, but it is simpler in both theory and implementation.
>
> Or one timeout_mutex is enough.

Turns out it is SRCU.
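
For reference, the interleaving jianchao describes above can be replayed in
a small userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, the "timer" and "work" are plain flags, and it assumes
the timeout work is already running when blk_sync_queue() starts. It does
not model SRCU or any other fix; it only shows why the timer can still be
armed after the sync returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the queue state; the names are illustrative only. */
struct toy_queue {
	bool timer_armed;   /* models the queue timeout timer being pending */
	bool work_running;  /* models the timeout work being mid-execution */
};

/* Models del_timer_sync(): disarms the timer, does not wait for the work. */
static void toy_del_timer_sync(struct toy_queue *q)
{
	q->timer_armed = false;
}

/*
 * Models the tail of the timeout work: .timeout returned EH_NOT_HANDLED,
 * so the work re-arms the timer (mod_timer) before it finishes.
 */
static void toy_timeout_work_finish(struct toy_queue *q)
{
	q->timer_armed = true;
	q->work_running = false;
}

/* Models cancel_work_sync(): waits for an already running work to finish. */
static void toy_cancel_work_sync(struct toy_queue *q)
{
	if (q->work_running)
		toy_timeout_work_finish(q);
}

/*
 * Replays the sequence from the thread and reports whether the timer is
 * armed once the modelled blk_sync_queue() has returned.
 */
static bool toy_timer_armed_after_sync(void)
{
	/* The timer has fired and the work is in flight. */
	struct toy_queue q = { .timer_armed = false, .work_running = true };

	toy_del_timer_sync(&q);    /* nothing pending, so nothing to disarm */
	toy_cancel_work_sync(&q);  /* the work completes and re-arms the timer */

	return q.timer_armed;
}
```

In this model toy_timer_armed_after_sync() returns true, i.e. the timer is
pending again even though both sync steps have completed.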

-- 
Ming Lei