Re: [PATCH] block: Fix a race between blk_cleanup_queue() and timeout handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/30/2017 12:37 PM, Bart Van Assche wrote:
> On Mon, 2017-10-30 at 12:16 -0600, Jens Axboe wrote:
>> On 10/19/2017 11:00 AM, Bart Van Assche wrote:
>>> Make sure that if the timeout timer fires after a queue has been
>>> marked "dying" that the affected requests are finished.
>>>
>>> Reported-by: chenxiang (M) <chenxiang66@xxxxxxxxxxxxx>
>>> Fixes: commit 287922eb0b18 ("block: defer timeouts to a workqueue")
>>> Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxx>
>>> Tested-by: chenxiang (M) <chenxiang66@xxxxxxxxxxxxx>
>>> Cc: Christoph Hellwig <hch@xxxxxx>
>>> Cc: Keith Busch <keith.busch@xxxxxxxxx>
>>> Cc: Hannes Reinecke <hare@xxxxxxxx>
>>> Cc: Ming Lei <ming.lei@xxxxxxxxxx>
>>> Cc: Johannes Thumshirn <jthumshirn@xxxxxxx>
>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>> ---
>>>  block/blk-core.c    | 2 ++
>>>  block/blk-timeout.c | 3 ---
>>>  2 files changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index e8e149ccc86b..bb4fce694a60 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -333,6 +333,7 @@ EXPORT_SYMBOL(blk_stop_queue);
>>>  void blk_sync_queue(struct request_queue *q)
>>>  {
>>>  	del_timer_sync(&q->timeout);
>>> +	cancel_work_sync(&q->timeout_work);
>>>  
>>>  	if (q->mq_ops) {
>>>  		struct blk_mq_hw_ctx *hctx;
>>> @@ -844,6 +845,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
>>>  	setup_timer(&q->backing_dev_info->laptop_mode_wb_timer,
>>>  		    laptop_mode_timer_fn, (unsigned long) q);
>>>  	setup_timer(&q->timeout, blk_rq_timed_out_timer, (unsigned long) q);
>>> +	INIT_WORK(&q->timeout_work, NULL);
>>>  	INIT_LIST_HEAD(&q->queue_head);
>>>  	INIT_LIST_HEAD(&q->timeout_list);
>>>  	INIT_LIST_HEAD(&q->icq_list);
>>
>> This part looks like a no-brainer.
>>
>>> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
>>> index e3e9c9771d36..764ecf9aeb30 100644
>>> --- a/block/blk-timeout.c
>>> +++ b/block/blk-timeout.c
>>> @@ -134,8 +134,6 @@ void blk_timeout_work(struct work_struct *work)
>>>  	struct request *rq, *tmp;
>>>  	int next_set = 0;
>>>  
>>> -	if (blk_queue_enter(q, true))
>>> -		return;
>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>>  
>>>  	list_for_each_entry_safe(rq, tmp, &q->timeout_list, timeout_list)
>>> @@ -145,7 +143,6 @@ void blk_timeout_work(struct work_struct *work)
>>>  		mod_timer(&q->timeout, round_jiffies_up(next));
>>>  
>>>  	spin_unlock_irqrestore(q->queue_lock, flags);
>>> -	blk_queue_exit(q);
>>>  }
>>
>> And this should be fine too, if we have requests that timeout (as they
>> hold a queue enter reference). Is it safe if there are no requests left?
>> Previously we would fail the enter if the queue was dying, now we won't.
>>
>> Doesn't look required for the change, should probably be a separate
>> patch.
> 
> Hello Jens,
> 
> This patch was developed as follows:
> - In order to avoid that request timeouts do not get processed, I removed
>   the blk_queue_enter() and blk_queue_exit() calls from blk_timeout_work().
> - To avoid that calling blk_timeout_work() while a queue is dying would
>   cause trouble, I added the cancel_work_sync() call in blk_sync_queue().
> - After I noticed that the cancel_work_sync() call added by this patch
>   triggers a kernel warning when unloading the brd driver I added the
>   INIT_WORK() call.
> 
> I hope this makes it clear that removing the blk_queue_enter() and
> blk_queue_exit() calls is essential and also that all changes in this patch
> are necessary.

2/3 was the first part that I have absolutely no concerns with, looks
fine. For #1, I guess the issue is exactly that if the queue is dying
AND we have requests pending for timeout, we can't let that make us
fail to enter the timeout handling. With the sync of the workqueue
timeout handler, this does look safe.

I'll apply it for 4.15, thanks Bart.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux