Re: [PATCH 0/5] blk-mq: allow to run queue if queue refcount is held

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 4/1/19 1:16 PM, Ming Lei wrote:
> Hi Dongli,
> 
> On Mon, Apr 01, 2019 at 01:05:46PM +0800, Dongli Zhang wrote:
>>
>>
>> On 4/1/19 10:52 AM, Ming Lei wrote:
>>> On Sun, Mar 31, 2019 at 07:39:17PM -0700, Bart Van Assche wrote:
>>>> On 3/31/19 7:00 PM, Ming Lei wrote:
>>>>> On Sun, Mar 31, 2019 at 08:27:35AM -0700, Bart Van Assche wrote:
>>>>>> I'm not sure the approach of this patch series is really the direction we
>>>>>> should pursue. There are many block driver that free resources immediately
>>>>>
>>>>> Please see scsi_run_queue(), and the queue refcount is always held
>>>>> before run queue.
>>>>
>>>> That's not correct. There is no guarantee that q->q_usage_counter > 0 when
>>>> scsi_run_queue() is called from inside scsi_requeue_run_queue().
>>>
>>> We don't need the guarantee of 'q->q_usage_counter > 0', I mean the
>>> queue's kobj reference counter.
>>>
>>> What we need is to allow run queue to work correctly after queue is frozen
>>> or cleaned up.
>>>
>>>>
>>>>>> I'd like to avoid having to modify all block drivers that free resources
>>>>>> immediately after blk_cleanup_queue() has returned. Have you considered to
>>>>>> modify blk_mq_run_hw_queues() such that it becomes safe to call that
>>>>>> function while blk_cleanup_queue() is in progress, e.g. by inserting a
>>>>>> percpu_ref_tryget_live(&q->q_usage_counter) /
>>>>>> percpu_ref_put(&q->q_usage_counter) pair?
>>>>>
>>>>> It can't work because blk_mq_run_hw_queues may happen after
>>>>> percpu_ref_exit() is done.
>>>>>
>>>>> However, if we move percpu_ref_exit() into queue's release handler, we
>>>>> don't need to grab q->q_usage_counter any more in blk_mq_run_hw_queues(),
>>>>> and we still have to free hw queue resources in queue's release handler,
>>>>> that is exactly what this patchset is doing.
>>>>>
>>>>> In short, getting q->q_usage_counter doesn't make a difference on this
>>>>> issue.
>>>>
>>>> percpu_ref_tryget_live() fails if a per-cpu counter is in the "dead" state.
>>>> percpu_ref_kill() changes the state of a per-cpu counter to the "dead"
>>>> state. blk_freeze_queue_start() calls percpu_ref_kill(). blk_cleanup_queue()
>>>> already calls blk_set_queue_dying() and that last function calls
>>>> blk_freeze_queue_start(). So I think that what you wrote is not correct and
>>>> that inserting a percpu_ref_tryget_live()/percpu_ref_put() pair in
>>>> blk_mq_run_hw_queues() or blk_mq_run_hw_queue() would make a difference and
>>>> also that moving the percpu_ref_exit() call into blk_release_queue() makes
>>>> sense.
>>>
>>> If percpu_ref_exit() is moved to blk_release_queue(), we still need to
>>> move freeing of hw queue's resource into blk_release_queue() like what
>>> the patchset is doing.
>>
>> Hi Ming,
>>
>> Would you mind help explain why we still need to move freeing of hw queue's
>> resource into blk_release_queue() like what the patchset is doing?
>>
>> Let's assume there is no deadlock when percpu_ref_tryget_live() is used,
> 
> Could you explain why the assumption is true?
> 
> We have to run queue after starting to freeze queue for draining
> allocated requests and making forward progress. Inside blk_freeze_queue_start(),
> percpu_ref_kill() marks this ref as DEAD, then percpu_ref_tryget_live() returns
> false, then queue won't be run.

Hi Ming,

I understand the assumption is invalid and there is issue when using
percpu_ref_tryget_live. And I also understand we have to run queue after
starting to freeze queue for draining allocated requests and making forward
progress.


I am just wondering specifically on why "If percpu_ref_exit() is moved to
blk_release_queue(), we still need to move freeing of hw queue's resource into
blk_release_queue() like what the patchset is doing." based on below Bart's
statement:

"percpu_ref_tryget_live() fails if a per-cpu counter is in the "dead" state.
percpu_ref_kill() changes the state of a per-cpu counter to the "dead" state.
blk_freeze_queue_start() calls percpu_ref_kill(). blk_cleanup_queue() already
calls blk_set_queue_dying() and that last function call
blk_freeze_queue_start(). So I think that what you wrote is not correct and that
inserting a percpu_ref_tryget_live()/percpu_ref_put() pair in
blk_mq_run_hw_queues() or blk_mq_run_hw_queue() would make a difference and also
that moving the percpu_ref_exit() call into blk_release_queue() makes sense."

That's is, what is penalty if we do not  move freeing of hw queue's resource
into blk_release_queue() like what the patchset is doing in above situation?

I ask this question just because I would like to better understand the source
code. Does "hw queue's resource" indicate the below?

+        if (hctx->flags & BLK_MQ_F_BLOCKING)
+                cleanup_srcu_struct(hctx->srcu);
+        blk_free_flush_queue(hctx->fq);
+        sbitmap_free(&hctx->ctx_map);

Thank you very much!

Dongli Zhang



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux