Re: [PATCH 0/5] blk-mq: allow to run queue if queue refcount is held

On Sun, Mar 31, 2019 at 07:39:17PM -0700, Bart Van Assche wrote:
> On 3/31/19 7:00 PM, Ming Lei wrote:
> > On Sun, Mar 31, 2019 at 08:27:35AM -0700, Bart Van Assche wrote:
> > > I'm not sure the approach of this patch series is really the direction we
> > > should pursue. There are many block drivers that free resources immediately
> > 
> > Please see scsi_run_queue(), and the queue refcount is always held
> > before run queue.
> 
> That's not correct. There is no guarantee that q->q_usage_counter > 0 when
> scsi_run_queue() is called from inside scsi_requeue_run_queue().
> 
> > > I'd like to avoid having to modify all block drivers that free resources
> > > immediately after blk_cleanup_queue() has returned. Have you considered to
> > > modify blk_mq_run_hw_queues() such that it becomes safe to call that
> > > function while blk_cleanup_queue() is in progress, e.g. by inserting a
> > > percpu_ref_tryget_live(&q->q_usage_counter) /
> > > percpu_ref_put(&q->q_usage_counter) pair?
> > 
> > It can't work because blk_mq_run_hw_queues may happen after
> > percpu_ref_exit() is done.
> > 
> > However, if we move percpu_ref_exit() into queue's release handler, we
> > don't need to grab q->q_usage_counter any more in blk_mq_run_hw_queues(),
> > and we still have to free hw queue resources in queue's release handler,
> > that is exactly what this patchset is doing.
> > 
> > In short, getting q->q_usage_counter doesn't make a difference on this
> > issue.
> 
> percpu_ref_tryget_live() fails if a per-cpu counter is in the "dead" state.
> percpu_ref_kill() changes the state of a per-cpu counter to the "dead"
> state. blk_freeze_queue_start() calls percpu_ref_kill(). blk_cleanup_queue()
> already calls blk_set_queue_dying() and that last function calls
> blk_freeze_queue_start(). So I think that what you wrote is not correct and
> that inserting a percpu_ref_tryget_live()/percpu_ref_put() pair in
> blk_mq_run_hw_queues() or blk_mq_run_hw_queue() would make a difference and
> also that moving the percpu_ref_exit() call into blk_release_queue() makes
> sense.

This approach can easily cause a deadlock.

If percpu_ref_tryget_live() is called on entry to blk_mq_run_hw_queues()
and blk_freeze_queue_start() is called at the same time, then
percpu_ref_tryget_live() will fail and the queue run cannot proceed.
Requests that are still queued are then never dispatched, so
blk_mq_freeze_queue_wait() will wait forever.
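
To make the scenario concrete, here is a minimal single-threaded userspace
sketch. It is a toy model under stated assumptions, not kernel code: every
toy_* name is hypothetical, the usage counter is a plain long instead of a
percpu_ref, each queued request is assumed to hold one reference until it is
dispatched, and "waiting forever" is modeled as the freeze condition never
becoming true.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for q->q_usage_counter (hypothetical, not the kernel type). */
struct toy_ref {
	long count;	/* references held by queued requests */
	bool dead;	/* set once the toy "kill" has happened */
};

struct toy_queue {
	struct toy_ref usage;
	int pending;	/* queued requests, each holding one reference */
};

/* Models percpu_ref_tryget_live(): fails once the ref is dead. */
static bool toy_tryget_live(struct toy_ref *ref)
{
	if (ref->dead)
		return false;
	ref->count++;
	return true;
}

static void toy_put(struct toy_ref *ref)
{
	ref->count--;
}

/* Models request submission: the request keeps its reference while queued. */
static bool toy_submit(struct toy_queue *q)
{
	if (!toy_tryget_live(&q->usage))
		return false;
	q->pending++;
	return true;
}

/* Models blk_freeze_queue_start(): the ref goes to the "dead" state. */
static void toy_freeze_start(struct toy_queue *q)
{
	q->usage.dead = true;
}

/*
 * Models a queue run. With guard == true the run is wrapped in a
 * tryget_live/put pair and bails out when the ref is already dead.
 * Returns the number of requests dispatched.
 */
static int toy_run_queue(struct toy_queue *q, bool guard)
{
	int done = 0;

	if (guard && !toy_tryget_live(&q->usage))
		return 0;	/* queued requests stay queued */

	while (q->pending > 0) {
		q->pending--;
		toy_put(&q->usage);	/* dispatch drops the request's ref */
		done++;
	}

	if (guard)
		toy_put(&q->usage);
	return done;
}

/* Models the condition blk_mq_freeze_queue_wait() waits for. */
static bool toy_freeze_would_complete(const struct toy_queue *q)
{
	return q->usage.dead && q->usage.count == 0;
}

/* One queued request, then freeze start, then a queue run. */
static bool toy_scenario(bool guard)
{
	struct toy_queue q = { .usage = { .count = 0, .dead = false }, .pending = 0 };
	bool ok = toy_submit(&q);

	assert(ok);
	(void)ok;
	toy_freeze_start(&q);
	toy_run_queue(&q, guard);
	return toy_freeze_would_complete(&q);
}
```

In this model toy_scenario(true) leaves the request queued with its reference
still held, so the freeze condition is never met, while toy_scenario(false)
drains the queue and lets the freeze complete.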

Thanks,
Ming


