Re: [PATCH 0/5] blk-mq: allow to run queue if queue refcount is held

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Mar 31, 2019 at 08:27:35AM -0700, Bart Van Assche wrote:
> On 3/30/19 8:09 PM, Ming Lei wrote:
> > Since 45a9c9d909b2 ("blk-mq: Fix a use-after-free"), run queue isn't
> > allowed during cleanup queue even though queue refcount is held.
> > 
> > This change has caused lots of kernel oops triggered in run queue path,
> > turns out it isn't easy to fix them all.
> > 
> > So move freeing of hw queue resources into queue's release handler, then
> > the above issue is fixed. Meantime, this way is safe given freeing hw
> > queue resource doesn't require to use tags.
> 
> I'm not sure the approach of this patch series is really the direction we
> should pursue. There are many block driver that free resources immediately

Please see scsi_run_queue(), and the queue refcount is always held
before run queue. That said this way has been working well for long time.
However, 45a9c9d909b2 ("blk-mq: Fix a use-after-free") breaks this behaviour,
and causes kernel oops.

> after blk_cleanup_queue() returns. An example from the brd driver:
> 
> static void brd_free(struct brd_device *brd)
> {
> 	put_disk(brd->brd_disk);
> 	blk_cleanup_queue(brd->brd_queue);
> 	brd_free_pages(brd);
> 	kfree(brd);
> }

Once blk_cleanup_queue() is returned, especially queue is frozen, there
isn't any request to be queued to driver, so it isn't a problem wrt.
freeing driver resource because we won't call into driver any more
under this situation.

This patch fixes the race between cleanup queue and run queue:

1) run queue may happen before the queue is frozen, however blk_mq_run_hw_queues()
may not be done when blk_freeze_queue() returns.

2) run queue may happen after blk_freeze_queue() returns

> 
> I'd like to avoid having to modify all block drivers that free resources
> immediately after blk_cleanup_queue() has returned. Have you considered to
> modify blk_mq_run_hw_queues() such that it becomes safe to call that
> function while blk_cleanup_queue() is in progress, e.g. by inserting a
> percpu_ref_tryget_live(&q->q_usage_counter) /
> percpu_ref_put(&q->q_usage_counter) pair?

It can't work because blk_mq_run_hw_queues may happen after
percpu_ref_exit() is done.

However, if we move percpu_ref_exit() into queue's release handler, we
don't need to grab q->q_usage_counter any more in blk_mq_run_hw_queues(),
and we still have to free hw queue resources in queue's release handler,
that is exactly what this patchset is doing.

In short, getting q->q_usage_counter doesn't make a difference on this
issue.

Thanks,
Ming



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux