On Thu, Feb 24, 2022 at 09:29:09AM +0800, yukuai (C) wrote:
> On 2022/02/23 22:30, Ming Lei wrote:
> > On Wed, Feb 23, 2022 at 07:26:01PM +0800, Yu Kuai wrote:
> > > blk_mq_realloc_hw_ctxs() will free the 'queue_hw_ctx' (e.g. update
> > > submit_queues through configfs for null_blk), while it might still be
> > > used from another context (e.g. switching the elevator to none):
> > >
> > > t1                                    t2
> > > elevator_switch
> > >  blk_mq_unquiesce_queue
> > >   blk_mq_run_hw_queues
> > >    queue_for_each_hw_ctx
> > >    // assembly code for hctx = (q)->queue_hw_ctx[i]
> > >    mov 0x48(%rbp),%rdx -> read old queue_hw_ctx
> > >
> > >                                       __blk_mq_update_nr_hw_queues
> > >                                        blk_mq_realloc_hw_ctxs
> > >                                         hctxs = q->queue_hw_ctx
> > >                                         q->queue_hw_ctx = new_hctxs
> > >                                         kfree(hctxs)
> > >
> > >    movslq %ebx,%rax
> > >    mov (%rdx,%rax,8),%rdi -> uaf
> > >
> >
> > This is not only a uaf on queue_hw_ctx; other structures have similar
> > issues, and I think the correct and easy fix is to quiesce the request
> > queue while updating nr_hw_queues, something like the following patch:
> >
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index a05ce7725031..d8e7c3cce0dd 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -4467,8 +4467,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> >  	if (set->nr_maps == 1 && nr_hw_queues == set->nr_hw_queues)
> >  		return;
> >  
> > -	list_for_each_entry(q, &set->tag_list, tag_set_list)
> > +	list_for_each_entry(q, &set->tag_list, tag_set_list) {
> >  		blk_mq_freeze_queue(q);
> > +		blk_mq_quiesce_queue(q);
> > +	}
> >  	/*
> >  	 * Switch IO scheduler to 'none', cleaning up the data associated
> >  	 * with the previous scheduler. We will switch back once we are done
> > @@ -4518,8 +4520,10 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> >  	list_for_each_entry(q, &set->tag_list, tag_set_list)
> >  		blk_mq_elv_switch_back(&head, q);
> >  
> > -	list_for_each_entry(q, &set->tag_list, tag_set_list)
> > +	list_for_each_entry(q, &set->tag_list, tag_set_list) {
> > +		blk_mq_unquiesce_queue(q);
> >  		blk_mq_unfreeze_queue(q);
> > +	}
> >  }
> >  
> >  void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
>
> Hi, Ming
>
> If blk_mq_quiesce_queue() is called from __blk_mq_update_nr_hw_queues()
> first, then switching the elevator to none won't trigger the problem.
> However, if blk_mq_unquiesce_queue() from the elevator switch
> decreases quiesce_depth to 0 first, and blk_mq_quiesce_queue() is only
> called from __blk_mq_update_nr_hw_queues() after that, it seems to me
> such concurrent scenarios still exist.

No, the scenario won't exist: once blk_mq_quiesce_queue() returns, it is
guaranteed that

- any in-progress run queue is drained
- no new run queue can be started

Thanks,
Ming
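
To make the guarantee above concrete, here is a minimal user-space sketch.
All names (my_queue, my_hctx, my_run_queues, my_quiesce, my_realloc_hctxs)
are hypothetical, and an atomic in-flight counter stands in for the RCU/SRCU
synchronization that blk_mq_quiesce_queue() actually relies on; the sketch
only illustrates the property Ming describes: once the quiesce step returns,
in-progress readers have drained and new readers bail out before touching
the hctx array, so swapping and freeing the old array cannot race as in the
reported uaf.

/* build: cc -std=c11 quiesce_sketch.c */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct my_hctx { int id; };

struct my_queue {
	atomic_bool quiesced;		/* writer sets this before draining */
	atomic_int in_flight;		/* readers currently walking hctxs */
	struct my_hctx **hctxs;		/* analogous to q->queue_hw_ctx */
	int nr_hctx;
};

/* Reader side: analogous to blk_mq_run_hw_queues() walking the hctx array. */
static void my_run_queues(struct my_queue *q)
{
	atomic_fetch_add(&q->in_flight, 1);
	if (!atomic_load(&q->quiesced)) {
		for (int i = 0; i < q->nr_hctx; i++)
			printf("run hctx %d\n", q->hctxs[i]->id);
	}
	atomic_fetch_sub(&q->in_flight, 1);
}

/*
 * Writer side: analogous to blk_mq_quiesce_queue(). After this returns,
 * every in-progress my_run_queues() has finished and any new call sees
 * 'quiesced' and returns before dereferencing the array.
 */
static void my_quiesce(struct my_queue *q)
{
	atomic_store(&q->quiesced, true);
	while (atomic_load(&q->in_flight) > 0)
		;	/* drain in-progress readers */
}

/* Only after my_quiesce() is it safe to swap and free the old array. */
static void my_realloc_hctxs(struct my_queue *q, struct my_hctx **new_hctxs,
			     int nr)
{
	struct my_hctx **old = q->hctxs;

	my_quiesce(q);
	q->hctxs = new_hctxs;
	q->nr_hctx = nr;
	free(old);			/* no reader can still hold 'old' */
	atomic_store(&q->quiesced, false);	/* "unquiesce" */
}

int main(void)
{
	struct my_hctx a = { .id = 0 }, b = { .id = 1 };
	struct my_hctx **v = malloc(2 * sizeof(*v));
	struct my_hctx **w = malloc(1 * sizeof(*w));
	struct my_queue q = { .hctxs = v, .nr_hctx = 2 };

	v[0] = &a;
	v[1] = &b;
	w[0] = &a;

	my_run_queues(&q);		/* walks the original array */
	my_realloc_hctxs(&q, w, 1);	/* old array freed only after drain */
	my_run_queues(&q);		/* walks the new array */
	free(q.hctxs);
	return 0;
}

The spin-wait is purely for illustration; blk-mq implements the drain with
(s)rcu synchronization so that the reader side stays lockless, but the
ordering argument is the same: the "quiesced" state is published before the
wait, so a reader either finishes before the free or never reaches the
array at all.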