On Mon, Oct 31, 2022 at 04:42:11PM -0600, Jens Axboe wrote:
> On 10/31/22 4:12 PM, Al Viro wrote:
> > static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
> > {
> >         struct request *last = rq_list_peek(&plug->mq_list);
> >
> > Suppose it's not NULL...
> >
> >         if (!plug->rq_count) {
> >                 trace_block_plug(rq->q);
> >         } else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
> >                    (!blk_queue_nomerges(rq->q) &&
> >                     blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
> >
> > ... and we went here:
> >
> >                 blk_mq_flush_plug_list(plug, false);
> >
> > All requests, including the one last points to, might get fed ->queue_rq()
> > here.  At which point there seems to be nothing to prevent them getting
> > completed and freed on another CPU, possibly before we return here.
> >
> >                 trace_block_plug(rq->q);
> >         }
> >
> >         if (!plug->multiple_queues && last && last->q != rq->q)
> >
> > ... and here we dereference last.
> >
> > Shouldn't we reset last to NULL after the call of blk_mq_flush_plug_list()
> > above?
>
> There's no UAF here as the requests aren't freed. We could clear 'last'
> to make the code more explicit, and that would avoid any potential
> suboptimal behavior with ->multiple_queues being wrong.

Umm...  Suppose ->has_elevator is false and so is ->multiple_queues.
There's no ->queue_rqs(), so blk_mq_flush_plug_list() grabs rcu_read_lock()
and hits blk_mq_plug_issue_direct().  blk_mq_plug_issue_direct() picks the
first request off the list and passes it to blk_mq_request_issue_directly(),
which passes it to __blk_mq_request_issue_directly().  There we grab a tag
and proceed to __blk_mq_issue_directly(), which feeds the request to
->queue_rq().

What's to stop e.g. a worker on another CPU from picking that sucker up,
completing it and calling blk_mq_end_request(), which completes all bios
involved and calls blk_mq_free_request()?  If all of that manages to happen
before blk_mq_flush_plug_list() returns to its caller...

Sure, you probably won't hit it on bare metal, but if you are in a KVM
guest and this virtual CPU happens to lose the host timeslice...  I've seen
considerably narrower race windows getting hit on such setups.

Am I missing something subtle here?  It's been a long time since I've read
through that area - as a matter of fact, I'm trying to refresh my memory
of the submit_bio()-related code paths at the moment...
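
FWIW, to make the suggestion concrete: what I have in mind is no more than
the following (entirely untested sketch against the blk_add_rq_to_plug()
code quoted above; only the 'last = NULL' line is new, the rest is the
existing context):

        } else if (plug->rq_count >= blk_plug_max_rq_count(plug) ||
                   (!blk_queue_nomerges(rq->q) &&
                    blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) {
                blk_mq_flush_plug_list(plug, false);
                /* everything on the plug list, *last included, may be gone */
                last = NULL;
                trace_block_plug(rq->q);
        }

With that, the later "!plug->multiple_queues && last && last->q != rq->q"
check simply skips the comparison after a flush, which matches what the
(now empty) plug list actually contains.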