Re: [PATCH v5 3/3] blk-mq: Fix a race between iterating over requests and freeing requests

On Sun, Apr 4, 2021 at 5:28 PM Bart Van Assche <bvanassche@xxxxxxx> wrote:
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 116c3691b104..e7a6a114d216 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -209,7 +209,11 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
>
>         if (!reserved)
>                 bitnr += tags->nr_reserved_tags;
> -       rq = tags->rqs[bitnr];
> +       /*
> +        * Protected by rq->q->q_usage_counter. See also
> +        * blk_mq_queue_tag_busy_iter().
> +        */
> +       rq = rcu_dereference_check(tags->rqs[bitnr], true);

maybe I'm missing something, but if this tags struct has a shared
sbitmap, what guarantees do we have that while iterating we won't
touch requests from a queue that's tearing down? The check a few lines
below suggests that at the least we may touch requests from a
different queue.

say we enter blk_mq_queue_tag_busy_iter(): we're iterating with
hctx->q->q_usage_counter raised, and get to bt_iter()

say the tagset has 2 shared queues; hctx->q is q1, rq->q is q2
(thread 1)
rq = rcu_dereference_check(tags->rqs[bitnr], true)
(rq->q != hctx->q, but we don't know that yet)

(thread 2)
elsewhere, blk_cleanup_queue(q2) runs (or elevator_exit()). since we
only have raised q_usage_counter on q1, this runs to completion and
frees rq. on a preemptible kernel, thread 1 may be preempted before it
reads rq->q, so synchronize_rcu() passes happily by

(thread 1)
we check rq && rq->q == hctx->q: use-after-free, since rq was freed above
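
condensed, the unprotected window in the patched bt_iter() looks like
this (illustrative sketch only, names taken from the hunk above):

        rq = rcu_dereference_check(tags->rqs[bitnr], true);
        /*
         * <-- window: if rq->q != hctx->queue, nothing we hold pins rq.
         * thread 2 can run blk_cleanup_queue(q2) here and free rq; we
         * only hold q1's q_usage_counter, and we are not inside an RCU
         * read-side critical section, so synchronize_rcu() on the free
         * path does not wait for us.
         */
        if (rq && rq->q == hctx->queue) /* use-after-free read of rq->q */
                ...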

I think John Garry mentioned observing a similar race in patch 2 of
his series; perhaps his test case can verify this?

"Indeed, blk_mq_queue_tag_busy_iter() already does take a reference to its
queue usage counter when called, and the queue cannot be frozen to switch
IO scheduler until all refs are dropped. This ensures no stale references
to IO scheduler requests will be seen by blk_mq_queue_tag_busy_iter().

However, there is nothing to stop blk_mq_queue_tag_busy_iter() being
run for another queue associated with the same tagset, and it seeing
a stale IO scheduler request from the other queue after they are freed."

so, to my understanding, we have a race between reading
tags->rqs[bitnr] and verifying that rq->q == hctx->q, where if we get
scheduled out in between we can have a use-after-free. if that's the
case, perhaps we should hold rcu_read_lock() until we verify the
queues match, then release it and run fn(), since once we know rq
belongs to our queue, q_usage_counter keeps it alive and RCU
protection is no longer needed?
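
something like this rough sketch of bt_iter() is what I have in mind
(untested, details illustrative; struct layout is from the existing
bt_iter_data):

        static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr,
                            void *data)
        {
                struct bt_iter_data *iter_data = data;
                struct blk_mq_hw_ctx *hctx = iter_data->hctx;
                struct blk_mq_tags *tags = hctx->tags;
                bool reserved = iter_data->reserved;
                struct request *rq;

                if (!reserved)
                        bitnr += tags->nr_reserved_tags;

                /*
                 * Hold the RCU read lock across both the dereference
                 * and the queue check, so the grace period on the
                 * freeing side cannot complete in between.
                 */
                rcu_read_lock();
                rq = rcu_dereference(tags->rqs[bitnr]);
                if (!rq || rq->q != hctx->queue) {
                        rcu_read_unlock();
                        return true;
                }
                rcu_read_unlock();

                /*
                 * rq->q == hctx->queue, so the q_usage_counter
                 * reference we hold on this queue keeps rq alive; RCU
                 * is no longer needed.
                 */
                if (rq->mq_hctx == hctx)
                        return iter_data->fn(hctx, rq, iter_data->data,
                                             reserved);
                return true;
        }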

>
>         /*
>          * We can hit rq == NULL here, because the tagging functions
> @@ -254,11 +258,17 @@ struct bt_tags_iter_data {
>         unsigned int flags;
>  };
>

Thanks,
Khazhy
