Re: [RFC PATCH 1/2] blk-mq: test tags bitmap before get request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/1/21 4:49 AM, Bart Van Assche wrote:
On 2/28/21 6:14 PM, Yufen Yu wrote:
For now, we set hctx->tags->rqs[i] when get driver tag successfully.
The request either comes from sched_tags->static_rqs[] with scheduler,
or comes from tags->static_rqs[] with no scheduler. But, the value won't
be clear when put driver tag. Thus, tags->rqs[i] still remain old request.

We can free these sched_tags->static_rqs[] requests when switch elevator,
update nr_requests or update nr_hw_queues. After that, unexpected access
of tags->rqs[i] may cause use-after-free crash.

For example, we reported use-after-free of request in nbd device
by syzkaller:

BUG: KASAN: use-after-free in blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
Read of size 4 at addr ffff80036b77f9d4 by task kworker/u9:0/10086
Call trace:
  dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
  show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x144/0x1b4 lib/dump_stack.c:118
  print_address_description+0x68/0x2d0 mm/kasan/report.c:253
  kasan_report_error mm/kasan/report.c:351 [inline]
  kasan_report+0x134/0x2f0 mm/kasan/report.c:409
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
  __read_once_size include/linux/compiler.h:193 [inline]
  blk_mq_rq_state block/blk-mq.h:106 [inline]
  blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
  nbd_read_stat drivers/block/nbd.c:670 [inline]
  recv_work+0x1bc/0x890 drivers/block/nbd.c:749
  process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2156
  worker_thread+0x80/0x9d0 kernel/workqueue.c:2311
  kthread+0x1d8/0x1e0 kernel/kthread.c:255
  ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174

The syzkaller test program sended a reply package to client
without client sending request. After receiving the package,
recv_work() try to get the remained request in tags->rqs[]
by tag, which have been free.

To avoid this type of problem, we may need to ensure the request
valid when get it by tag.

Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
---
  block/blk-mq.c | 10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index d4d7c1caa439..5362a7958b74 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -836,9 +836,17 @@ void blk_mq_delay_kick_requeue_list(struct request_queue *q,
  }
  EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
+static int blk_mq_test_tag_bit(struct blk_mq_tags *tags, unsigned int tag)
+{
+	if (!blk_mq_tag_is_reserved(tags, tag))
+		return sbitmap_test_bit(&tags->bitmap_tags->sb, tag);
+	else
+		return sbitmap_test_bit(&tags->breserved_tags->sb, tag);
+}
+
  struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
  {
-	if (tag < tags->nr_tags) {
+	if (tag < tags->nr_tags && blk_mq_test_tag_bit(tags, tag)) {
  		prefetch(tags->rqs[tag]);
  		return tags->rqs[tag];
  	}

Please do not slow down the hot path by inserting additional code in the
hot path. I am convinced that the race described in the patch
description can be fixed without changing the hot path. See also the
conversation I had recently with John Garry on linux-block.

Seems to be cropping up everywhere now; anyway, I do agree with Bart here.
For the hot path (typically when looking up the associated command from within the interrupt routine) we really should not add any further code to not slow down processing. Additionally, this is typically a firmware response so we can be reasonably certain that this is a response to valid command, so in nearly all cases the bit will be set.
(Pathological cases like spoofed response frames aside).

However, there another use case where blk_mq_tag_to_rq() is used, and that is for traversing outstanding commands eg during a device reset.
There we _have_ to ensure that the request is valid lest we run into
uninitialized values.

So I would advocate to have a slow path variant here which would validate the bitmap before trying to access the request.

Or, really, converting those drivers to use blk_mq_tagset_busy_iter().

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux