On 2020-09-15 08:42, Ming Lei wrote:
On Mon, Sep 14, 2020 at 07:42:39PM +0530, Pradeep P V K wrote:
Observed the below crash (use-after-free) while accessing the lock member
of bfq data.
context#1                             context#2

kthread()                             process_one_work()
 worker_thread()                       blk_mq_run_work_fn()
  process_one_work()                    ->__blk_mq_run_hw_queue()
   __blk_release_queue()                 ->blk_mq_sched_dispatch_requests()
    ->__elevator_exit()                   ->blk_mq_do_dispatch_sched()
     ->blk_mq_exit_sched()
      ->exit_sched()
       ->kfree()
                                           ->bfq_dispatch_request()
                                            ->spin_unlock_irq(&bfqd->lock)
This happens because a kblockd delayed work item may still get scheduled
around blk_release_queue() and then access members of bfq_data after they
have been freed.
240.212359: <2> Unable to handle kernel paging request at
virtual address ffffffee2e33ad70
...
240.212637: <2> Workqueue: kblockd blk_mq_run_work_fn
240.212649: <2> pstate: 00c00085 (nzcv daIf +PAN +UAO)
240.212666: <2> pc : queued_spin_lock_slowpath+0x10c/0x2e0
240.212677: <2> lr : queued_spin_lock_slowpath+0x84/0x2e0
...
Call trace:
240.212865: <2> queued_spin_lock_slowpath+0x10c/0x2e0
240.212876: <2> do_raw_spin_lock+0xf0/0xf4
240.212890: <2> _raw_spin_lock_irq+0x74/0x94
240.212906: <2> bfq_dispatch_request+0x4c/0xd60
240.212918: <2> blk_mq_do_dispatch_sched+0xe0/0x1f0
240.212927: <2> blk_mq_sched_dispatch_requests+0x130/0x194
240.212940: <2> __blk_mq_run_hw_queue+0x100/0x158
240.212950: <2> blk_mq_run_work_fn+0x1c/0x28
240.212963: <2> process_one_work+0x280/0x460
240.212973: <2> worker_thread+0x27c/0x4dc
240.212986: <2> kthread+0x160/0x170
Fix this by cancelling any pending delayed work before the elevator exits.
Signed-off-by: Pradeep P V K <ppvk@xxxxxxxxxxxxxx>
---
block/blk-sysfs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 81722cd..e4a9aac 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -779,6 +779,8 @@ static void blk_release_queue(struct kobject *kobj)
 {
 	struct request_queue *q =
 			container_of(kobj, struct request_queue, kobj);
+	struct blk_mq_hw_ctx *hctx;
+	int i;
 
 	might_sleep();
@@ -788,9 +790,11 @@ static void blk_release_queue(struct kobject *kobj)
 
 	blk_free_queue_stats(q->stats);
 
-	if (queue_is_mq(q))
+	if (queue_is_mq(q)) {
 		cancel_delayed_work_sync(&q->requeue_work);
-
+		queue_for_each_hw_ctx(q, hctx, i)
+			cancel_delayed_work_sync(&hctx->run_work);
+	}
A hw queue may also be run synchronously, such as from the flush-plug
context. However, we have grabbed one usage ref for that case.
So this approach looks fine, but I am just wondering why the above change
is not put into blk_sync_queue() or blk_cleanup_queue() directly?
Thanks Ming for the review and comments.
I added it in blk_release_queue() because I could see a similar delayed
work, "requeue_work", being cancelled there, so I put it in the same place.
blk_release_queue() gets called from blk_cleanup_queue(), so we can add it
there directly. I will make this change in my next patch set.
Thanks,
Ming
Thanks and Regards,
Pradeep