On Mon, Sep 14, 2020 at 10:14 PM Pradeep P V K <ppvk@xxxxxxxxxxxxxx> wrote:
>
> Observed the below crash while accessing the (use-after-free) lock member
> of bfq data.
>
> context#1                          context#2
>                                    process_one_work()
> kthread()                          blk_mq_run_work_fn()
> worker_thread()                    ->__blk_mq_run_hw_queue()
> process_one_work()                 ->blk_mq_sched_dispatch_requests()
> __blk_release_queue()              ->blk_mq_do_dispatch_sched()
> ->__elevator_exit()
>  ->blk_mq_exit_sched()
>   ->exit_sched()
>    ->kfree()
>                                    ->bfq_dispatch_request()
>                                     ->spin_unlock_irq(&bfqd->lock)
>
> This is because the kblockd delayed work might get scheduled around
> blk_release_queue() and then access an already freed member of
> bfq_data.
>
> 240.212359: <2> Unable to handle kernel paging request at
> virtual address ffffffee2e33ad70
> ...
> 240.212637: <2> Workqueue: kblockd blk_mq_run_work_fn
> 240.212649: <2> pstate: 00c00085 (nzcv daIf +PAN +UAO)
> 240.212666: <2> pc : queued_spin_lock_slowpath+0x10c/0x2e0
> 240.212677: <2> lr : queued_spin_lock_slowpath+0x84/0x2e0
> ...
> Call trace:
> 240.212865: <2> queued_spin_lock_slowpath+0x10c/0x2e0
> 240.212876: <2> do_raw_spin_lock+0xf0/0xf4
> 240.212890: <2> _raw_spin_lock_irq+0x74/0x94
> 240.212906: <2> bfq_dispatch_request+0x4c/0xd60
> 240.212918: <2> blk_mq_do_dispatch_sched+0xe0/0x1f0
> 240.212927: <2> blk_mq_sched_dispatch_requests+0x130/0x194
> 240.212940: <2> __blk_mq_run_hw_queue+0x100/0x158
> 240.212950: <2> blk_mq_run_work_fn+0x1c/0x28
> 240.212963: <2> process_one_work+0x280/0x460
> 240.212973: <2> worker_thread+0x27c/0x4dc
> 240.212986: <2> kthread+0x160/0x170
>
> Fix this by cancelling the delayed work, if any, before the elevator exits.
>
> Signed-off-by: Pradeep P V K <ppvk@xxxxxxxxxxxxxx>
> ---
>  block/blk-sysfs.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 81722cd..e4a9aac 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -779,6 +779,8 @@ static void blk_release_queue(struct kobject *kobj)
>  {
>  	struct request_queue *q =
>  		container_of(kobj, struct request_queue, kobj);
> +	struct blk_mq_hw_ctx *hctx;
> +	int i;

Please move the above two lines into the 'if (queue_is_mq(q))' branch.

>
>  	might_sleep();
>
> @@ -788,9 +790,11 @@ static void blk_release_queue(struct kobject *kobj)
>
>  	blk_free_queue_stats(q->stats);
>
> -	if (queue_is_mq(q))
> +	if (queue_is_mq(q)) {
>  		cancel_delayed_work_sync(&q->requeue_work);
> -
> +		queue_for_each_hw_ctx(q, hctx, i)
> +			cancel_delayed_work_sync(&hctx->run_work);
> +	}

Now the cancel_delayed_work_sync() in blk_mq_hw_sysfs_release() can be
removed as well.

We have to call cancel_delayed_work_sync() when the request queue's
refcount drops to zero, otherwise a new run-queue work may still be
started somewhere. So feel free to add the reviewed-by once the above
two comments are addressed:

Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>

Thanks,
Ming Lei
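
For reference, a rough sketch of how blk_release_queue() might look once both
review comments are folded in. This is based only on the hunks quoted above;
the surrounding teardown code is elided and the exact layout of
block/blk-sysfs.c in the targeted kernel may differ:

static void blk_release_queue(struct kobject *kobj)
{
	struct request_queue *q =
		container_of(kobj, struct request_queue, kobj);

	might_sleep();

	/* ... unchanged teardown code ... */

	blk_free_queue_stats(q->stats);

	if (queue_is_mq(q)) {
		/* declarations now local to the mq-only branch */
		struct blk_mq_hw_ctx *hctx;
		int i;

		cancel_delayed_work_sync(&q->requeue_work);

		/*
		 * Make sure no run_work is still pending or running for any
		 * hw queue before the elevator (and e.g. bfqd) is freed.
		 */
		queue_for_each_hw_ctx(q, hctx, i)
			cancel_delayed_work_sync(&hctx->run_work);
	}

	/* ... rest of blk_release_queue() ... */
}

With the cancellation done here, where the queue's refcount has already
dropped to zero and no new run-queue work can be scheduled afterwards, the
matching cancel_delayed_work_sync() in blk_mq_hw_sysfs_release() becomes
redundant and could be dropped in the same patch.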