On Sun, Nov 05, 2017 at 08:10:08PM +0800, Ming Lei wrote:
> blk-mq never respects queue dead, and this may cause use-after-free on
> any kind of queue resources. This patch respects the rule by calling
> blk_mq_quiesce_queue() when queue is marked as DEAD.
>
> This patch fixes the following kernel crash:
>
> [ 42.170824] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 42.172011] IP: blk_mq_flush_busy_ctxs+0x5a/0xe0
> [ 42.172011] PGD 25d8d1067 P4D 25d8d1067 PUD 25d458067 PMD 0
> [ 42.172011] Oops: 0000 [#1] PREEMPT SMP
> [ 42.172011] Dumping ftrace buffer:
> [ 42.172011] (ftrace buffer empty)
> [ 42.172011] Modules linked in: scsi_debug ebtable_filter ebtables ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c bridge stp llc fuse iptable_filter ip_tables sd_mod sg mptsas mptscsih mptbase crc32c_intel scsi_transport_sas nvme lpc_ich serio_raw ahci virtio_scsi libahci libata nvme_core binfmt_misc dm_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi null_blk configs
> [ 42.172011] CPU: 0 PID: 2750 Comm: fio Not tainted 4.14.0-rc7.blk_mq_io_hang+ #507
> [ 42.172011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-1.fc25 04/01/2014
> [ 42.172011] task: ffff88025dc18000 task.stack: ffffc900028c4000
> [ 42.172011] RIP: 0010:blk_mq_flush_busy_ctxs+0x5a/0xe0
> [ 42.172011] RSP: 0018:ffffc900028c7be0 EFLAGS: 00010246
> [ 42.172011] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8802775907c8
> [ 42.172011] RDX: 0000000000000000 RSI: ffffc900028c7c50 RDI: ffff88025de25c00
> [ 42.172011] RBP: ffffc900028c7c28 R08: 0000000000000000 R09: ffff8802595692c0
> [ 42.172011] R10: ffffc900028c7e50 R11: 0000000000000008 R12: ffffc900028c7c50
> [ 42.172011] R13: ffff88025de25cd8 R14: ffffc900028c7d78 R15: ffff88025de25c00
> [ 42.172011] FS: 00007faa8653f7c0(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000
> [ 42.172011] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 42.172011] CR2: 0000000000000000 CR3: 00000002593df006 CR4: 00000000003606f0
> [ 42.172011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 42.172011] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 42.172011] Call Trace:
> [ 42.172011] blk_mq_sched_dispatch_requests+0x1b4/0x1f0
> [ 42.172011] ? preempt_schedule+0x27/0x30
> [ 42.172011] __blk_mq_run_hw_queue+0x8b/0xa0
> [ 42.172011] __blk_mq_delay_run_hw_queue+0xb7/0x100
> [ 42.172011] blk_mq_run_hw_queue+0x14/0x20
> [ 42.172011] blk_mq_sched_insert_requests+0xda/0x120
> [ 42.172011] blk_mq_flush_plug_list+0x179/0x280
> [ 42.172011] blk_flush_plug_list+0x102/0x290
> [ 42.172011] blk_finish_plug+0x2c/0x40
> [ 42.172011] do_io_submit+0x3fa/0x780
> [ 42.172011] SyS_io_submit+0x10/0x20
> [ 42.172011] ? SyS_io_submit+0x10/0x20
> [ 42.172011] entry_SYSCALL_64_fastpath+0x1a/0xa5
> [ 42.172011] RIP: 0033:0x7faa80ff8697
> [ 42.172011] RSP: 002b:00007ffe08236de8 EFLAGS: 00000206 ORIG_RAX: 00000000000000d1
> [ 42.172011] RAX: ffffffffffffffda RBX: 0000000001f1e600 RCX: 00007faa80ff8697
> [ 42.172011] RDX: 000000000208e638 RSI: 0000000000000001 RDI: 00007faa6ecae000
> [ 42.172011] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000001f1e0e0
> [ 42.172011] R10: 0000000000000001 R11: 0000000000000206 R12: 00007faa614f19e0
> [ 42.172011] R13: 00007faa614ff0b0 R14: 00007faa614fef50 R15: 0000000100000000
> [ 42.172011] Code: e0 00 00 00 c7 45 bc 00 00 00 00 85 c0 74 76 4c 8d af d8 00 00 00 49 89 ff 44 8b 45 bc 4c 89 c0 49 c1 e0 06 4d 03 87 e8 00 00 00 <49> 83 38 00 4d 89 c6 74 41 41 8b 8f dc 00 00 00 41 89 c4 31 db
> [ 42.172011] RIP: blk_mq_flush_busy_ctxs+0x5a/0xe0 RSP: ffffc900028c7be0
> [ 42.172011] CR2: 0000000000000000
> [ 42.172011] ---[ end trace 2432dfddf9b84061 ]---
> [ 42.172011] Kernel panic - not syncing: Fatal exception
> [ 42.172011] Dumping ftrace buffer:
> [ 42.172011] (ftrace buffer empty)
> [ 42.172011] Kernel Offset: disabled
> [ 42.172011] ---[ end Kernel panic - not syncing: Fatal exception
>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> ---
>  block/blk-core.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 048be4aa6024..0b121f29e3b1 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -658,6 +658,10 @@ void blk_cleanup_queue(struct request_queue *q)
>  	queue_flag_set(QUEUE_FLAG_DEAD, q);
>  	spin_unlock_irq(lock);
>
> +	/* respect queue DEAD via quiesce for blk-mq */
> +	if (q->mq_ops)
> +		blk_mq_quiesce_queue(q);
> +
>  	/* for synchronous bio-based driver finish in-flight integrity i/o */
>  	blk_flush_integrity();

Hi Jens,

Could you consider this patch? The issue is easy to reproduce on SCSI
by removing the device while it is under heavy I/O.

--
Ming