Hi Ming, There seems to be a sleeping while atomic bug in hd_struct_free(): 288 static void hd_struct_free(struct percpu_ref *ref) 289 { 290 struct hd_struct *part = container_of(ref, struct hd_struct, ref); 291 struct gendisk *disk = part_to_disk(part); 292 struct disk_part_tbl *ptbl = 293 rcu_dereference_protected(disk->part_tbl, 1); 294 295 rcu_assign_pointer(ptbl->last_lookup, NULL); 296 put_device(disk_to_dev(disk)); 297 298 INIT_RCU_WORK(&part->rcu_work, hd_struct_free_work); 299 queue_rcu_work(system_wq, &part->rcu_work); 300 } hd_struct_free() is a percpu_ref release callback and must not sleep. But put_device() can end up in disk_release(), resulting in anything from "sleeping function called from invalid context" splats to actual lockups if the queue ends up being released: BUG: scheduling while atomic: ksoftirqd/3/26/0x00000102 INFO: lockdep is turned off. CPU: 3 PID: 26 Comm: ksoftirqd/3 Tainted: G W 5.9.0-rc2-ceph-g2de49bea2ebc #1 Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015 Call Trace: dump_stack+0x96/0xd0 __schedule_bug.cold+0x64/0x71 __schedule+0x8ea/0xac0 ? wait_for_completion+0x86/0x110 schedule+0x5f/0xd0 schedule_timeout+0x212/0x2a0 ? wait_for_completion+0x86/0x110 wait_for_completion+0xb0/0x110 __wait_rcu_gp+0x139/0x150 synchronize_rcu+0x79/0xf0 ? invoke_rcu_core+0xb0/0xb0 ? rcu_read_lock_any_held+0xb0/0xb0 blk_free_flush_queue+0x17/0x30 blk_mq_hw_sysfs_release+0x32/0x70 kobject_put+0x7d/0x1d0 blk_mq_release+0xbe/0xf0 blk_release_queue+0xb7/0x140 kobject_put+0x7d/0x1d0 disk_release+0xb0/0xc0 device_release+0x25/0x80 kobject_put+0x7d/0x1d0 hd_struct_free+0x4c/0xc0 percpu_ref_switch_to_atomic_rcu+0x1df/0x1f0 rcu_core+0x3fd/0x660 ? rcu_core+0x3cc/0x660 __do_softirq+0xd5/0x45e ? smpboot_thread_fn+0x26/0x1d0 run_ksoftirqd+0x30/0x60 smpboot_thread_fn+0xfe/0x1d0 ? sort_range+0x20/0x20 kthread+0x11a/0x150 ? kthread_delayed_work_timer_fn+0xa0/0xa0 ret_from_fork+0x1f/0x30 "git blame" points at your commit tb7d6c3033323 ("block: fix use-after-free on cached last_lookup partition"), but there is likely more to it because it went into 5.8 and I haven't seen these lockups until we rebased to 5.9-rc. Could you please take a look? Thanks, Ilya