Hi guys,

I encountered a hang issue on an s390x system. The tested kernel is not
preemptible and is booted with "nr_cpus=1".

The test steps:

  umount /home
  lvremove /dev/rhel_s390x-kvm-011/home
  ## uncomment "snapshot_autoextend_threshold = 70" and "snapshot_autoextend_percent = 20" in /etc/lvm/lvm.conf
  systemctl enable lvm2-monitor.service
  systemctl start lvm2-monitor.service

  lvremove -y rhel_s390x-kvm-011/thinp
  lvcreate -L 10M -T rhel_s390x-kvm-011/thinp
  lvcreate -V 400M -T rhel_s390x-kvm-011/thinp -n src
  mkfs.ext4 /dev/rhel_s390x-kvm-011/src
  mount /dev/rhel_s390x-kvm-011/src /mnt
  for((i=0;i<4;i++)); do dd if=/dev/zero of=/mnt/test$i.img bs=100M count=1; done

And the system hangs, with the console log shown in [1].

The related kernel config:

  CONFIG_PREEMPT_NONE_BUILD=y
  CONFIG_PREEMPT_NONE=y
  CONFIG_PREEMPT_COUNT=y
  CONFIG_SCHED_CORE=y

It turns out that when the system hangs, the kernel is stuck in the
following dead loop in dm_wq_work():

	while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) {
		spin_lock_irq(&md->deferred_lock);
		bio = bio_list_pop(&md->deferred);
		spin_unlock_irq(&md->deferred_lock);

		if (!bio)
			break;
		thread_cpu = smp_processor_id();
		submit_bio_noacct(bio);
	}

where dm_wq_work()->__submit_bio_noacct()->...->dm_handle_requeue() keeps
generating new bios, so the condition "if (!bio)" can never be met.

After applying the following patch, the issue is gone:

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e1ea3a7bd9d9..95c9cb07a42f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2567,6 +2567,7 @@ static void dm_wq_work(struct work_struct *work)
 			break;
 
 		submit_bio_noacct(bio);
+		cond_resched();
 	}
 }

But I do not think this is a proper solution. Also, without this patch, if
"nr_cpus=1" is removed (the system has two CPUs), the issue cannot be
triggered. That is, with more than one CPU the above loop can exit via the
condition "if (!bio)".
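To illustrate what I think is going on, below is a minimal userspace toy
model (not kernel code; the names someone_else_runs(), bio_completes() and
YIELD_EACH_ITERATION are invented for this sketch, and the "extend the
pool" step is only my assumption about what would normally break the
cycle). It models one cooperative CPU where a dm_wq_work()-style loop keeps
requeueing bios while the pool is out of data space, and whatever would get
the pool out of that state can only run when the loop yields:

/*
 * Toy userspace model of the suspected livelock (NOT kernel code).
 *
 * One cooperative "CPU": nothing else runs until the worker yields.
 * The worker pops bios from a deferred list; while the thin pool is in
 * out-of-data-space (queue IO) mode every bio is requeued, so the list
 * never drains.  Whatever would resolve the out-of-space condition only
 * runs at a yield point -- the role cond_resched() plays in the patch
 * above, or that the second CPU plays when "nr_cpus=1" is dropped.
 */
#include <stdbool.h>
#include <stdio.h>

#define YIELD_EACH_ITERATION 1      /* set to 0 to reproduce the hang     */

static int deferred = 4;            /* bios on md->deferred (toy count)   */
static bool out_of_space = true;    /* pool in out-of-data-space mode     */

/* Stand-in for whoever resolves the out-of-space condition. */
static void someone_else_runs(void)
{
	if (out_of_space) {
		printf("other work: pool extended, leaving out-of-data-space mode\n");
		out_of_space = false;
	}
}

/* Stand-in for submit_bio_noacct(): the bio is requeued while out of space. */
static bool bio_completes(void)
{
	return !out_of_space;
}

int main(void)
{
	unsigned long spins = 0;

	/* Stand-in for the dm_wq_work() pop loop. */
	while (deferred > 0) {
		if (bio_completes())
			deferred--;             /* bio done, list shrinks      */
		/* else: requeued, the deferred count stays the same */

		if (++spins > 1000000) {        /* cap so the demo terminates  */
			printf("worker: still spinning, nothing else ever ran\n");
			return 1;               /* the observed hang           */
		}

		if (YIELD_EACH_ITERATION)
			someone_else_runs();    /* what cond_resched() allows  */
	}

	printf("worker: deferred list drained after %lu iterations\n", spins);
	return 0;
}

With YIELD_EACH_ITERATION set to 0 the worker spins until the artificial
cap, which is how I read the single-CPU, non-preemptible case; setting it
to 1 corresponds to the cond_resched() patch, and a second CPU gives the
same effect without the patch.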
Any ideas?

Thanks,

Pingfan

[1]: the console log when hanging

[ 2062.321473] device-mapper: thin: 253:4: reached low water mark for data device: sending event.
[ 2062.353217] dm-3: detected capacity change from 122880 to 147456
[-- MARK -- Wed Dec 14 08:45:00 2022]
[ 2062.376690] device-mapper: thin: 253:4: switching pool to out-of-data-space (queue IO) mode
[ 2122.393998] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2122.394011] (detected by 0, t=6002 jiffies, g=36205, q=624 ncpus=1)
[ 2122.394014] rcu: All QSes seen, last rcu_sched kthread activity 6002 (4295149593-4295143591), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 2122.394017] rcu: rcu_sched kthread starved for 6002 jiffies! g36205 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 2122.394019] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 2122.394020] rcu: RCU grace-period kthread stack dump:
[ 2122.394022] task:rcu_sched state:R running task stack: 0 pid: 15 ppid: 2 flags:0x00000000
[ 2122.394027] Call Trace:
[ 2122.394030]  [<00000001001fd5b0>] __schedule+0x300/0x6c0
[ 2122.394040]  [<00000001001fd9d2>] schedule+0x62/0xf0
[ 2122.394042]  [<000000010020360c>] schedule_timeout+0x8c/0x170
[ 2122.394045]  [<00000000ff9e219e>] rcu_gp_fqs_loop+0x30e/0x3d0
[ 2122.394053]  [<00000000ff9e3ee2>] rcu_gp_kthread+0x132/0x1b0
[ 2122.394054]  [<00000000ff980578>] kthread+0x108/0x110
[ 2122.394058]  [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2122.394061]  [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2122.394064] rcu: Stack dump where RCU GP kthread last ran:
[ 2122.394064] Task dump for CPU 0:
[ 2122.394066] task:kworker/0:2 state:R running task stack: 0 pid:16943 ppid: 2 flags:0x00000004
[ 2122.394070] Workqueue: kdmflush/253:6 dm_wq_work [dm_mod]
[ 2122.394100] Call Trace:
[ 2122.394100]  [<00000000ff98bb98>] sched_show_task.part.0+0xf8/0x130
[ 2122.394106]  [<00000001001efc9a>] rcu_check_gp_kthread_starvation+0x172/0x190
[ 2122.394111]  [<00000000ff9e5afe>] print_other_cpu_stall+0x2de/0x370
[ 2122.394113]  [<00000000ff9e5d60>] check_cpu_stall+0x1d0/0x270
[ 2122.394114]  [<00000000ff9e6152>] rcu_sched_clock_irq+0x82/0x230
[ 2122.394117]  [<00000000ff9f808a>] update_process_times+0xba/0xf0
[ 2122.394121]  [<00000000ffa0adfa>] tick_sched_handle+0x4a/0x70
[ 2122.394124]  [<00000000ffa0b2fe>] tick_sched_timer+0x5e/0xc0
[ 2122.394126]  [<00000000ff9f8cb6>] __hrtimer_run_queues+0x136/0x290
[ 2122.394128]  [<00000000ff9f9f80>] hrtimer_interrupt+0x150/0x2d0
[ 2122.394130]  [<00000000ff90ce36>] do_IRQ+0x56/0x70
[ 2122.394133]  [<00000000ff90d216>] do_irq_async+0x56/0xb0
[ 2122.394135]  [<00000001001f6786>] do_ext_irq+0x96/0x160
[ 2122.394138]  [<00000001002047bc>] ext_int_handler+0xdc/0x110
[ 2122.394140]  [<000003ff8005cf48>] dm_split_and_process_bio+0x28/0x4d0 [dm_mod]
[ 2122.394152] ([<000003ff8005d2ec>] dm_split_and_process_bio+0x3cc/0x4d0 [dm_mod])
[ 2122.394162]  [<000003ff8005dca8>] dm_submit_bio+0x68/0x110 [dm_mod]
[ 2122.394173]  [<00000000ffda0698>] __submit_bio+0x78/0x190
[ 2122.394178]  [<00000000ffda081c>] __submit_bio_noacct+0x6c/0x1e0
[ 2122.394180]  [<000003ff8005c2ac>] dm_wq_work+0x5c/0xc0 [dm_mod]
[ 2122.394190]  [<00000000ff9771e6>] process_one_work+0x216/0x4a0
[ 2122.394196]  [<00000000ff9779a4>] worker_thread+0x64/0x4a0
[ 2122.394198]  [<00000000ff980578>] kthread+0x108/0x110
[ 2122.394199]  [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2122.394201]  [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2302.444001] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2302.444015] (detected by 0, t=24007 jiffies, g=36205, q=624 ncpus=1)
[ 2302.444018] rcu: All QSes seen, last rcu_sched kthread activity 24007 (4295167598-4295143591), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 2302.444021] rcu: rcu_sched kthread starved for 24007 jiffies! g36205 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 2302.444024] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 2302.444025] rcu: RCU grace-period kthread stack dump:
[-- MARK -- Wed Dec 14 08:50:00 2022]
[ 2302.444027] task:rcu_sched state:R running task stack: 0 pid: 15 ppid: 2 flags:0x00000000
[ 2302.444204] Call Trace:
[ 2302.444207]  [<00000001001fd5b0>] __schedule+0x300/0x6c0
[ 2302.444216]  [<00000001001fd9d2>] schedule+0x62/0xf0
[ 2302.444218]  [<000000010020360c>] schedule_timeout+0x8c/0x170
[ 2302.444221]  [<00000000ff9e219e>] rcu_gp_fqs_loop+0x30e/0x3d0
[ 2302.444227]  [<00000000ff9e3ee2>] rcu_gp_kthread+0x132/0x1b0
[ 2302.444229]  [<00000000ff980578>] kthread+0x108/0x110
[ 2302.444232]  [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2302.444235]  [<000000010020455a>] ret_from_fork+0xa/0x30
[ 2302.444237] rcu: Stack dump where RCU GP kthread last ran:
[ 2302.444238] Task dump for CPU 0:
[ 2302.444239] task:kworker/0:2 state:R running task stack: 0 pid:16943 ppid: 2 flags:0x00000004
[ 2302.444244] Workqueue: kdmflush/253:6 dm_wq_work [dm_mod]
[ 2302.444270] Call Trace:
[ 2302.444270]  [<00000000ff98bb98>] sched_show_task.part.0+0xf8/0x130
[ 2302.444275]  [<00000001001efc9a>] rcu_check_gp_kthread_starvation+0x172/0x190
[ 2302.444280]  [<00000000ff9e5afe>] print_other_cpu_stall+0x2de/0x370
[ 2302.444282]  [<00000000ff9e5d60>] check_cpu_stall+0x1d0/0x270
[ 2302.444284]  [<00000000ff9e6152>] rcu_sched_clock_irq+0x82/0x230
[ 2302.444286]  [<00000000ff9f808a>] update_process_times+0xba/0xf0
[ 2302.444290]  [<00000000ffa0adfa>] tick_sched_handle+0x4a/0x70
[ 2302.444292]  [<00000000ffa0b2fe>] tick_sched_timer+0x5e/0xc0
[ 2302.444294]  [<00000000ff9f8cb6>] __hrtimer_run_queues+0x136/0x290
[ 2302.444296]  [<00000000ff9f9f80>] hrtimer_interrupt+0x150/0x2d0
[ 2302.444298]  [<00000000ff90ce36>] do_IRQ+0x56/0x70
[ 2302.444301]  [<00000000ff90d216>] do_irq_async+0x56/0xb0
[ 2302.444303]  [<00000001001f6786>] do_ext_irq+0x96/0x160
[ 2302.444306]  [<00000001002047bc>] ext_int_handler+0xdc/0x110
[ 2302.444308]  [<00000000ffbe2a1e>] kmem_cache_alloc+0x15e/0x530
[ 2302.444316]  [<00000000ffb3e780>] mempool_alloc+0x60/0x210
[ 2302.444319]  [<00000000ffd9ac4e>] bio_alloc_bioset+0x1ae/0x410
[ 2302.444324]  [<00000000ffd9af8c>] bio_alloc_clone+0x3c/0x90
[ 2302.444326]  [<000003ff8005cf9a>] dm_split_and_process_bio+0x7a/0x4d0 [dm_mod]
[ 2302.444337]  [<000003ff8005dca8>] dm_submit_bio+0x68/0x110 [dm_mod]
[ 2302.444347]  [<00000000ffda0698>] __submit_bio+0x78/0x190
[ 2302.444350]  [<00000000ffda081c>] __submit_bio_noacct+0x6c/0x1e0
[ 2302.444353]  [<000003ff8005c2ac>] dm_wq_work+0x5c/0xc0 [dm_mod]
[ 2302.444363]  [<00000000ff9771e6>] process_one_work+0x216/0x4a0
[ 2302.444367]  [<00000000ff9779a4>] worker_thread+0x64/0x4a0
[ 2302.444369]  [<00000000ff980578>] kthread+0x108/0x110
[ 2302.444371]  [<00000000ff9035cc>] __ret_from_fork+0x3c/0x60
[ 2302.444373]  [<000000010020455a>] ret_from_fork+0xa/0x30

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel