Hi

On 29.09.2021 17:17, Peter Zijlstra wrote:
> Simplify and make wake_up_if_idle() more robust, also don't iterate
> the whole machine with preempt_disable() in it's caller:
> wake_up_all_idle_cpus().
>
> This prepares for another wake_up_if_idle() user that needs a full
> do_idle() cycle.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>

This patch landed recently in linux-next as commit 8850cb663b5c ("sched:
Simplify wake_up_*idle*()"). It causes the following warning on the
arm64 virt machine under qemu during the system suspend/resume cycle:

--->8---

printk: Suspending console(s) (use no_console_suspend to debug)

============================================
WARNING: possible recursive locking detected
5.15.0-rc6-next-20211022 #10905 Not tainted
--------------------------------------------
rtcwake/1326 is trying to acquire lock:
ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: wake_up_all_idle_cpus+0x24/0x98

but task is already holding lock:
ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: suspend_devices_and_enter+0x740/0x9f0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(cpu_hotplug_lock);
  lock(cpu_hotplug_lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by rtcwake/1326:
 #0: ffff54ad86a78438 (sb_writers#7){.+.+}-{0:0}, at: ksys_write+0x64/0xf0
 #1: ffff54ad84094a88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0xf4/0x1a8
 #2: ffff54ad83b17a88 (kn->active#43){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xfc/0x1a8
 #3: ffffd4e9192efab0 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x214/0x3d0
 #4: ffffd4e9192e8130 (cpu_hotplug_lock){++++}-{0:0}, at: suspend_devices_and_enter+0x740/0x9f0

stack backtrace:
CPU: 0 PID: 1326 Comm: rtcwake Not tainted 5.15.0-rc6-next-20211022 #10905
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x1d0
 show_stack+0x14/0x20
 dump_stack_lvl+0x88/0xb0
 dump_stack+0x14/0x2c
 __lock_acquire+0x171c/0x17b8
 lock_acquire+0x234/0x378
 cpus_read_lock+0x5c/0x150
 wake_up_all_idle_cpus+0x24/0x98
 suspend_devices_and_enter+0x748/0x9f0
 pm_suspend+0x2b0/0x3d0
 state_store+0x84/0x108
 kobj_attr_store+0x14/0x28
 sysfs_kf_write+0x60/0x70
 kernfs_fop_write_iter+0x124/0x1a8
 new_sync_write+0xe8/0x1b0
 vfs_write+0x1d0/0x408
 ksys_write+0x64/0xf0
 __arm64_sys_write+0x14/0x20
 invoke_syscall+0x40/0xf8
 el0_svc_common.constprop.3+0x8c/0x120
 do_el0_svc_compat+0x18/0x48
 el0_svc_compat+0x48/0x100
 el0t_32_sync_handler+0xec/0x140
 el0t_32_sync+0x170/0x174
OOM killer enabled.
Restarting tasks ... done.
PM: suspend exit

--->8---

Let me know if there is anything I can help to debug and fix this issue.

> ---
>  kernel/sched/core.c | 14 +++++---------
>  kernel/smp.c        |  6 +++---
>  2 files changed, 8 insertions(+), 12 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3691,15 +3691,11 @@ void wake_up_if_idle(int cpu)
>  	if (!is_idle_task(rcu_dereference(rq->curr)))
>  		goto out;
>
> -	if (set_nr_if_polling(rq->idle)) {
> -		trace_sched_wake_idle_without_ipi(cpu);
> -	} else {
> -		rq_lock_irqsave(rq, &rf);
> -		if (is_idle_task(rq->curr))
> -			smp_send_reschedule(cpu);
> -		/* Else CPU is not idle, do nothing here: */
> -		rq_unlock_irqrestore(rq, &rf);
> -	}
> +	rq_lock_irqsave(rq, &rf);
> +	if (is_idle_task(rq->curr))
> +		resched_curr(rq);
> +	/* Else CPU is not idle, do nothing here: */
> +	rq_unlock_irqrestore(rq, &rf);
>
>  out:
>  	rcu_read_unlock();
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -1170,14 +1170,14 @@ void wake_up_all_idle_cpus(void)
>  {
>  	int cpu;
>
> -	preempt_disable();
> +	cpus_read_lock();
>  	for_each_online_cpu(cpu) {
> -		if (cpu == smp_processor_id())
> +		if (cpu == raw_smp_processor_id())
>  			continue;
>
>  		wake_up_if_idle(cpu);
>  	}
> -	preempt_enable();
> +	cpus_read_unlock();
>  }
>  EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
>
>
>
>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland