Re: [PATCH] sched: Prevent rescheduling when interrupts are disabled

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 16 December 2024 13:20:56 GMT, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>David reported a warning observed while loop testing kexec jump:
>
>  Interrupts enabled after irqrouter_resume+0x0/0x50
>  WARNING: CPU: 0 PID: 560 at drivers/base/syscore.c:103 syscore_resume+0x18a/0x220
>   kernel_kexec+0xf6/0x180
>   __do_sys_reboot+0x206/0x250
>   do_syscall_64+0x95/0x180
>
>The corresponding interrupt flag trace:
>
>  hardirqs last  enabled at (15573): [<ffffffffa8281b8e>] __up_console_sem+0x7e/0x90
>  hardirqs last disabled at (15580): [<ffffffffa8281b73>] __up_console_sem+0x63/0x90
>
>That means __up_console_sem() was invoked with interrupts enabled. Further
>instrumentation revealed that in the interrupt disabled section of kexec
>jump one of the syscore_suspend() callbacks woke up a task, which set the
>NEED_RESCHED flag. A later callback in the resume path invoked
>cond_resched() which in turn led to the invocation of the scheduler:
>
>  __cond_resched+0x21/0x60
>  down_timeout+0x18/0x60
>  acpi_os_wait_semaphore+0x4c/0x80
>  acpi_ut_acquire_mutex+0x3d/0x100
>  acpi_ns_get_node+0x27/0x60
>  acpi_ns_evaluate+0x1cb/0x2d0
>  acpi_rs_set_srs_method_data+0x156/0x190
>  acpi_pci_link_set+0x11c/0x290
>  irqrouter_resume+0x54/0x60
>  syscore_resume+0x6a/0x200
>  kernel_kexec+0x145/0x1c0
>  __do_sys_reboot+0xeb/0x240
>  do_syscall_64+0x95/0x180
>
>This is a long standing problem, which probably got more visible with
>the recent printk changes. Something does a task wakeup and the
>scheduler sets the NEED_RESCHED flag. cond_resched() sees it set and
>invokes schedule() from a completely bogus context. The scheduler
>enables interrupts after context switching, which causes the above
>warning at the end.
>
>Quite some of the code paths in syscore_suspend()/resume() can result in
>triggering a wakeup with the exactly same consequences. They might not
>have done so yet, but as they share a lot of code with normal operations
>it's just a question of time.
>
>The problem only affects the PREEMPT_NONE and PREEMPT_VOLUNTARY scheduling
>models. Full preemption is not affected as cond_resched() is disabled and
>the preemption check preemptible() takes the interrupt disabled flag into
>account.
>
>Cure the problem by adding a corresponding check into cond_resched().
>
>Reported-by: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
>Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>Tested-by: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
>Cc: stable@xxxxxxxxxxxxxxx
>Closes: https://lore.kernel.org/all/7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@xxxxxxxxxxxxx
>---
> kernel/sched/core.c |    2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>--- a/kernel/sched/core.c
>+++ b/kernel/sched/core.c
>@@ -7276,7 +7276,7 @@ void rt_mutex_setprio(struct task_struct
> #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
> int __sched __cond_resched(void)
> {
>-	if (should_resched(0)) {
>+	if (should_resched(0) && !irqs_disabled()) {
> 		preempt_schedule_common();
> 		return 1;
> 	}

Thank you. Slight preference for dwmw@xxxxxxxxxxxx as the Reported-by and Tested-by addresses if it's not too late. I'm assuming you will handle this and don't want me to round it up with the kexec bits.





[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux