My 64-core box just passed an hour running Steven's hotplug stress script
along with stockfish and futextests (tip-rt.today w. hotplug hacks you saw
a while back), and seems content to just keep on grinding away. Without
the patch, the box quickly becomes a doorstop.

The splat below shows why: sched_cpu_dying() runs in the stopper thread
with interrupts disabled, and __mmdrop() ends up in pgd_free(), which takes
a sleeping lock on RT.

[  634.896901] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:931
[  634.896902] in_atomic(): 1, irqs_disabled(): 1, pid: 104, name: migration/6
[  634.896902] no locks held by migration/6/104.
[  634.896903] irq event stamp: 1208518
[  634.896907] hardirqs last  enabled at (1208517): [<ffffffff816de46c>] _raw_spin_unlock_irqrestore+0x8c/0xa0
[  634.896910] hardirqs last disabled at (1208518): [<ffffffff81146055>] multi_cpu_stop+0xc5/0x110
[  634.896912] softirqs last  enabled at (0): [<ffffffff81075dd2>] copy_process.part.32+0x672/0x1fc0
[  634.896913] softirqs last disabled at (0): [<          (null)>]           (null)
[  634.896914] Preemption disabled at:[<ffffffff8114629c>] cpu_stopper_thread+0x8c/0x120
[  634.896914]
[  634.896915] CPU: 6 PID: 104 Comm: migration/6 Tainted: G E 4.8.2-rt1-rt_debug #23
[  634.896916] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[  634.896918]  0000000000000000 ffff880176fb3c40 ffffffff8139c04d 0000000000000000
[  634.896919]  ffff880176fa8000 ffff880176fb3c68 ffffffff810a8102 ffffffff81c29cc0
[  634.896919]  ffff8803fc825640 ffff8803fc825640 ffff880176fb3c88 ffffffff816de754
[  634.896920] Call Trace:
[  634.896923]  [<ffffffff8139c04d>] dump_stack+0x85/0xc8
[  634.896924]  [<ffffffff810a8102>] ___might_sleep+0x152/0x250
[  634.896926]  [<ffffffff816de754>] rt_spin_lock+0x24/0x80
[  634.896928]  [<ffffffff810d67f9>] ? __lock_is_held+0x49/0x70
[  634.896929]  [<ffffffff810623ee>] pgd_free+0x1e/0xb0
[  634.896930]  [<ffffffff81074877>] __mmdrop+0x27/0xd0
[  634.896932]  [<ffffffff810b4a0d>] sched_cpu_dying+0x24d/0x2c0
[  634.896933]  [<ffffffff810b47c0>] ? sched_cpu_starting+0x60/0x60
[  634.896934]  [<ffffffff81079864>] cpuhp_invoke_callback+0xd4/0x350
[  634.896935]  [<ffffffff81079e56>] take_cpu_down+0x86/0xd0
[  634.896936]  [<ffffffff81146060>] multi_cpu_stop+0xd0/0x110
[  634.896937]  [<ffffffff81145f90>] ? cpu_stop_queue_work+0x90/0x90
[  634.896938]  [<ffffffff811462a2>] cpu_stopper_thread+0x92/0x120
[  634.896940]  [<ffffffff810a50fe>] smpboot_thread_fn+0x1de/0x360
[  634.896941]  [<ffffffff810a4f20>] ? smpboot_update_cpumask_percpu_thread+0x130/0x130
[  634.896942]  [<ffffffff810a093f>] kthread+0xef/0x110
[  634.896944]  [<ffffffff816df16f>] ret_from_fork+0x1f/0x40
[  634.896945]  [<ffffffff810a0850>] ? kthread_park+0x60/0x60
[  634.896970] smpboot: CPU 6 is now offline

Signed-off-by: Mike Galbraith <umgwanakikbuti@xxxxxxxxx>
---
 kernel/sched/core.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7569,6 +7569,9 @@ int sched_cpu_dying(unsigned int cpu)
 	nohz_balance_exit_idle(cpu);
 	hrtick_clear(rq);
 	if (per_cpu(idle_last_mm, cpu)) {
+		if (IS_ENABLED(CONFIG_PREEMPT_RT_FULL))
+			mmdrop_delayed(per_cpu(idle_last_mm, cpu));
+		else
 		mmdrop(per_cpu(idle_last_mm, cpu));
 		per_cpu(idle_last_mm, cpu) = NULL;
 	}