Re: v3.18-RT

* Carol Wong | 2016-07-20 20:53:21 [+0000]:

>Hi Sebastian,
Hi Carol,

>We finally traced the boot-up crash to the following patch in kernel/sched/core.c:
>
>https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.18-rt&id=62044e554f14547061afcfef7f0aceda43e28982
>
>After reverting the two-line patch in 3.18.29-rt30, the crash no longer occurs on our dual Xeon (2x12 core) system.
>
>Other observations:
>- Does not reproduce on single processor (2 and 4 core) systems
>- Reproduces under 3.18.27-rt27 and 3.18.36-rt38 on the dual Xeon
>- Does not reproduce on 3.18.27-rt26 and earlier on the dual Xeon
>- Reproduces more frequently on .29-rt30 (1 in 20 reboots) compared to .27-rt27 (1 in 100 reboots)
>
>So far we've not observed any side effects after reverting this patch.

This was part of the CPU hotplug fixups. Lockdep might be broken without
it, but I am not sure whether that is the case most of the time or only
during hotplug.

>I understand that a high core count system may not be easy to come by, so if there are diagnostics or patches you would like to try on the dual Xeon system, we can assist with that.

With that patch, migrate_disable() skips the whole preempt-lazy +
pin-cpu code if called with IRQs off. Since interrupts are disabled, the
task can't migrate to another CPU, so it is a possible optimisation.
It only makes a difference if the migrate_disable() + migrate_enable()
calls are not in balance. The commit
  https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.18-rt&id=8d51d3a296b6ec4aebd0d6d7e1b7162cd9bf6662
is one example where I fixed such an imbalance.
Do you get additional backtraces with CONFIG_SCHED_DEBUG enabled?

There is one thing the debug code does not cover, so could you please
add this chunk?

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 140ee06079b6..1f8613f77598 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3229,6 +3229,7 @@ void migrate_enable(void)
 
 	if (in_atomic() || irqs_disabled()) {
 #ifdef CONFIG_SCHED_DEBUG
+		WARN_ON_ONCE(p->migrate_disable_atomic <= 0);
 		p->migrate_disable_atomic--;
 #endif
 		return;
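
For illustration only, here is a minimal userspace sketch of the kind of
imbalance the added WARN_ON_ONCE() is meant to flag. It is a model, not
kernel code: struct task_model, the model_*() functions, the atomic_ctx
flag (standing in for in_atomic() || irqs_disabled()) and the warnings
counter are all hypothetical, and the non-atomic path is reduced to a
plain nesting counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the per-task counters. */
struct task_model {
	int migrate_disable;        /* nesting depth on the pinning (slow) path */
	int migrate_disable_atomic; /* debug counter for the atomic fast path */
	int warnings;               /* how often the modelled check fired */
};

/* atomic_ctx stands in for in_atomic() || irqs_disabled(). */
static void model_migrate_disable(struct task_model *p, bool atomic_ctx)
{
	assert(p != NULL);
	if (atomic_ctx) {
		/* Fast path: no pinning, only the debug counter moves. */
		p->migrate_disable_atomic++;
		return;
	}
	p->migrate_disable++;
}

static void model_migrate_enable(struct task_model *p, bool atomic_ctx)
{
	assert(p != NULL);
	if (atomic_ctx) {
		/* Models the added check: enable without a matching atomic disable. */
		if (p->migrate_disable_atomic <= 0)
			p->warnings++;
		p->migrate_disable_atomic--;
		return;
	}
	if (p->migrate_disable <= 0) {
		p->warnings++;
		return;
	}
	p->migrate_disable--;
}

/* Disable with IRQs on, enable with IRQs off: the two paths do not match. */
static struct task_model run_imbalanced(void)
{
	struct task_model p = { 0, 0, 0 };

	model_migrate_disable(&p, false);
	model_migrate_enable(&p, true);
	return p;
}

/* Each disable is undone in the same kind of context. */
static struct task_model run_balanced(void)
{
	struct task_model p = { 0, 0, 0 };

	model_migrate_disable(&p, false);
	model_migrate_enable(&p, false);
	model_migrate_disable(&p, true);
	model_migrate_enable(&p, true);
	return p;
}
```

In the imbalanced run the modelled check fires once and the pinning
counter is never dropped, which is the sort of mismatch that would
otherwise go unnoticed on the fast path.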

>Cheers,
>Carol Wong
>NetAcquire Corporation

Sebastian
