RE: v3.18-RT

Hi Sebastian,

You wrote:
> One thing on the bisect. The git tree has the patches in this order:
>  (1) kernel: migrate_disable() do fastpath in atomic & irqs-off
>  (2) kernel: softirq: unlock with irqs on
> 
> but you need to apply patch #2 before #1. So if you bisect and you hit
> warnings due to #1, please note that you need to apply #2.
> 
> T01 and T02 probably show the same issue, but there are too many
> warnings coming in parallel. If this comes from the sched patch due to
> the #1/#2 mix-up, then don't bisect here, or have them both applied.
> The call path itself does look special, as it would violate the rule
> of atomic locking / unlocking (as was fixed in #2, for instance).
> At this point I assume that your bisect went wrong due to patches
> #1/#2.

The traces were produced using the original 3.18.29-rt30 kernel (with all patches applied), plus the addition of
WARN_ON_ONCE(p->migrate_disable_atomic <= 0) in migrate_enable() and CONFIG_SCHED_DEBUG=y.

When I revert only patch #1 from the 3.18.29-rt30 kernel, the kernel never crashes. I've been running long-duration tests on a dual-Xeon system and a quad-core i7 system with patch #1 reverted.

Cheers,
Carol

> -----Original Message-----
> From: Sebastian Andrzej Siewior [mailto:bigeasy@xxxxxxxxxxxxx]
> Sent: Thursday, September 08, 2016 6:45 AM
> To: Carol Wong
> Cc: linux-rt-users@xxxxxxxxxxxxxxx; David Hauck; Preston Hauck
> Subject: Re: v3.18-RT
> 
> On 2016-08-19 00:41:46 [+0000], Carol Wong wrote:
> > Hi Sebastian,
> Hi Carol,
> 
> > Were you able to gain any insight from the traces?
> 
> Not really. T00 shows a fault in
> [    2.756284] BUG: unable to handle kernel NULL pointer dereference at 00000004
> [    2.756289] IP: [<c11653e7>] kmem_cache_alloc+0x87/0x230
> from ida_pre_get() / create_worker(). That is quite late, so I have no
> idea why that would happen.
> The other two are not really helpful.
> 
> > If we were to proceed with reverting the kernel/sched/core.c patch
> > in our build of 3.18.29-rt30, would the addition of the
> > WARN_ON_ONCE(p->migrate_disable_atomic <= 0) debug check that you
> > recommended (2016/07/29) be sufficient for detecting imbalances? We
> > would perform extended testing on multiple systems to determine the
> > effects of reverting the patch.
> 
> One thing on the bisect. The git tree has the patches in this order:
>  (1) kernel: migrate_disable() do fastpath in atomic & irqs-off
>  (2) kernel: softirq: unlock with irqs on
> 
> but you need to apply patch #2 before #1. So if you bisect and you hit
> warnings due to #1, please note that you need to apply #2.
> 
> T01 and T02 probably show the same issue, but there are too many
> warnings coming in parallel. If this comes from the sched patch due to
> the #1/#2 mix-up, then don't bisect here, or have them both applied.
> The call path itself does look special, as it would violate the rule
> of atomic locking / unlocking (as was fixed in #2, for instance).
> At this point I assume that your bisect went wrong due to patches
> #1/#2.
> 
> > Cheers,
> > Carol
> >
> Sebastian