Re: Cyclictest results on Sparc64 with PREEMPT_RT

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Allen Pais | 2014-01-27 13:50:43 [+0530]:

>Hi,
Hi,

>[ 1143.894099] INFO: rcu_preempt self-detected stall on CPU { 36}  (t=2100 jiffies g=373 c=372 q=61)
>[ 1143.894130]   CPU[  0]: TSTATE[0000009980001602] TPC[000000000048d1ac] TNPC[000000000048d1b0] TASK[ksoftirqd/0:3]
>[ 1143.894151]              TPC[idle_cpu+0x2c/0x80] O7[cpumask_next_and+0x18/0x80] I7[find_busiest_group+0x21c/0xa40] RPC[load_balance+0xe8/0x880]

so you have CPU stall on cpu36
>
>[ 1150.135499] BUG: soft lockup - CPU#36 stuck for 23s! [swapper/36:0]
>[ 1150.135564] TPC: <rtrap_no_irq_enable+0x0/0xc>
>[ 1150.135572] RPC: <tl0_irq15+0x14/0x20>
>[ 1150.135581] I7: <tl0_irq15+0x14/0x20>

This is your NMI handler where you are right now…

>[ 1150.135582] Call Trace:
>[ 1150.135584]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135586]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135588]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135590]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135592]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135594]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135595]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135597]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135599]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc
>[ 1150.135600]  [0000000000404b54] rtrap_no_irq_enable+0x0/0xc

This is the stack where the NMI got injected. Not sure why this is more
than once here but then this the sparc64 trace I see :)

>[ 1150.135604]  [00000000004acc00] in_lock_functions+0x0/0x40
>[ 1150.135608]  [000000000080a038] add_preempt_count+0xd8/0x140
>[ 1150.135610]  [000000000080617c] __schedule+0x1c/0x500
>[ 1150.135613]  [0000000000806b7c] schedule+0x1c/0xc0
>[ 1150.135615]  [0000000000806f8c] schedule_preempt_disabled+0xc/0x40
>[ 1150.135617]  [000000000049dd10] cpu_startup_entry+0x150/0x300

And this where the CPU was before the NMI. Doesn't look blocking.
in_lock_functions() compares a few values no locking involved so the CPU
probably was here while the NMI hit and an usec later it might be an
instruction later. What I thnig is odd, is that it is exactly at the
begin of the function, not an instruction later.

>Am yet to debug what went wrong. 
>
>- Allen

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux