Re: RED state exception (trap type 0x64) on U5 reboot

On 11/30/2013 04:42 PM, Meelis Roos wrote:
Another strange symptom is that the problem did not happen when
openpromfs is compiled in statically, not loaded as a module. When loaded
as a module, its memory is vmalloc()ed... but that's probably too weak a
connection to conclude anything.

What happens with the not-even-compile-tested debug patch below?

Now I have the results of that test. It does not trigger at all during
normal shutdown since the module is not unloaded. When I unmount openpromfs
and rmmod openpromfs, both lines are printed to dmesg. After that, the
RED state still happens on reboot.

Played around some more (to reproduce the slab BUG with newer kernel
for reporting) and found 2 things:

1. When I apply the kzalloc vs vmalloc revert patch to 3.12.0, it
breaks the serial layer with fireworks - did not investigate further.

kmalloc() should work fine on top of 3.12.0+.

Don't revert. Just change vmalloc->kmalloc and vfree->kfree. I can
supply you with a patch if you'd prefer; just let me know.

And please provide copies of the fireworks.

2. When trying plain 3.12 with no debug patches but most debug options
except SLAB ones, the RED state exception is still present but I do get
a meaningful lockdep warning just before the exception. This is very
similar to the warning I posted today for sparc64 startup on another
machine (copied below). Maybe this is just some unannotated irq stuff
(or 2 independent ones) but it happens in exactly the right spot...

I think the hardirqs warnings below and on the E3500 are because
NMI is still enabled in p1275_cmd_direct() and
arch/sparc:arch_irqs_disabled_flags() doesn't differentiate irqs on from
nmi on, which triggers the WARNING.
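
To make the mechanism concrete, here is a small userspace model of the
comparison lockdep's check_flags() performs. It is only a sketch, not
kernel code: struct task_model and both *_model functions are
hypothetical names, and the "any non-zero PIL counts as disabled" rule
is my assumption about how arch_irqs_disabled_flags() behaves on
sparc64.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for lockdep's per-task software view. */
struct task_model {
    bool hardirqs_enabled;   /* what lockdep believes about hardirqs */
};

/*
 * Assumed sparc64 behaviour: a single threshold on the interrupt level,
 * so "irqs off, NMI on" and "everything off" look identical here.
 */
static bool irqs_disabled_flags_model(unsigned long pil)
{
    return pil > 0;
}

/*
 * Returns NULL when the hardware flags agree with lockdep's view,
 * otherwise the reason string that would accompany the WARNING.
 */
static const char *check_flags_model(const struct task_model *t,
                                     unsigned long pil)
{
    if (irqs_disabled_flags_model(pil)) {
        if (t->hardirqs_enabled)
            return "unannotated irqs-off";
    } else {
        if (!t->hardirqs_enabled)
            return "unannotated irqs-on";
    }
    return NULL;
}
```

With hardirqs_enabled still true in the model and a raised level passed
in (15 is just an example value), check_flags_model() returns
"unannotated irqs-off", matching the "possible reason" line in both
traces.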

Does the RED state exception trigger if you manually break to the prom
command line and issue a boot command?

I'll continue to follow the SLAB bug thread in case some additional
promising lead develops there.

Regards,
Peter Hurley


The warning from Ultra 5 with RED State Exception (full dmesg and
config are below):

[info] Will now restart.
sd 0:0:0:0: [sda] Synchronizing SCSI cache
reboot: Restarting system
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2826 at kernel/lockdep.c:3535 check_flags+0x7c/0x240()
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
Modules linked in: openpromfs
CPU: 0 PID: 2826 Comm: reboot Tainted: G        W    3.12.0 #133
Call Trace:
  [0000000000454b6c] warn_slowpath_common+0x4c/0x80
  [0000000000454c4c] warn_slowpath_fmt+0x2c/0x40
  [0000000000499e3c] check_flags+0x7c/0x240
  [000000000049d000] lock_acquire+0x20/0x80
  [000000000077afe8] _raw_spin_lock+0x28/0x40
  [00000000005f8ef4] p1275_cmd_direct+0x14/0x60
  [00000000005f8980] prom_reboot+0x20/0x40
  [0000000000434c88] machine_restart+0x48/0x60
  [000000000047d9cc] kernel_restart+0x4c/0x60
  [000000000047db34] SyS_reboot+0x134/0x200
  [00000000004060b4] linux_sparc_syscall32+0x34/0x40
---[ end trace 4759822ebc3658d5 ]---
possible reason: unannotated irqs-off.
irq event stamp: 3799
hardirqs last  enabled at (3799): [<0000000000404b1c>] rtrap_xcall+0x18/0x20
hardirqs last disabled at (3797): [<0000000000459380>] __do_softirq+0x100/0x180
softirqs last  enabled at (3798): [<00000000004593dc>] __do_softirq+0x15c/0x180
softirqs last disabled at (3791): [<000000000042b89c>] do_softirq+0x5c/0xa0

RED State Exception

TL=0000.0000.0000.0005 TT=0000.0000.0000.0064
    TPC=0000.0000.f000.4c80 TnPC=0000.0000.f000.4c84 TSTATE=0000.0099.1104.1403
TL=0000.0000.0000.0004 TT=0000.0000.0000.0064
    TPC=0000.0000.f000.4c80 TnPC=0000.0000.f000.4c84 TSTATE=0000.0099.1104.1403
TL=0000.0000.0000.0003 TT=0000.0000.0000.0064
    TPC=0000.0000.f000.4c80 TnPC=0000.0000.f000.4c84 TSTATE=0000.0099.1104.1403
TL=0000.0000.0000.0002 TT=0000.0000.0000.0064
    TPC=0000.0000.f000.0c80 TnPC=0000.0000.f000.0c84 TSTATE=0000.0099.1104.1403
TL=0000.0000.0000.0001 TT=0000.0000.0000.0064
    TPC=0000.0000.f004.55c0 TnPC=0000.0000.f004.55c4 TSTATE=0000.0099.1100.1603
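
One way to read the dump above, offered as a sketch rather than a
diagnosis: SPARC V9 forms a trap vector address as
TBA<63:15> | (TL>0)<<14 | TT<<5, so a TPC can be checked against that
layout. The helper names below are hypothetical, and the table base
0xf0000000 is inferred from the dump, not stated anywhere in this
thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a SPARC V9 trap vector address. */
struct trap_vector {
    uint64_t tba;     /* trap table base, address bits 63:15 */
    bool     tl_gt0;  /* entry lies in the TL>0 half of the table */
    unsigned tt;      /* 9-bit trap type taken from bits 13:5 */
};

/* Split an address into the fields above (hypothetical helper). */
static struct trap_vector decode_trap_vector(uint64_t addr)
{
    struct trap_vector v;
    v.tba    = addr & ~UINT64_C(0x7fff);
    v.tl_gt0 = ((addr >> 14) & 1) != 0;
    v.tt     = (unsigned)((addr >> 5) & 0x1ff);
    return v;
}

/*
 * True if addr is the first instruction of the vector for trap type tt
 * in the table based at tba (each vector is 32 bytes long).
 */
static bool is_vector_entry(uint64_t addr, uint64_t tba, unsigned tt)
{
    struct trap_vector v = decode_trap_vector(addr);
    return (addr & 0x1f) == 0 && v.tba == tba && v.tt == tt;
}
```

Under those assumptions, 0xf0000c80 (TL=2) decodes as the TL=0 vector
for trap type 0x64 and 0xf0004c80 (TL=3 to 5) as the TL>0 vector for
the same type, while 0xf00455c0 (TL=1) is not a 0x64 vector at all.
That would be consistent with the 0x64 handler itself taking trap 0x64
repeatedly until TL reached 5.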


The warning from Sun E3500 startup:

WARNING: CPU: 6 PID: 1 at kernel/locking/lockdep.c:3535 check_flags+0x7c/0x240()
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
Modules linked in:
CPU: 6 PID: 1 Comm: swapper/6 Not tainted 3.13.0-rc1-dirty #17
Call Trace:
  [00000000004585cc] warn_slowpath_common+0x4c/0x80
  [00000000004586ac] warn_slowpath_fmt+0x2c/0x40
  [0000000000498d9c] check_flags+0x7c/0x240
  [000000000049bf40] lock_acquire+0x20/0x80
  [000000000081e188] _raw_spin_lock+0x28/0x40
  [000000000061bd74] p1275_cmd_direct+0x14/0x60
  [000000000061bc0c] prom_startcpu+0x2c/0x40
  [000000000043e3bc] __cpu_up+0x5c/0x180
  [0000000000458830] _cpu_up.constprop.1+0xd0/0x160
  [0000000000458958] cpu_up+0x58/0x80
  [00000000009fe2b4] smp_init+0x74/0xbc
  [00000000009f49e4] kernel_init_freeable+0x7c/0x110
  [000000000080af24] kernel_init+0x4/0x120
  [00000000004060c4] ret_from_fork+0x1c/0x2c
  [0000000000000000]           (null)
---[ end trace e61cc8445001155f ]---
possible reason: unannotated irqs-off.
irq event stamp: 2051
hardirqs last  enabled at (2051): [<000000000081e9d8>] _raw_spin_unlock_irqrestore+0x38/0x60
hardirqs last disabled at (2050): [<000000000081e234>] _raw_spin_lock_irqsave+0x14/0x60
softirqs last  enabled at (398): [<000000000045d098>] __do_softirq+0x178/0x200
softirqs last disabled at (393): [<000000000042bb8c>] do_softirq_own_stack+0x2c/0x40
