Hi RT experts,

One of our ARM iMX6Q systems running an SMP PREEMPT_RT kernel based on v4.14.52-rt34 triggered a kernel oops after about 60 days of a duration test (running cyclictest and additional load apps).

Looking at the trace (see https://paste.debian.net/1048486/), the PC is incorrect. LR is c016816c, so the processor had just executed the instruction at c0168168:

c0168168: e12fff33 blx r3 (kernel/printk/printk.c:1666)

With r3 being ee4fe01c, the processor jumped to that location, causing the oops.

Looking at the console_unlock() disassembly (see https://paste.debian.net/1048485/), r3 should hold the con->write() function pointer. The console pointer con is retrieved while walking the console_drivers list. So my guess is that the list gets corrupted, maybe through some kind of concurrency issue?

Unfortunately, console_unlock() uses a lot of locking primitives and some RT-specific code, so the root cause is not easy to find at first glance.

Could one of you have a look?

Thanks!
Henri
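P.S. To make the suspected failure path concrete, here is a minimal user-space sketch of the list walk. It is only a model, not the kernel code: the struct is a hypothetical subset of struct console, the loop is loosely based on the call_console_drivers() loop (which I assume is inlined into console_unlock() in our build), and demo_write(), call_console_drivers_model() and call_site_from_lr() are names invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CON_ENABLED 4  /* same value as in include/linux/console.h */

/* Simplified stand-in for the kernel's struct console (hypothetical subset). */
struct console {
    const char *name;
    void (*write)(struct console *con, const char *text, unsigned int len);
    short flags;
    struct console *next;
};

static struct console *console_drivers;  /* head of the singly linked list */
static unsigned int bytes_written;       /* bookkeeping for the demo only */

/* Dummy console driver: just counts the bytes it was asked to print. */
static void demo_write(struct console *con, const char *text, unsigned int len)
{
    (void)con;
    (void)text;
    bytes_written += len;
}

/* Walk the list and call con->write() on every enabled console. */
static unsigned int call_console_drivers_model(const char *text, unsigned int len)
{
    struct console *con;
    unsigned int calls = 0;

    for (con = console_drivers; con != NULL; con = con->next) {
        if (!(con->flags & CON_ENABLED))
            continue;
        if (!con->write)
            continue;
        /*
         * Indirect call through a pointer loaded from the list entry.
         * On ARM this becomes a "blx rN" like the one at c0168168.
         * The NULL check above does not catch a stale entry or a
         * garbage pointer, so a corrupted list jumps to a bogus address.
         */
        con->write(con, text, len);
        calls++;
    }
    return calls;
}

/* In ARM (not Thumb) mode the call instruction sits 4 bytes before LR. */
static uint32_t call_site_from_lr(uint32_t lr)
{
    return lr - 4;
}
```

With one enabled and one disabled console on the list, the model calls write() exactly once, and call_site_from_lr(0xc016816c) gives c0168168, matching the blx in the trace.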