On (10/22/18 13:30), Roosen Henri wrote:
> Hi RT-experts,
>
> One of our ARM iMX6Q systems running an SMP PREEMPT RT kernel version
> based on v4.14.52-rt34 triggered a kernel-oops after about 60 days
> duration test (running cyclic-test and additional load apps).
>
> Looking at the trace (see https://paste.debian.net/1048486/), PC is
> incorrect. LR is c016816c, so the processor just executed the
> instruction at c0168168:
>
> c0168168: e12fff33 blx r3
> (kernel/printk/printk.c:1666)
>
> With r3 being ee4fe01c, the processor has jumped to that location
> causing the OOPS.
>
> Looking at the console_unlock disassembly (see
> https://paste.debian.net/1048485/), r3 should have the con->write()
> function pointer. The console pointer con is retrieved while walking
> the console_drivers list.
>
> So, I guess the list gets corrupted, maybe some kind of concurrency
> issue? Unfortunately there are a lot of locking primitives used in the
> console_unlock() function and some RT-specific code, so it's not so
> easy to find the root-cause at first glance. Could someone of you have
> a look?

Did you register/unregister consoles during the test? Nothing else
should modify the console drivers list.

	-ss
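
For reference, a minimal user-space sketch of the pattern under discussion: walking a singly linked list of consoles and calling each entry's write() through a function pointer. This is not the kernel code. struct console_sketch, call_consoles(), count_bytes_write() and call_site_from_lr() are hypothetical names made up for illustration, and the real loop in printk.c does more checks than shown here. The LR arithmetic assumes ARM (A32) state with 4-byte instructions, which matches the e12fff33 encoding quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct console. */
struct console_sketch {
	void (*write)(struct console_sketch *con, const char *s, unsigned int len);
	struct console_sketch *next;
};

/* Total number of bytes handed to count_bytes_write() so far. */
unsigned int bytes_seen;

/* Example write() callback: only counts bytes, prints nothing. */
void count_bytes_write(struct console_sketch *con, const char *s, unsigned int len)
{
	(void)con;
	(void)s;
	bytes_seen += len;
}

/*
 * In A32 state, BLX <reg> stores the address of the next instruction in LR,
 * so the call site is LR - 4.
 */
uint32_t call_site_from_lr(uint32_t lr)
{
	return lr - 4u;
}

/*
 * Walk the list and call every non-NULL write() callback.
 * Returns the number of callbacks invoked.
 *
 * A NULL check is the only validation possible here: if an entry is freed
 * or overwritten while the list is being walked, con->write may hold an
 * arbitrary non-NULL value and the indirect call jumps to it.
 */
unsigned int call_consoles(struct console_sketch *head, const char *s, unsigned int len)
{
	unsigned int called = 0;
	struct console_sketch *con;

	for (con = head; con != NULL; con = con->next) {
		if (con->write == NULL)
			continue;
		con->write(con, s, len);	/* the indirect call, "blx r3" in the oops */
		called++;
	}
	return called;
}
```

With the values from the report, call_site_from_lr(0xc016816c) gives 0xc0168168, the address of the blx r3. The sketch also shows why the contents of the list matter: the walk trusts every con->write it reads.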