On Fri, 10 Nov 2017, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 9:19 PM, Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:
> >
> > Yes it's accessing the list. Here is the faddr2line output.
>
> Ok, so it's a corrupted timer list. Which is not a big surprise.
>
> It's
>
>         next->pprev = pprev;
>
> in __hlist_del(), and the trapping instruction decodes as
>
>         mov    %rdx,0x8(%rax)
>
> with %rax having the value dead000000000200,
>
> Which is just LIST_POISON2.
>
> So we've deleted that entry twice - LIST_POISON2 is what hlist_del()
> sets pprev to after already deleting it once.
>
> Although in this case it might not be hlist_del(), because
> detach_timer() also sets entry->next to LIST_POISON2.
>
> Which is pretty bogus, we are supposed to use LIST_POISON1 for the
> "next" pointer. Oh well. Nobody cares, except for the list entry
> debugging code, which isn't run on the hlist cases.
>
> Adding Thomas Gleixner to the cc. It should not be possible to delete
> the same timer twice.

Right, it shouldn't.

Fengguang, can you please enable:

  CONFIG_DEBUG_OBJECTS
  CONFIG_DEBUG_OBJECTS_TIMERS

and try to reproduce? Debugobject should catch that hopefully.

Thanks,

	tglx
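
For reference, the double-delete failure described in the quoted text can be modelled in userspace. The following is only a rough sketch, not the kernel code: the poison constants are assumed to match an x86-64 build of that era, and hlist_del_raw(), detach_like_timer() and second_del_store_addr() are hypothetical names chosen for illustration. It never dereferences a poison pointer; it only computes the address that a second delete would store to.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof, NULL */
#include <stdint.h>   /* uintptr_t */

/* Assumption: 64-bit build, so the poison values fit in a pointer. */
static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");

/* Assumed values; chosen to match the dead000000000200 seen in %rax. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1 ((void *)(uintptr_t)(0x100 + POISON_POINTER_DELTA))
#define LIST_POISON2 ((void *)(uintptr_t)(0x200 + POISON_POINTER_DELTA))

struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;	/* at offset 8, hence 0x8(%rax) */
};

struct hlist_head {
	struct hlist_node *first;
};

/* Insert n at the front of the list headed by h. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n; mirrors the shape of __hlist_del() (hypothetical name). */
static void hlist_del_raw(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;	/* the store that trapped in the report */
}

/* Modelled on the detach_timer() behaviour described in the quoted text. */
static void detach_like_timer(struct hlist_node *entry)
{
	hlist_del_raw(entry);
	entry->next = LIST_POISON2;
}

/*
 * Address a further hlist_del_raw(n) would store to through n->next,
 * or 0 if n->next is not a poison value. Does not dereference n->next.
 */
static uintptr_t second_del_store_addr(const struct hlist_node *n)
{
	if (n->next != LIST_POISON1 && n->next != LIST_POISON2)
		return 0;
	return (uintptr_t)n->next + offsetof(struct hlist_node, pprev);
}
```

Calling detach_like_timer() on a node and then second_del_store_addr() on the same node yields the poison value plus the offset of pprev, which is the non-canonical address a second detach would fault on.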