On 02/26/2016, 01:38 AM, Linus Torvalds wrote:
> On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby <jslaby@xxxxxxx> wrote:
>>
>> Interestingly, RBP contains address inside try_to_wake_up --
>> ffffffff810a535a (dunno why) which is:
>> ffffffff810a5355:       e8 66 a0 ff ff          callq  ffffffff8109f3c0 <ttwu_stat>
>> ffffffff810a535a:       e9 9d fe ff ff          jmpq   ffffffff810a51fc <try_to_wake_up+0x3c>
>>
>> ttwu_stat does in the beginning:
>>   mov    $0x16e80,%r14
>>
>> which is what we actually still have in r14 when it crashes. The first
>> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
>> be overwritten).
>
> Hmm. That does sound very much like it might be ttwu_stat() that has
> gotten the stack frame wrong, and when it exits, it does
>
>     popq %rbp
>     ret
>
> but in fact it popped the return address, and then returned to a crazy address.
>
> Which sounds like a corrupted stack pointer (not a corrupted stack).
>
> Can you make just the "vmlinux" file available somewhere?

Sure, both vmlinux w/ its separated .debuginfo sections vmlinux.debug
are at:
http://labs.suse.cz/jslaby/bug-968218/

There is also core.s which is a result of:
objdump -d vmlinux-4.4.2-3-default | grep -A 10000 '<update_rq_clock>:' >core.s

> In my own private configuration, ttwu_stat() doesn't actually touch
> the stack at all - no stack pointer action anywhere except for the
>
> ttwu_stat:
> 1:      call    __fentry__
>         pushq   %rbp
>         ..
>         movq    %rsp, %rbp      #,
>
>         .....
>
>         popq    %rbp
>         ret
>
> but yeah, as Peter says, maybe an exception screwed up %rsp somehow..

Lucky you. My ttwu_stat does a bit more stack save-restoring.
But all the pushes and pops seem to be paired:
ffffffff8109f3c0 <ttwu_stat>:
ffffffff8109f3c0:       e8 fb ca 60 00          callq  ffffffff816abec0 <__fentry__>
ffffffff8109f3c5:       55                      push   %rbp
ffffffff8109f3c6:       48 89 e5                mov    %rsp,%rbp
ffffffff8109f3c9:       41 57                   push   %r15
ffffffff8109f3cb:       41 56                   push   %r14
ffffffff8109f3cd:       41 55                   push   %r13
ffffffff8109f3cf:       41 54                   push   %r12
ffffffff8109f3d1:       49 c7 c6 80 6e 01 00    mov    $0x16e80,%r14
ffffffff8109f3d8:       53                      push   %rbx
...
ffffffff8109f48c:       5b                      pop    %rbx
ffffffff8109f48d:       41 5c                   pop    %r12
ffffffff8109f48f:       41 5d                   pop    %r13
ffffffff8109f491:       41 5e                   pop    %r14
ffffffff8109f493:       41 5f                   pop    %r15
ffffffff8109f495:       5d                      pop    %rbp
ffffffff8109f496:       c3                      retq

> I really don't see how it would happen here - that code doesn't look
> particularly odd.
>
> And the fentry code used by the function tracer can certainly screw
> things up, but even that would be hard-pressed to screw up %rbp, since
> the saving of rbp comes *after* fentry. Old pre-__fentry__ gcc
> versions had a much higher likelihood (the whole mcount thing is a
> disaster, but I'm assuming you have a compiler that does __fentry__
> and have CC_USING_FENTRY set?)

Yep, -mfentry is in use, as is obvious from the dump above. It is
compiled by gcc 5.3.1 rev231346.

thanks,
--
js
suse labs
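
For anyone who wants to repeat the "all seem to be paired" check mechanically, here is a minimal sketch in Python. It only looks at push/pop instructions in objdump-style text and checks that registers are popped in the reverse order they were pushed. The function name check_pairing and the LISTING string are made up for illustration; LISTING holds just the prologue and epilogue quoted above, so the elided middle of ttwu_stat (and any direct %rsp arithmetic there) is not covered.

```python
import re

# Prologue and epilogue of ttwu_stat as quoted in this mail. The elided
# middle of the function is NOT included, so this checks only what was shown.
LISTING = """\
ffffffff8109f3c5: 55 push %rbp
ffffffff8109f3c6: 48 89 e5 mov %rsp,%rbp
ffffffff8109f3c9: 41 57 push %r15
ffffffff8109f3cb: 41 56 push %r14
ffffffff8109f3cd: 41 55 push %r13
ffffffff8109f3cf: 41 54 push %r12
ffffffff8109f3d1: 49 c7 c6 80 6e 01 00 mov $0x16e80,%r14
ffffffff8109f3d8: 53 push %rbx
ffffffff8109f48c: 5b pop %rbx
ffffffff8109f48d: 41 5c pop %r12
ffffffff8109f48f: 41 5d pop %r13
ffffffff8109f491: 41 5e pop %r14
ffffffff8109f493: 41 5f pop %r15
ffffffff8109f495: 5d pop %rbp
ffffffff8109f496: c3 retq
"""

# Matches "push %reg" / "pop %reg" (with an optional AT&T "q" suffix).
PUSH_POP = re.compile(r"\b(push|pop)q?\s+(%\w+)")


def check_pairing(listing):
    """Return True if every pop restores the most recently pushed register
    and nothing is left on the simulated stack at the end."""
    saved = []
    for line in listing.splitlines():
        match = PUSH_POP.search(line)
        if match is None:
            continue  # not a push/pop line (mov, call, ret, ...)
        op, reg = match.groups()
        if op == "push":
            saved.append(reg)
        elif not saved or saved.pop() != reg:
            return False  # pop with nothing saved, or wrong register
    return not saved


if __name__ == "__main__":
    print("paired" if check_pairing(LISTING) else "NOT paired")
```

A passing result only says the quoted save/restore sequence is symmetric. It says nothing about whether %rsp itself was changed by something else (an exception, for instance) between the prologue and the epilogue.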