+Josh and PeterZ

On Mon, Sep 27, 2021, Dmitry Vyukov wrote:
> On Wed, 22 Sept 2021 at 01:34, 'Sean Christopherson' via
> syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Sep 17, 2021, Dmitry Vyukov wrote:
> > > On Fri, 17 Sept 2021 at 13:04, Marco Elver <elver@xxxxxxxxxx> wrote:
> > > > > So it looks like in both cases the top fault frame is just wrong. But
> > > > > I would assume it's extracted by arch-dependent code, so it's
> > > > > suspicious that it affects both x86 and arm64...
> > > > >
> > > > > Any ideas what's happening?
> > > >
> > > > My suspicion for the x86 case is that kvm_fastop_exception is related
> > > > to instruction emulation and the fault occurs in an emulated
> > > > instruction?
> > >
> > > Why would the kernel emulate a plain MOV?
> > > 2a: 4c 8b 21 mov (%rcx),%r12
> > >
> > > And it would also mean a broken unwind because the emulated
> > > instruction is in __d_lookup, so it should be in the stack trace.
> >
> > kvm_fastop_exception is a red herring. It's indeed related to emulation, and
> > while MOV emulation is common in KVM, that emulation is for KVM guests not for
> > the host kernel where this splat occurs (ignoring the fact that the "host" is
> > itself a guest).
> >
> > kvm_fastop_exception is out-of-line fixup, and certainly shouldn't be reachable
> > via d_lookup. It's also two instruction, XOR+RET, neither of which are in the
> > code stream.
> >
> > IIRC, the unwinder gets confused when given an IP that's in out-of-line code,
> > e.g. exception fixup like this. If you really want to find out what code blew
> > up, you might be able to objdump -D the kernel and search for unique, matching
> > disassembly, e.g. find "jmpq 0xf86d288c" and go from there.
>
> Hi Sean,
>
> Thanks for the info.
>
> I don't want to find out what code blew (it's __d_lookup).
> I am interested in getting the unwinder fixed to output truthful and
> useful frames.

I was asking about the exact location to confirm that the explosion is indeed
from exception fixup, which is the "unwinder gets confused" scenario I was
thinking of. Based on the disassembly from syzbot, that does indeed appear to
be the case here, i.e. this

  2a: 4c 8b 21 mov (%rcx),%r12

is from exception fixup from somewhere in __d_lookup (can't tell exactly what
it's from, maybe KASAN?).

> Is there more info on this "the unwinder gets confused"? Bug filed
> somewhere or an email thread? Is it on anybody's radar?

I don't know if there's a bug report or if this is on anyone's radar. The
issue I've encountered in the past, and what I'm pretty sure is being hit here,
is that the ORC unwinder doesn't play nice with out-of-line fixup code,
presumably because there are no tables for the fixup. I believe
kvm_fastop_exception() gets blamed because it's the first label that's found
when searching back through the tables.
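
To illustrate the "first label found when searching back" effect, here is a toy model of nearest-preceding-symbol lookup. It is only a sketch of the general idea, not the ORC unwinder or kallsyms code. The addresses, the table layout, and the resolve() helper are all made up for illustration; only the symbol names come from this thread.

```python
import bisect

# Hypothetical, sorted (start address, label) table. The addresses are
# invented; real kernel layouts differ.
SYMBOLS = [
    (0x1000, "__d_lookup"),
    (0x1400, "d_lookup"),
    (0x9000, "kvm_fastop_exception"),
]

def resolve(addr):
    """Return (label, offset) for the closest label at or below addr."""
    starts = [start for start, _ in SYMBOLS]
    # Index of the last table entry whose start address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # addr lies below every known label
    start, name = SYMBOLS[i]
    return name, addr - start

# An IP inside the function body resolves to the function itself.
print(resolve(0x1020))

# An IP in unlabeled out-of-line code (assumed here to sit after
# kvm_fastop_exception) is attributed to whatever label precedes it.
print(resolve(0x9ABC))
```

In this model, any address that lands in code without its own entry is reported under the preceding label, which would match a fault in __d_lookup's fixup showing up as kvm_fastop_exception.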