On Fri, Jul 28, 2017 at 10:54:37PM -0500, Josh Poimboeuf wrote: > On Fri, Jul 28, 2017 at 07:59:12PM +0000, Levin, Alexander (Sasha Levin) wrote: > > On Fri, Jul 28, 2017 at 01:57:20PM -0500, Josh Poimboeuf wrote: > > >Thanks, that's much better. I'm relieved the unwinder didn't screw that > > >up, at least. > > > > > >This looks like a tricky one. Is it easily recreatable? > > > > Yeah, I just hit it again with slightly different initial calls: > > Sasha sent me some data privately. As I suspected, the cause is some > bad ORC data. Objtool incorrectly assumes that once the frame pointer > is set up, it no longer gets touched. So as it turns out, my pre-vacation self was completely wrong. The problem is actually some runtime instruction patching which affects the accuracy of the ORC data. Take for example the lock_is_held_type() function. In vmlinux, it has the following instruction: callq *0xffffffff85a94880 (pv_irq_ops.save_fl) At runtime, that instruction is patched and replaced with a fast inline version of arch_local_save_flags() which eliminates the call: pushfq pop %rax The problem is when an interrupt hits after the push: pushfq --- irq --- pop %rax The push offsets the stack pointer by 8 bytes, confusing the ORC unwinder when it tries to unwind from the IRQ. The race should be somewhat rare, though Sasha has no problems hitting it with syzkaller. I'm not sure what the solution should be. It will probably need to be one of the following: a) either don't allow runtime "alternative" patches to mess with the stack pointer (objtool could enforce this); or b) come up with some way to register such patches with the ORC unwinder at runtime. I haven't looked much at either option to see how feasible they would be. Any thoughts/opinions? -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html