On Wed, Jun 22, 2016 at 11:26:21AM -0700, Andy Lutomirski wrote: > On Wed, Jun 22, 2016 at 11:22 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote: > > On Wed, Jun 22, 2016 at 10:59:23AM -0700, Andy Lutomirski wrote: > >> > So I got a chance to look at this some more. I'm thinking that to make > >> > this feature more consistently useful, we shouldn't only annotate > >> > pt_regs frames for calls to handlers; other calls should be annotated as > >> > well: preempt_schedule_irq, CALL_enter_from_user_mode, > >> > prepare_exit_to_usermode, SWAPGS, TRACE_IRQS_OFF, DISABLE_INTERRUPTS, > >> > etc. That way, the unwinder will always be able to find pt_regs from an > >> > interrupt/exception, even if starting from one of these other calls. > >> > > >> > But then, things get ugly. You have to either setup and tear down the > >> > frame for every possible call, or do a higher-level setup/teardown > >> > across multiple calls, which invalidates several assumptions in the > >> > entry code about the location of pt_regs on the stack. > >> > > >> > Also problematic is that several of the macros (like TRACE_IRQS_IRETQ) > >> > make assumptions about the location of pt_regs. And they're used by > >> > both syscall and interrupt code. So if we didn't create a frame pointer > >> > header for syscalls, we'd basically need two versions of the macros: one > >> > for irqs/exceptions and one for syscalls. > >> > > >> > So I think the cleanest way to handle this is to always allocate two > >> > extra registers on the stack in ALLOC_PT_GPREGS_ON_STACK. Then all > >> > entry code can assume that pt_regs is at a constant location, and all > >> > the above problems go away. Another benefit is that we'd only need two > >> > saves instead of three -- the pointer to pt_regs is no longer needed > >> > since pt_regs is always immediately after the frame header. > >> > > >> > I worked up a patch to implement this -- see below. It writes the frame > >> > pointer in all entry paths, including syscalls. This helps keep the > >> > code simple. > >> > > >> > The downside is a small performance penalty: with getppid()-in-a-loop on > >> > my laptop, the average syscall went from 52ns to 53ns, which is about a > >> > 2% slowdown. But I doubt it would be measurable in a real-world > >> > workload. > >> > > >> > It looks like about half the slowdown is due to the extra stack > >> > allocation (which presumably adds a little d-cache pressure on the stack > >> > memory) and the other half is due to the stack writes. > >> > > >> > I could remove the writes from the syscall path but it would only save > >> > about half a ns, and it would make the code less robust. Plus it's nice > >> > to have the consistency of having *all* pt_regs frames annotated. > >> > >> This is a bit messy, and I'm not really sure that the entry code > >> should be have to operate under constraints like this. Also, > >> convincing myself this works for NMI sounds unpleasant. > >> > >> Maybe we should go back to my idea of just listing the call sites in a table. > > > > So are you suggesting something like: > > > > .macro ENTRY_CALL func pt_regs_offset=0 > > call \func > > 1: .pushsection .entry_calls, "a" > > .long 1b - . > > .long \pt_regs_offset > > .popsection > > .endm > > > > and then change every call in the entry code to ENTRY_CALL? > > Yes, exactly, modulo whether the section name is good. hpa is > probably the authority on that. Well, as you probably know, I don't really like peppering ENTRY_CALL everywhere. :-/ Also I wonder how we could annotate the hypercalls, for example DISABLE_INTERRUPTS actually wraps the call in a push/pop pair. -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html