On Thu, Sep 28, 2017 at 08:03:26AM +0200, Juergen Gross wrote: > On 27/09/17 23:08, Josh Poimboeuf wrote: > > On Tue, Aug 08, 2017 at 01:09:08PM -0700, Andy Lutomirski wrote: > >> On Tue, Aug 8, 2017 at 12:13 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote: > >>> On Tue, Aug 08, 2017 at 12:03:51PM -0700, Linus Torvalds wrote: > >>>> On Tue, Aug 8, 2017 at 11:58 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote: > >>>>> > >>>>> Take for example the lock_is_held_type() function. In vmlinux, it has > >>>>> the following instruction: > >>>>> > >>>>> callq *0xffffffff85a94880 (pv_irq_ops.save_fl) > >>>>> > >>>>> At runtime, that instruction is patched and replaced with a fast inline > >>>>> version of arch_local_save_flags() which eliminates the call: > >>>>> > >>>>> pushfq > >>>>> pop %rax > >>>>> > >>>>> The problem is when an interrupt hits after the push: > >>>>> > >>>>> pushfq > >>>>> --- irq --- > >>>>> pop %rax > >>>> > >>>> That should actually be something easily fixable, for an odd reason: > >>>> the instruction boundaries are different. > >>>> > >>>>> I'm not sure what the solution should be. It will probably need to be > >>>>> one of the following: > >>>>> > >>>>> a) either don't allow runtime "alternative" patches to mess with the > >>>>> stack pointer (objtool could enforce this); or > >>>>> > >>>>> b) come up with some way to register such patches with the ORC > >>>>> unwinder at runtime. > >>>> > >>>> c) just add ORC data for the alternative statically and _unconditionally_. > >>>> > >>>> No runtime registration. Just an unconditional entry for the > >>>> particular IP that comes after the "pushfq". It cannot match the > >>>> "callq" instruction, since it would be in the middle of that > >>>> instruction. > >>>> > >>>> Basically, just do a "union" of the ORC data for all the alternatives. > >>>> > >>>> Now, objtool should still verify that the instruction pointers for > >>>> alternatives are unique - or that they share the same ORC unwinder > >>>> information if they are not. > >>>> > >>>> But in cases like this, when the instruction boundaires are different, > >>>> things should "just work", with no need for any special cases. > >>>> > >>>> Hmm? > >>> > >>> Yeah, that might work. Objtool already knows about alternatives, so it > >>> might not be too hard. I'll try it. > >> > >> But this one's not an actual alternative, right? It's a pv op. > >> > >> I would advocate that we make it an alternative after all. I frickin' > >> hate the PV irq ops. It would like roughly like this: > >> > >> ALTERNATIVE "pushfq; popq %rax", "callq *pv_irq_ops.save_fl", > >> X86_FEATURE_GODDAMN_PV_IRQ_OPS > >> > >> (The obvious syntax error and the naming should probably be fixed. > >> Also, this needs to live in an #ifdef because it needs to build on > >> kernels with pv support. It should also properly register itself as a > >> pv patch site.) > > > > I've got a prototype of the above working, where vmlinux shows: > > > > pushfq > > pop %rax > > nop > > nop > > nop > > nop > > nop > > > > instead of: > > > > callq *0xffffffff81e3a400 (pv_irq_ops.save_fl) > > > > Which is nice because the vmlinux disassembly now matches the most > > common runtime cases (everything except Xen and vsmp). And it also > > fixes the upthread objtool issue. > > > > The only slight issue with the patches is that hypervisors need access > > to the pv ops much earlier than when alternatives are applied. So I had > > to add a new .pv_alternatives section for these pv ops alternatives, so > > they can be patched very early, if running in a hypervisor. > > > > Will clean up the code and post something relatively soon. > > > > Are you combining alternatives and pvops then? I'm asking because in an > up and running system under Xen the callq *... will be replaced with a > much faster "call xen_save_fl". This should still be the case after > your patch. Right, it's not combining alternatives and pv ops, it's just adding another step. So first the pushfq; pop %rax is replaced with callq *pv_irq_ops.save_fl and then later, after the xen ops structs are finalized, it's replaced with callq xen_save_fl -- Josh -- To unsubscribe from this list: send the line "unsubscribe live-patching" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html