Hi Steven,

many thanks for having a look!

Steven Rostedt <rostedt@xxxxxxxxxxx> writes:

> I just found this in my Inbox, and this looks to be a legit issue.
>
> On Thu, 26 Jul 2018 12:40:29 +0200
> Nicolai Stange <nstange@xxxxxxx> wrote:
>
> You still working on this?

Yes, this still needs to get fixed somehow, preferably at the ftrace
layer.

>> With dynamic ftrace, ftrace patches call sites in three steps:
>> 1. Put a breakpoint at the to be patched location,
>> 2. update the call site and
>> 3. finally remove the breakpoint again.
>>
>> Note that the breakpoint ftrace_int3_handler() simply makes execution
>> skip over the to be patched instruction.
>>
>> This patching happens in the following circumstances:
>> - the global ftrace_trace_function changes and the call sites at
>>   ftrace_call and ftrace_regs_call get patched,
>> - an ftrace_ops' ->func changes and the call site in its trampoline
>>   gets patched,
>> - graph tracing gets enabled/disabled and the jump site at
>>   ftrace_graph_call gets patched,
>> - a mcount site gets converted from nop -> call, call -> nop or
>>   call -> call.
>>
>> The latter case, i.e. a mcount call getting redirected, happens for
>> example in a transition from trampolined to not trampolined upon a
>> user enabling function tracing on a live patched function.
>>
>> The ftrace_int3_handler() simply skipping over the mcount callsite is
>> problematic here, because it means that a live patched function gets
>> temporarily reverted to its unpatched original and this breaks live
>> patching consistency.
>>
>> Make ftrace_int3_handler() not skip over the fops invocation, but
>> modify the interrupted control flow to issue a call as needed.
>>
>> For determining what the proper action actually is, modify
>> update_ftrace_func() to take an extra argument, func, to be called if
>> the corresponding breakpoint gets hit. Introduce a new global
>> variable, ftrace_update_func_dest, and let update_ftrace_func() set
>> it. For the special case of patching the jmp insn at
>> ftrace_graph_call, set it to zero and make ftrace_int3_handler() just
>> skip over the insn in this case as before.
>>
>> Because there's no space left above the int3 handler's iret frame to
>> stash an extra call's return address in, this can't be mimicked from
>> ftrace_int3_handler() itself.
>>
>> Instead, make ftrace_int3_handler() change the iret frame's %rip slot
>> to point to the new ftrace_int3_call_trampoline to be executed
>> immediately after iret.
>>
>> The original %rip gets communicated to ftrace_int3_call_trampoline,
>> which can then take proper action. Note that
>> ftrace_int3_call_trampoline can nest, because of NMIs, for example.
>> In order to avoid introducing another stack, abuse %r11 for passing
>> the original %rip. This is safe, because the interrupted context is
>> always at a call insn and the x86_64 ELF ABI specifies %r11 as
>> callee-clobbered. According to the glibc sources, this is also true
>> for the somewhat special mcount/fentry ABI.
>>
>> OTOH, a spare register doesn't exist on i686 and thus, this approach
>> is limited to x86_64.
>>
>> For determining the call target address, let
>> ftrace_int3_call_trampoline compare ftrace_update_func against the
>> interrupted %rip.
>
> I don't think we need to do the compare.
>
>> If they match, an update_ftrace_func() is currently pending and the
>> call site is either of ftrace_call, ftrace_regs_call or the call insn
>> within a fops' trampoline. Jump to the ftrace_update_func_dest
>> location as set by update_ftrace_func().
>>
>> If OTOH the interrupted %rip doesn't equal ftrace_update_func, then
>> it points to an mcount call site. In this case, redirect control flow
>> to the most generic handler, ftrace_regs_caller, which will then do
>> the right thing.
>>
>> Finally, reading ftrace_update_func and ftrace_update_func_dest from
>> outside of the int3 handler requires some care: in between adding and
>> removing the breakpoints, ftrace invokes run_sync() which basically
>> issues a couple of IPIs. Because the int3 handler has interrupts
>> disabled, it orders with run_sync().
>>
>> To extend that ordering to also include ftrace_int3_call_trampoline,
>> make ftrace_int3_handler() disable interrupts within the iret frame
>> returning to it.
>>
>> Store the original EFLAGS.IF into %r11's MSB, which, because %r11
>> holds a kernel address, is redundant. Make ftrace_int3_call_trampoline
>> restore it when done.
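To illustrate what's described above: the trampoline would do roughly
the following. This is a simplified sketch only, not the RFC patch's
actual code -- in particular, the exact EFLAGS.IF encoding and the
instruction sequence may differ; MCOUNT_INSN_SIZE is the 5 byte length
of the patched call insn:

	ENTRY(ftrace_int3_call_trampoline)
		/* IRQs off, %r11 = orig %rip, EFLAGS.IF in bit 63 */
		pushq	%r11		/* placeholder for the simulated
					   call's return address */
		pushq	%rax		/* scratch, restored below */
		btsq	$63, %r11	/* CF = saved EFLAGS.IF,
					   restore the address' MSB */
		setc	%al		/* keep EFLAGS.IF in %al */
		/* pending update_ftrace_func() at this %rip? */
		cmpq	%r11, ftrace_update_func(%rip)
		leaq	MCOUNT_INSN_SIZE(%r11), %r11
		movq	%r11, 8(%rsp)	/* simulated call's return addr */
		movq	$ftrace_regs_caller, %r11
		jne	1f		/* flags from the cmpq above */
		movq	ftrace_update_func_dest(%rip), %r11
	1:
		/* both globals have been read, IRQs may come back on */
		testb	%al, %al
		popq	%rax		/* doesn't modify flags */
		jz	2f
		sti
	2:
		jmp	*%r11		/* "issue" the call */
	END(ftrace_int3_call_trampoline)

The point being that all state is carried in %r11 resp. the current
stack, so nesting because of NMIs is fine.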
> This can be made much simpler by making a hardcoded ftrace_int3_tramp
> that does the following:
>
>  ftrace_int3_tramp
>	push %r11
>	jmp ftrace_caller

But wouldn't this mean that ftrace_caller could nest if the breakpoint
in question happened to be placed at ftrace_call? Infinite recursion
set aside, the ip value determined from inner calls based on the
on-stack return address would be wrong, AFAICS. Or am I missing
something here?

> The ftrace_caller will either call a single ftrace callback, if
> there's only a single ops registered, or it will call the loop
> function that iterates over all the ftrace_ops registered and will
> call each function that matches the ftrace_ops hashes.
>
> In either case, it's what we want.

Ok, under the assumption that my above point is valid, this patch could
still get simplified a lot by having two trampolines:

1.) Your ftrace_int3_tramp from above, to be used if the patched insn
    is some mcount call site. The live patching fops will need
    ftrace_regs_caller though. So,

      ftrace_int3_tramp_regs_caller:
	push %r11
	jmp ftrace_regs_caller

2.) Another one redirecting control flow to ftrace_ops_list_func().
    It's going to be used when the int3 is found at ftrace_call or
    ftrace_regs_call resp. their counterparts in some ftrace_ops'
    trampoline.

      ftrace_int3_tramp_list_func:
	push %r11
	jmp ftrace_ops_list_func

ftrace_int3_handler() would then distinguish the following cases:

1.) ip == ftrace_graph_call => ignore, i.e. skip the insn
2.) is_ftrace_caller(ip) => ftrace_int3_tramp_list_func
3.) ftrace_location(ip) => ftrace_int3_tramp_regs_caller

ftrace_location() is getting invoked from ftrace_int3_handler()
already, so there wouldn't be any additional cost.
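For reference, the resulting handler would look something like the
sketch below, based on the current ftrace_int3_handler() in
arch/x86/kernel/ftrace.c. It's an illustration only, not a patch; in
particular, the value passed in %r11 -- here the return address the
patched-out call would have pushed -- is an assumption on my side and
the trampolines above would have to match it:

	extern void ftrace_graph_call(void);
	extern void ftrace_int3_tramp_list_func(void);
	extern void ftrace_int3_tramp_regs_caller(void);

	int ftrace_int3_handler(struct pt_regs *regs)
	{
		unsigned long ip;

		if (WARN_ON_ONCE(!regs))
			return 0;

		ip = regs->ip - 1;	/* the int3 insn is one byte */

		/* 1.) patching the jmp at ftrace_graph_call: skip it */
		if (ip == (unsigned long)&ftrace_graph_call) {
			regs->ip = ip + MCOUNT_INSN_SIZE;
			return 1;
		}

		/* 2.) ftrace_call, ftrace_regs_call or a fops'
		 *     trampoline: divert to ftrace_ops_list_func() */
		if (is_ftrace_caller(ip)) {
			regs->r11 = ip + MCOUNT_INSN_SIZE;
			regs->ip =
			    (unsigned long)&ftrace_int3_tramp_list_func;
			return 1;
		}

		/* 3.) a mcount site: take the most generic path */
		if (ftrace_location(ip)) {
			regs->r11 = ip + MCOUNT_INSN_SIZE;
			regs->ip =
			    (unsigned long)&ftrace_int3_tramp_regs_caller;
			return 1;
		}

		return 0;	/* not an ftrace breakpoint */
	}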
If that makes sense to you, I'd prepare a patch...

> The ftrace_int3_tramp will simply simulate the call ftrace_caller
> that would be there as the default caller if more than one function
> is set to it.
>
> For 32 bit, we could add 4 variables on the thread_info and make 4
> trampolines, one for each context (normal, softirq, irq and NMI), and
> have them use the variable stored in the thread_info for that
> location:
>
>  ftrace_int3_tramp_normal
>	push %eax # just for space
>	push %eax
>	mov whatever_to_get_thread_info, %eax
>	mov normal(%eax), %eax
>	mov %eax, 4(%esp)
>	pop %eax
>	jmp ftrace_caller
>
>  ftrace_int3_tramp_softirq
>	push %eax # just for space
>	push %eax
>	mov whatever_to_get_thread_info, %eax
>	mov softirq(%eax), %eax
>	mov %eax, 4(%esp)
>	pop %eax
>	jmp ftrace_caller
>
> etc..
>
> A bit crazier for 32 bit but it can still be done. ;-)

Ok, but currently CONFIG_HAVE_LIVEPATCH=n for x86 && !x86_64, so I'd
rather not invest too much energy into screwing 32 bit up ;)

Thanks!

Nicolai

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)