On Tue, 9 Mar 2021 13:34:42 -0800
Daniel Xu <dxu@xxxxxxxxx> wrote:

> Hi Masami,
>
> Just want to clarify a few points:
>
> On Mon, Mar 08, 2021 at 11:52:10AM +0900, Masami Hiramatsu wrote:
> > On Sun, 7 Mar 2021 13:23:33 -0800
> > Daniel Xu <dxu@xxxxxxxxx> wrote:
> > To help your understanding, let me explain.
> >
> > If we have a code here
> >
> > caller_func:
> >   0x00 add sp, 0x20 /* 0x20 bytes stack frame allocated */
> >   ...
> >   0x10 call target_func
> >   0x15 ... /* return address */
> >
> > On the stack in the entry of target_func, we have
> >
> > [stack]
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And when we put a kretprobe on the target_func, the stack will be
> >
> > [stack]
> > 0x0e0 kretprobe_trampoline
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > * "caller_func+0x15" is saved in current->kretprobe_instances.first.
> >
> > When returning from the target_func, call consumed the 0x0e0 and
> > jump to kretprobe_trampoline. Let's see the assembler code.
> >
> >         ".text\n"
> >         ".global kretprobe_trampoline\n"
> >         ".type kretprobe_trampoline, @function\n"
> >         "kretprobe_trampoline:\n"
> >         /* We don't bother saving the ss register */
> >         "       pushq %rsp\n"
> >         "       pushfq\n"
> >         SAVE_REGS_STRING
> >         "       movq %rsp, %rdi\n"
> >         "       call trampoline_handler\n"
> >         /* Replace saved sp with true return address. */
> >         "       movq %rax, 19*8(%rsp)\n"
> >         RESTORE_REGS_STRING
> >         "       popfq\n"
> >         "       ret\n"
> >
> > When the entry of trampoline_handler, stack is like this;
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 rsp (=0x0e0)
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
> >
> > And after returned from trampoline_handler, "movq" changes the
> > stack like this.
> >
> > [stack]
> > 0x040 kretprobe_trampoline+0x25
> > 0x048 r15
> > ... /* pt_regs */
> > 0x0d8 flags
> > 0x0e0 caller_func+0x15
> > ... /* 0x20 bytes = 4 entries are stack frame of caller_func */
> > 0x100 /* caller_func return address */
>
> Thanks for the detailed explanation. I think I understand kretprobe
> mechanics from a somewhat high level (kprobe saves real return address
> on entry, overwrites return address to trampoline, then trampoline
> runs handler and finally resets return address to real return address).
>
> I don't usually write much assembly so the details escape me somewhat.
>
> > So at the kretprobe handler, we have 2 issues.
> > 1) the return address (caller_func+0x15) is not on the stack.
> >    this can be solved by searching from current->kretprobe_instances.
>
> Yes, agreed.
>
> > 2) the stack frame size of kretprobe_trampoline is unknown
> >    Since the stackframe is fixed, the fixed number (0x98) can be used.
>
> I'm confused why this is relevant. Is it so ORC knows where to find
> saved return address in the frame?

No, because the kretprobe_trampoline is somewhat special. Usually, at the
function entry, there is a return address on the top of stack, but
kretprobe_trampoline doesn't have it. So we have to put a hint at the
function entry to mark there should be a next return address.
(and ORC unwinder must find it)

> > However, those solutions are only for the kretprobe handler. The stacktrace
> > from interrupt handler hit in the kretprobe_trampoline still doesn't work.
> >
> > So, here is my idea;
> >
> > 1) Change the trampoline code to prepare stack frame at first and save
> >    registers on it, instead of "push". This will make ORC easy to setup
> >    stackframe information for this code.
>
> I'm confused on the details here. But this is what Josh solves in his
> patch, right?

I'm not so sure how objtool makes the ORC information. If it can trace the
push/pop correctly, yes, it is solved.

> > 2) change the return address fixup timing.
> > Instead of using return value of trampoline handler, before removing
> > the real return address from current->kretprobe_instances.
>
> Is the idea to have `kretprobe_trampoline` place the real return address
> on the stack (while telling ORC where to find it) _before_ running `call
> trampoline_handler` ? So that an unwind from inside the user defined
> kretprobe handler simply unwinds correctly?

No, unless calling the trampoline_handler, we can not get the real return
address. Thus, the __kretprobe_trampoline_handler() will call return address
fixup function right before unlink the current->kretprobe_instances.

> And to be extra clear, this would only work for stack_trace_save() and
> not stack_trace_save_regs()?

Yes, for the stack_trace_save_regs() and the stack-tracing inside the
kretprobe'd target function, we still need a hack as same as
orc_ftrace_find().

>
> > 3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, it
> > checks the contents of the end of stackframe (at the place of regs->sp)
> > is same as the address of it. If it is, it can find the correct address
> > from current->kretprobe_instances. If not, there is a correct address.
>
> What do you mean by "it" w.r.t. "is the same address of it"? I'm
> confused on this point.

Oh I meant,

3) Then, if orc_find() finds the ip is in the kretprobe_trampoline, orc_find()
checks the contents of the end of stackframe (at the place of regs->sp)
is same as the address of the stackframe (Note that kretprobe_trampoline does
"push %sp" at first). If so, orc_find() can find the correct address
from current->kretprobe_instances. If not, there is a correct address.

I need to see the orc unwinder carefully, orc_find() only gets the ip but
to find stackframe, I think this should be fixed in the caller of orc_find().

Thank you,

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>
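
The check described in 3) above can be sketched in user-space C. This is
only an illustrative model under stated assumptions, not kernel code:
fake_stack_read(), resolve_return_address(), struct fake_task and the FAKE_*
constants are hypothetical names, the stack is a plain array addressed with
the example addresses from this thread (0x040 to 0x100), and saved_ret_addr
merely stands in for the address kept in current->kretprobe_instances.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these exist in the kernel. */
#define FAKE_STACK_BASE   0x040u  /* fake address of slot 0 */
#define FAKE_SLOT_SIZE    8u      /* one 64-bit stack entry */
#define FAKE_STACK_SLOTS  25u     /* covers 0x040 .. 0x100 inclusive */
#define FAKE_SP_SLOT_ADDR 0x0e0u  /* slot written by the initial "pushq %rsp" */

/* Models the task state: saved_ret_addr plays the role of the real return
 * address that the thread says is kept in current->kretprobe_instances. */
struct fake_task {
    uint64_t saved_ret_addr;
};

/* Read the 8-byte slot located at a fake stack address. */
static uint64_t fake_stack_read(const uint64_t *stack, size_t nslots,
                                uint64_t addr)
{
    assert(addr >= FAKE_STACK_BASE);
    assert((addr - FAKE_STACK_BASE) % FAKE_SLOT_SIZE == 0);
    size_t idx = (size_t)((addr - FAKE_STACK_BASE) / FAKE_SLOT_SIZE);
    assert(idx < nslots);
    return stack[idx];
}

/*
 * If the slot still contains its own address, the trampoline has not yet
 * replaced it, so fall back to the address saved in the task. Otherwise the
 * slot already holds the real return address.
 */
static uint64_t resolve_return_address(const uint64_t *stack, size_t nslots,
                                       uint64_t sp_slot_addr,
                                       const struct fake_task *task)
{
    uint64_t val = fake_stack_read(stack, nslots, sp_slot_addr);

    if (val == sp_slot_addr)
        return task->saved_ret_addr;
    return val;
}
```

With the slot at 0x0e0 still holding 0x0e0, the sketch returns the saved
address; once the slot has been overwritten by the "movq" fixup, it returns
the slot contents directly. How the real unwinder would obtain the stack
pointer for this comparison is the open question raised above, since
orc_find() only receives the ip.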