On Tue, Mar 19, 2024 at 4:08 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > > On Tue, Mar 19, 2024 at 11:25:24AM +0100, Oleg Nesterov wrote: > > Obviously not for inclusion yet ;) untested, lacks the comments, and I am not > > sure it makes sense. > > > > But I am wondering if this change can speedup uretprobes a bit more. Any chance > > you can test it? > > > > With 1/3 sys_uretprobe() changes regs->r11/cx, this is correct but implies iret. > > See the /* SYSRET requires RCX == RIP and R11 == EFLAGS */ code in do_syscall_64(). > > nice idea, looks like sysexit should be faster > > > > > With this patch uretprobe_syscall_entry restores rcx/r11 itself and does retq, > > sys_uretprobe() needs to hijack regs->ip after uprobe_handle_trampoline() to > > make it possible. > > > > Comments? > > > > Oleg. > > --- > > > > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c > > index 069371e86180..b99f1d80a8c8 100644 > > --- a/arch/x86/kernel/uprobes.c > > +++ b/arch/x86/kernel/uprobes.c > > @@ -319,6 +319,9 @@ asm ( > > "pushq %r11\n" > > "movq $462, %rax\n" > > "syscall\n" > > + "popq %r11\n" > > + "popq %rcx\n" > > + "retq\n" > > using rax space on stack for return pointer, cool :) > > I'll run the test with this change > I can do some benchmarking on my side as well, given I have everything set up for this anyways. Thanks for the help with speeding all this up! BTW, Jiri, what are you plans regarding sys_uprobe (entry probe optimization through syscall), while we are on the topic? > thanks, > jirka > > > ".global uretprobe_syscall_end\n" > > "uretprobe_syscall_end:\n" > > ".popsection\n" > > @@ -336,23 +339,20 @@ void *arch_uprobe_trampoline(unsigned long *psize) > > SYSCALL_DEFINE0(uretprobe) > > { > > struct pt_regs *regs = task_pt_regs(current); > > - unsigned long sregs[3], err; > > + unsigned long __user *ax_and_ret = (unsigned long __user *)regs->sp + 2; > > + unsigned long ip, err; > > > > - /* > > - * We set rax and syscall itself changes rcx and r11, so the syscall > > - * trampoline saves their original values on stack. We need to read > > - * them and set original register values and fix the rsp pointer back. > > - */ > > - err = copy_from_user((void *) &sregs, (void *) regs->sp, sizeof(sregs)); > > - WARN_ON_ONCE(err); > > - > > - regs->r11 = sregs[0]; > > - regs->cx = sregs[1]; > > - regs->ax = sregs[2]; > > + ip = regs->ip; > > regs->orig_ax = -1; > > - regs->sp += sizeof(sregs); > > + err = get_user(regs->ax, ax_and_ret); > > + WARN_ON_ONCE(err); > > > > uprobe_handle_trampoline(regs); > > + > > + err = put_user(regs->ip, ax_and_ret); > > + WARN_ON_ONCE(err); > > + regs->ip = ip; > > + > > return regs->ax; > > } > > > >