Re: [PATCH RFC bpf-next 4/3] uprobe: ensure sys_uretprobe uses sysret

On Tue, Mar 19, 2024 at 12:32 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Tue, Mar 19, 2024 at 12:08:35PM +0100, Jiri Olsa wrote:
> > On Tue, Mar 19, 2024 at 11:25:24AM +0100, Oleg Nesterov wrote:
> > > Obviously not for inclusion yet ;) untested, lacks the comments, and I am not
> > > sure it makes sense.
> > >
> > > But I am wondering if this change can speedup uretprobes a bit more. Any chance
> > > you can test it?
> > >
> > > With 1/3 sys_uretprobe() changes regs->r11/cx, this is correct but implies iret.
> > > See the /* SYSRET requires RCX == RIP and R11 == EFLAGS */ code in do_syscall_64().
> >
> > nice idea, looks like sysret should be faster
> >
> > >
> > > With this patch uretprobe_syscall_entry restores rcx/r11 itself and does retq,
> > > sys_uretprobe() needs to hijack regs->ip after uprobe_handle_trampoline() to
> > > make it possible.
> > >
> > > Comments?
> > >
> > > Oleg.
> > > ---
> > >
> > > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > > index 069371e86180..b99f1d80a8c8 100644
> > > --- a/arch/x86/kernel/uprobes.c
> > > +++ b/arch/x86/kernel/uprobes.c
> > > @@ -319,6 +319,9 @@ asm (
> > >     "pushq %r11\n"
> > >     "movq $462, %rax\n"
> > >     "syscall\n"
> > > +   "popq %r11\n"
> > > +   "popq %rcx\n"
> > > +   "retq\n"
> >
> > using rax space on stack for return pointer, cool :)
> >
> > I'll run the test with this change
>
> I got bigger speed up on intel, amd stays the same (I'll double check that)
>
> current:
>   base           :   16.133 ± 0.035M/s
>   uprobe-nop     :    3.003 ± 0.002M/s
>   uprobe-push    :    2.829 ± 0.001M/s
>   uprobe-ret     :    1.101 ± 0.001M/s
>   uretprobe-nop  :    1.485 ± 0.001M/s
>   uretprobe-push :    1.447 ± 0.000M/s
>   uretprobe-ret  :    0.812 ± 0.000M/s
>
> fix:
>   base           :   16.522 ± 0.103M/s
>   uprobe-nop     :    2.920 ± 0.034M/s
>   uprobe-push    :    2.749 ± 0.047M/s
>   uprobe-ret     :    1.094 ± 0.003M/s
>   uretprobe-nop  :    2.004 ± 0.006M/s  < ~34% speed up
>   uretprobe-push :    1.940 ± 0.003M/s  < ~34% speed up
>   uretprobe-ret  :    0.921 ± 0.050M/s  < ~13% speed up
>
> original fix:
>   base           :   15.704 ± 0.076M/s
>   uprobe-nop     :    2.841 ± 0.008M/s
>   uprobe-push    :    2.666 ± 0.029M/s
>   uprobe-ret     :    1.037 ± 0.008M/s
>   uretprobe-nop  :    1.718 ± 0.010M/s  < ~25% speed up
>   uretprobe-push :    1.658 ± 0.008M/s  < ~23% speed up
>   uretprobe-ret  :    0.853 ± 0.004M/s  < ~14% speed up
>

My machine is slower; even though I turned off mitigations and the
like, I suspect there are still some slowdowns. Either way, the data
is at least consistent as far as the baseline goes (the "base"
benchmark is called syscall-count now in my local changes, which I
have yet to submit), and yes, Oleg's change does produce a noticeable
speed-up:

baseline
========
usermode-count :   79.509 ± 0.038M/s
syscall-count  :    9.550 ± 0.002M/s
uprobe-nop     :    1.530 ± 0.000M/s
uprobe-push    :    1.457 ± 0.000M/s
uprobe-ret     :    0.642 ± 0.000M/s
uretprobe-nop  :    0.777 ± 0.000M/s
uretprobe-push :    0.761 ± 0.000M/s
uretprobe-ret  :    0.459 ± 0.000M/s

Jiri
====
usermode-count :   79.515 ± 0.014M/s
syscall-count  :    9.439 ± 0.006M/s
uprobe-nop     :    1.520 ± 0.001M/s
uprobe-push    :    1.464 ± 0.000M/s
uprobe-ret     :    0.640 ± 0.000M/s
uretprobe-nop  :    0.893 ± 0.000M/s (+15%)
uretprobe-push :    0.867 ± 0.000M/s (+14%)
uretprobe-ret  :    0.498 ± 0.000M/s (+8.5%)

Oleg+Jiri
=========
usermode-count :   79.471 ± 0.078M/s
syscall-count  :    9.434 ± 0.007M/s
uprobe-nop     :    1.516 ± 0.003M/s
uprobe-push    :    1.463 ± 0.000M/s
uprobe-ret     :    0.640 ± 0.001M/s
uretprobe-nop  :    1.020 ± 0.001M/s (+31%)
uretprobe-push :    0.998 ± 0.001M/s (+31%)
uretprobe-ret  :    0.537 ± 0.000M/s (+17%)

So the speed-up is about 2x that of Jiri's changes alone, which is a
very nice boost! I only tested on an Intel CPU.


>
> jirka




