Re: [PATCH bpf-next 1/3] uprobe: Add uretprobe syscall to speed up return probe

On Wed, Mar 27, 2024 at 3:20 AM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>
> Adding uretprobe syscall instead of trap to speed up return probe.
>
> At the moment the uretprobe setup/path is:
>
>   - install entry uprobe
>
>   - when the uprobe is hit, it overwrites the probed function's return
>     address on the stack with the address of a trampoline that contains
>     a breakpoint instruction
>
>   - the breakpoint trap code runs the uretprobe consumers and jumps back
>     to the original return address
>
> This patch replaces the trampoline's breakpoint instruction with a call to
> the new uretprobe syscall. This syscall does exactly the same job as the
> trap, plus some extra work:
>
>   - syscall trampoline must save original value for rax/r11/rcx registers
>     on stack - rax is set to syscall number and r11/rcx are changed and
>     used by syscall instruction
>
>   - the syscall code reads the original values of those registers and
>     restores them in the task's pt_regs area
>
> Even with the extra register handling code, handling uretprobes through
> a syscall shows a speed improvement.
>
>   On Intel (11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz)
>
>   current:
>
>     base           :   15.888 ± 0.033M/s
>     uprobe-nop     :    3.016 ± 0.000M/s
>     uprobe-push    :    2.832 ± 0.005M/s
>     uprobe-ret     :    1.104 ± 0.000M/s
>     uretprobe-nop  :    1.487 ± 0.000M/s
>     uretprobe-push :    1.456 ± 0.000M/s
>     uretprobe-ret  :    0.816 ± 0.001M/s
>
>   with the fix:
>
>     base           :   15.116 ± 0.045M/s
>     uprobe-nop     :    3.001 ± 0.045M/s
>     uprobe-push    :    2.831 ± 0.004M/s
>     uprobe-ret     :    1.102 ± 0.001M/s
>     uretprobe-nop  :    1.969 ± 0.001M/s  < 32% speedup
>     uretprobe-push :    1.905 ± 0.004M/s  < 30% speedup
>     uretprobe-ret  :    0.933 ± 0.002M/s  < 14% speedup
>
>   On AMD (AMD Ryzen 7 5700U)
>
>   current:
>
>     base           :    5.105 ± 0.003M/s
>     uprobe-nop     :    1.552 ± 0.002M/s
>     uprobe-push    :    1.408 ± 0.003M/s
>     uprobe-ret     :    0.827 ± 0.001M/s
>     uretprobe-nop  :    0.779 ± 0.001M/s
>     uretprobe-push :    0.750 ± 0.001M/s
>     uretprobe-ret  :    0.539 ± 0.001M/s
>
>   with the fix:
>
>     base           :    5.119 ± 0.002M/s
>     uprobe-nop     :    1.523 ± 0.003M/s
>     uprobe-push    :    1.384 ± 0.003M/s
>     uprobe-ret     :    0.826 ± 0.002M/s
>     uretprobe-nop  :    0.866 ± 0.002M/s  < 11% speedup
>     uretprobe-push :    0.826 ± 0.002M/s  < 10% speedup
>     uretprobe-ret  :    0.581 ± 0.001M/s  <  7% speedup
>
> Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> ---
>  arch/x86/entry/syscalls/syscall_64.tbl |  1 +
>  arch/x86/kernel/uprobes.c              | 83 ++++++++++++++++++++++++++
>  include/linux/syscalls.h               |  2 +
>  include/linux/uprobes.h                |  2 +
>  include/uapi/asm-generic/unistd.h      |  5 +-
>  kernel/events/uprobes.c                | 18 ++++--
>  kernel/sys_ni.c                        |  2 +
>  7 files changed, 108 insertions(+), 5 deletions(-)
>

Great work and results, thanks!

Acked-by: Andrii Nakryiko <andrii@xxxxxxxxxx>

[...]




