[PATCH 0/2] Optimize the return_instance management of uretprobe

While exploring the uretprobe syscall and trampoline for ARM64, we observed
a slight performance gain in the Redis benchmark when using the uretprobe
syscall. This patchset aims to further improve uretprobe performance by
optimizing the management of struct return_instance data.

In detail, uretprobe uses dynamically allocated memory for struct
return_instance data, which tracks the call chain of instrumented
functions. Allocating and freeing an object on every probed call is
inefficient, especially considering the inherent locality of function
invocation.
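
For illustration only, the allocate-per-call pattern described above can
be sketched in userspace C as follows. This is a simplified stand-in, not
the kernel code: struct ri_node, chain_push() and chain_pop() are
hypothetical names, and malloc()/free() take the place of kmalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct return_instance. */
struct ri_node {
	unsigned long orig_ret_vaddr;	/* original return address of the probed call */
	struct ri_node *next;		/* instance of the enclosing probed call */
};

/* Probed function entry: allocate one node and link it at the chain head. */
static int chain_push(struct ri_node **head, unsigned long ret_vaddr)
{
	struct ri_node *ri;

	assert(head != NULL);
	ri = malloc(sizeof(*ri));	/* stands in for kmalloc() */
	if (!ri)
		return -1;
	ri->orig_ret_vaddr = ret_vaddr;
	ri->next = *head;
	*head = ri;
	return 0;
}

/* Probed function return: unlink the head node, free it, return its address. */
static unsigned long chain_pop(struct ri_node **head)
{
	struct ri_node *ri;
	unsigned long ret_vaddr;

	assert(head != NULL && *head != NULL);
	ri = *head;
	ret_vaddr = ri->orig_ret_vaddr;
	*head = ri->next;
	free(ri);			/* stands in for kfree() */
	return ret_vaddr;
}
```

Every probed call pays for one allocation and one free, even though the
nodes are always released in the reverse order of their creation.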

This patchset proposes a rework of return_instance management. It
replaces dynamic memory allocation with a statically allocated array.
This approach leverages the stack-style usage of return_instance and
removes the need for kmalloc/kfree operations.
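
A minimal userspace sketch of the array-as-stack idea, under stated
assumptions: struct ri_slot, struct ri_stack, RI_MAX_DEPTH and the helper
names are hypothetical and chosen for illustration, and the depth bound of
64 is arbitrary. The actual layout is defined in patch 1/2.

```c
#include <assert.h>
#include <stddef.h>

#define RI_MAX_DEPTH 64	/* hypothetical bound, not taken from the patch */

/* Hypothetical, simplified stand-in for struct return_instance. */
struct ri_slot {
	unsigned long orig_ret_vaddr;	/* original return address of the probed call */
};

/* Fixed-size storage; 'depth' counts the slots currently in use. */
struct ri_stack {
	struct ri_slot slots[RI_MAX_DEPTH];
	size_t depth;
};

/* Probed function entry: claim the next free slot, no allocation needed. */
static int stack_push(struct ri_stack *s, unsigned long ret_vaddr)
{
	assert(s != NULL);
	if (s->depth >= RI_MAX_DEPTH)
		return -1;		/* nesting deeper than the array allows */
	s->slots[s->depth++].orig_ret_vaddr = ret_vaddr;
	return 0;
}

/* Probed function return: release the most recently claimed slot. */
static int stack_pop(struct ri_stack *s, unsigned long *ret_vaddr)
{
	assert(s != NULL && ret_vaddr != NULL);
	if (s->depth == 0)
		return -1;		/* nothing to pop */
	*ret_vaddr = s->slots[--s->depth].orig_ret_vaddr;
	return 0;
}
```

Because instances are released in the reverse order of their creation,
push and pop reduce to an index increment and decrement. The trade-off in
this sketch is a fixed upper bound on nesting depth.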

This patchset has been tested on a Kunpeng916 (Hi1616) with 4 NUMA nodes
and 64 cores @ 2.4GHz. Redis benchmarks show a throughput gain of about 2%
for the Redis GET and SET commands:

------------------------------------------------------------------
Test case       | No uretprobes | uretprobes     | uretprobes
                |               | (current)      | (optimized)
==================================================================
Redis SET (RPS) | 47025         | 40619 (-13.6%) | 41529 (-11.6%)
------------------------------------------------------------------
Redis GET (RPS) | 46715         | 41426 (-11.3%) | 42306 (-9.4%)
------------------------------------------------------------------

Liao Chang (2):
  uprobes: Optimize the return_instance related routines
  selftests/bpf: Add uretprobe test for return_instance management

 include/linux/uprobes.h                       |  10 +-
 kernel/events/uprobes.c                       | 162 +++++++++++-------
 .../bpf/prog_tests/uretprobe_depth.c          | 150 ++++++++++++++++
 .../selftests/bpf/progs/uretprobe_depth.c     |  19 ++
 4 files changed, 274 insertions(+), 67 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/uretprobe_depth.c
 create mode 100644 tools/testing/selftests/bpf/progs/uretprobe_depth.c

-- 
2.34.1




