Re: [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> This RFC seeks to allow using no_caller_saved_registers gcc/clang
> attribute with some BPF helper functions (and kfuncs in the future).
>
> As documented in [1], this attribute means that a function scratches
> only some of the caller-saved registers defined by the ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
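
To make the rule above concrete, here is a minimal sketch of how the set of
scratched registers could be derived from a prototype. The function name
nocsr_clobber_mask() and the bitmask encoding (bit i stands for Ri) are
assumptions made for illustration; they are not taken from the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch set: returns a bitmask in which
 * bit i is set when register Ri is scratched by a nocsr call.
 *  - R0 is scratched only if the function is non-void.
 *  - R1..R5 are scratched only for parameters present in the prototype.
 */
static uint32_t nocsr_clobber_mask(bool returns_void, int nr_params)
{
	uint32_t mask;

	assert(nr_params >= 0 && nr_params <= 5);

	/* Set bits 1..nr_params (R1..R{nr_params}), leave bit 0 clear. */
	mask = ((1u << (nr_params + 1)) - 1) & ~1u;

	if (!returns_void)
		mask |= 1u;	/* R0 carries the return value */

	return mask;
}
```

For bpf_get_smp_processor_id(), which returns a value and takes no
arguments, this yields a mask with only R0 set, so R1-R5 stay live across
the call.
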
> The goal of the RFC is to implement no_caller_saved_registers
> (nocsr for short) in a backwards compatible manner:
> - for kernels that support the feature, gain some performance boost
>   from better register allocation;
> - for kernels that don't support the feature, allow programs to
>   execute with only minor performance loss.
>
> To achieve this, use a scheme suggested by Alexei Starovoitov:
> - for nocsr calls clang allocates registers as-if relevant r0-r5
>   registers are not scratched by the call;
> - as a post-processing step, clang visits each nocsr call and adds
>   spill/fill for every live r0-r5;
> - stack offsets used for spills/fills are allocated as minimal
>   stack offsets in the whole function and are not used for any other
>   purposes;
> - when kernel loads a program, it looks for such patterns
>   (nocsr function surrounded by spills/fills) and checks if
>   spill/fill stack offsets are used exclusively in nocsr patterns;
> - if so, and if the current JIT inlines the call to the nocsr function

Does this apply only when the JIT inlines the call, or can the BPF verifier
inline it as well?


>   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> - when old kernel loads a program, presence of spill/fill pairs
>   keeps BPF program valid, albeit slightly less efficient.
>
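
The kernel-side check described in the list above can be sketched roughly as
follows. This is a toy model under stated assumptions, not the verifier code
from the series: struct toy_insn, count_pairs() and pairs_removable() are
hypothetical names, instructions are reduced to spill/fill/call/other, and
only a single nocsr call per program is considered. Whether the call is
actually inlined is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy instruction model (hypothetical, for illustration only). */
enum insn_kind { INSN_OTHER, INSN_SPILL, INSN_FILL, INSN_CALL };

struct toy_insn {
	enum insn_kind kind;
	int reg;	/* register spilled or filled; unused otherwise */
	int off;	/* offset from r10 for spill/fill; unused otherwise */
};

/* Count matching spill/fill pairs wrapped symmetrically around the call. */
static int count_pairs(const struct toy_insn *p, int len, int call)
{
	int n = 0;

	assert(call >= 0 && call < len && p[call].kind == INSN_CALL);

	while (call - n - 1 >= 0 && call + n + 1 < len) {
		const struct toy_insn *s = &p[call - n - 1];
		const struct toy_insn *f = &p[call + n + 1];

		if (s->kind != INSN_SPILL || f->kind != INSN_FILL ||
		    s->reg != f->reg || s->off != f->off)
			break;
		n++;
	}
	return n;
}

/* Pairs are removable only if their stack slots are touched nowhere else. */
static bool pairs_removable(const struct toy_insn *p, int len, int call)
{
	int n = count_pairs(p, len, call);

	if (n == 0)
		return false;

	for (int i = 0; i < len; i++) {
		if (i >= call - n && i <= call + n)
			continue;	/* inside the pattern itself */
		if (p[i].kind != INSN_SPILL && p[i].kind != INSN_FILL)
			continue;
		for (int k = 1; k <= n; k++)
			if (p[i].off == p[call - k].off)
				return false;	/* slot reused outside the pattern */
	}
	return true;
}
```

In this model, a program that accesses the same stack offset outside the
spill/call/fill pattern keeps its spill/fill pairs, which mirrors the
"used exclusively in nocsr patterns" condition.
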
> Corresponding clang/llvm changes are available in [2].
>
> The patch-set uses bpf_get_smp_processor_id() function as a canary,
> making it the first helper with nocsr attribute.
>
> For example, consider the following program:
>
>   #define __no_csr __attribute__((no_caller_saved_registers))
>   #define SEC(name) __attribute__((section(name), used))
>   #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
>
>   typedef unsigned int __u32;
>
>   static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
>   static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
>
>   SEC("raw_tp")
>   int test(void *ctx)
>   {
>         __u32 task = bpf_get_smp_processor_id();
>         bpf_printk("ctx=%p, smp=%d", ctx, task);
>         return 0;
>   }
>
>   char _license[] SEC("license") = "GPL";
>
> Compiled (using [2]) as follows:
>
>   $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
>   $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
>     ...
>   3rd parameter for printk call     removable spill/fill pair
>   .--- 0:       r3 = r1                             |
> ; |       __u32 task = bpf_get_smp_processor_id();  |
>   |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
>   |    2:       call 0x8                            |
>   |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
> ; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
>   |    4:       r1 = 0x0 ll
>   |    6:       r2 = 0xf
>   |    7:       r4 = r0
>   '--> 8:       call 0x6
> ;       return 0;
>        9:       r0 = 0x0
>       10:       exit
>
> Here is how the program looks after verifier processing:
>
>   # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
>   # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
>   int test(void * ctx):
>   ; int test(void *ctx)
>      0: (bf) r3 = r1               <--------- 3rd printk parameter
>   ; __u32 task = bpf_get_smp_processor_id();
>      1: (b4) w0 = 197132           <--------- inlined helper call,
>      2: (bf) r0 = r0               <--------- spill/fill pair removed
>      3: (61) r0 = *(u32 *)(r0 +0)  <---------
>   ; bpf_printk("ctx=%p, smp=%d", ctx, task);
>      4: (18) r1 = map[id:13][0]+0
>      6: (b7) r2 = 15
>      7: (bf) r4 = r0
>      8: (85) call bpf_trace_printk#-125920
>   ; return 0;
>      9: (b7) r0 = 0
>     10: (95) exit
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
> [2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
>
> Eduard Zingerman (8):
>   bpf: add a get_helper_proto() utility function
>   bpf: no_caller_saved_registers attribute for helper calls
>   bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
>   selftests/bpf: extract utility function for BPF disassembly
>   selftests/bpf: no need to track next_match_pos in struct test_loader
>   selftests/bpf: extract test_loader->expect_msgs as a data structure
>   selftests/bpf: allow checking xlated programs in verifier_* tests
>   selftests/bpf: test no_caller_saved_registers spill/fill removal
>
>  include/linux/bpf.h                           |   6 +
>  include/linux/bpf_verifier.h                  |   9 +
>  kernel/bpf/helpers.c                          |   1 +
>  kernel/bpf/verifier.c                         | 346 +++++++++++++-
>  tools/testing/selftests/bpf/Makefile          |   1 +
>  tools/testing/selftests/bpf/disasm_helpers.c  |  50 ++
>  tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
>  .../selftests/bpf/prog_tests/ctx_rewrite.c    |  71 +--
>  .../selftests/bpf/prog_tests/verifier.c       |   7 +
>  tools/testing/selftests/bpf/progs/bpf_misc.h  |   6 +
>  .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
>  tools/testing/selftests/bpf/test_loader.c     | 170 +++++--
>  tools/testing/selftests/bpf/test_progs.h      |   1 -
>  tools/testing/selftests/bpf/testing_helpers.c |   1 +
>  14 files changed, 986 insertions(+), 132 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
>  create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c
>
> --
> 2.45.2
>




