On Thu, Aug 18, 2022 at 1:44 PM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote: > > > > > > On 8/15/22 3:44 PM, Alexei Starovoitov wrote: > > > On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@xxxxxx> wrote: > > > > > > > > In C, struct value can be passed as a function argument. > > > > For small structs, struct value may be passed in > > > > one or more registers. For trampoline based bpf programs, > > > > This would cause complication since one-to-one mapping between > > > > function argument and arch argument register is not valid > > > > any more. > > > > > > > > To support struct value argument and make bpf programs > > > > easy to write, the bpf program function parameter is > > > > changed from struct type to a pointer to struct type. > > > > The following is a simplified example. > > > > > > > > In one of later selftests, we have a bpf_testmod function: > > > > struct bpf_testmod_struct_arg_2 { > > > > long a; > > > > long b; > > > > }; > > > > noinline int > > > > bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) { > > > > bpf_testmod_test_struct_arg_result = a + b.a + b.b + c; > > > > return bpf_testmod_test_struct_arg_result; > > > > } > > > > > > > > When a bpf program is attached to the bpf_testmod function > > > > bpf_testmod_test_struct_arg_2(), the bpf program may look like > > > > SEC("fentry/bpf_testmod_test_struct_arg_2") > > > > int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c) > > > > { > > > > t2_a = a; > > > > t2_b_a = b->a; > > > > t2_b_b = b->b; > > > > t2_c = c; > > > > return 0; > > > > } > > > > > > > > Basically struct value becomes a pointer to the struct. > > > > The trampoline stack will be increased to store the stack values and > > > > the pointer to these values will be saved in the stack slot corresponding > > > > to that argument. For x86_64, the struct size is limited up to 16 bytes > > > > so the struct can fit in one or two registers. The struct size of more > > > > than 16 bytes is not supported now as our current use case is > > > > for sockptr_t in the argument. We could handle such large struct's > > > > in the future if we have concrete use cases. > > > > > > > > The main changes are in save_regs() and restore_regs(). The following > > > > illustrated the trampoline asm codes for save_regs() and restore_regs(). > > > > save_regs(): > > > > /* first argument */ > > > > mov DWORD PTR [rbp-0x18],edi > > > > /* second argument: struct, save actual values and put the pointer to the slot */ > > > > lea rax,[rbp-0x40] > > > > mov QWORD PTR [rbp-0x10],rax > > > > mov QWORD PTR [rbp-0x40],rsi > > > > mov QWORD PTR [rbp-0x38],rdx > > > > /* third argument */ > > > > mov DWORD PTR [rbp-0x8],esi > > > > restore_regs(): > > > > mov edi,DWORD PTR [rbp-0x18] > > > > mov rsi,QWORD PTR [rbp-0x40] > > > > mov rdx,QWORD PTR [rbp-0x38] > > > > mov esi,DWORD PTR [rbp-0x8] > > > > > > Not sure whether it was discussed before, but > > > why cannot we adjust the bpf side instead? > > > Technically struct passing between bpf progs was never > > > officially supported. llvm generates something. > > > Probably always passes by reference, but we can adjust > > > that behavior without breaking any programs because > > > we don't have bpf libraries. Programs are fully contained > > > in one or few files. libbpf can do static linking, but > > > without any actual libraries the chance of breaking > > > backward compat is close to zero. > > > > Agree. At this point, we don't need to worry about > > compatibility between bpf program and bpf program libraries. > > > > > Can we teach llvm to pass sizeof(struct) <= 16 in > > > two bpf registers? > > > > Yes, we can. I just hacked llvm and was able to > > do that. > > > > > Then we wouldn't need to have a discrepancy between > > > kernel function prototype and bpf fentry prog proto. > > > Both will have struct by value in the same spot. > > > The trampoline generation will be simpler for x86 and > > > its runtime faster too. > > > > I tested x86 and arm64 both supports two registers > > for a 16 byte struct. > > > > > The other architectures that pass small structs by reference > > > can do a bit more work in the trampoline: copy up to 16 byte > > > and bpf prog side will see it as they were passed in 'registers'. > > > wdyt? > > > > I know systemz and Hexagon will pass by reference for any > > struct size >= 8 bytes. Didn't complete check others. > > > > But since x86 and arm64 supports direct value passing > > with two registers, we should be okay. As you mentioned, > > we could support systemz/hexagon style of struct passing > > by copying the values to the stack. > > > > > > But I have a problem how to define a user friendly > > macro like BPF_PROG for user to use. > > > > Let us say, we have a program like below: > > SEC("fentry/bpf_testmod_test_struct_arg_1") > > int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b, > > int c) { > > ... > > } > > > > We have BPF_PROG macro definition here: > > > > #define BPF_PROG(name, args...) \ > > name(unsigned long long *ctx); \ > > static __always_inline typeof(name(0)) \ > > ____##name(unsigned long long *ctx, ##args); \ > > typeof(name(0)) name(unsigned long long *ctx) \ > > { \ > > _Pragma("GCC diagnostic push") \ > > _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ > > return ____##name(___bpf_ctx_cast(args)); \ > > _Pragma("GCC diagnostic pop") \ > > } \ > > static __always_inline typeof(name(0)) \ > > ____##name(unsigned long long *ctx, ##args) > > > > Some we have static function definition > > > > int ____test_struct_arg_1(unsigned long long *ctx, struct > > bpf_testmod_struct_arg_2 *a, int b, int c); > > > > But the function call inside the function test_struct_arg_1() > > is > > ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]); > > > > We have two problems here: > > ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]) > > does not match the static function declaration. > > This is not problem if everything is int/ptr type. > > If one of argument is structure type, we will have > > type conversion problem. Let us this can be resolved > > somehow through some hack. > > > > More importantly, because some structure may take two > > registers, > > ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]) > > may not be correct. In my above example, if the > > structure size is 16 bytes, > > then the actual call should be > > ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3]) > > So we need to provide how many extra registers are needed > > beyond ##args in the macro. I have not tried how to > > resolve this but this extra information in the macro > > definite is not user friendly. > > > > Not sure what is the best way to handle this issue (##args is not precise > > and needs addition registers for >8 struct arguments). > > The kernel is using this trick to cast 8 byte structs to u64: > /* cast any integer, pointer, or small struct to u64 */ > #define UINTTYPE(size) \ > __typeof__(__builtin_choose_expr(size == 1, (u8)1, \ > __builtin_choose_expr(size == 2, (u16)2, \ > __builtin_choose_expr(size == 4, (u32)3, \ > __builtin_choose_expr(size == 8, (u64)4, \ > (void)5))))) > #define __CAST_TO_U64(x) ({ \ > typeof(x) __src = (x); \ > UINTTYPE(sizeof(x)) __dst; \ > memcpy(&__dst, &__src, sizeof(__dst)); \ > (u64)__dst; }) > > casting 16 byte struct to two u64 can be similar. > Ideally we would declare bpf prog as: > SEC("fentry/bpf_testmod_test_struct_arg_1") > int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) { > note there is no '*'. It's struct by value. Agree. So I tried to compile this: $ git diff diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c index e9dfa0313d1b..dccb9ae9801f 100644 --- a/tools/testing/selftests/bpf/progs/test_vmlinux.c +++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c @@ -15,6 +15,16 @@ bool tp_btf_called = false; bool kprobe_called = false; bool fentry_called = false; +typedef struct { + void *x; + int t; +} sockptr; + +static int blah(sockptr x) +{ + return x.t; +} + SEC("tp/syscalls/sys_enter_nanosleep") int handle__tp(struct trace_event_raw_sys_enter *args) { @@ -30,7 +40,14 @@ int handle__tp(struct trace_event_raw_sys_enter *args) return 0; tp_called = true; - return 0; + + return blah(({ union { + struct { u64 x, y; } z; + sockptr s; + } tmp = {.z = {0, 1}}; + + tmp.s; + })); } SEC("raw_tp/sys_enter") And it compiled. So I think it's possible to do u64 to struct conversion using this approach. We'd have to define two variations of macro -- one for structs <= 8 bytes, another for structs > 8 and <= 16 bytes. One will "consume" single ctx[] slot, another -- will consume both. And then each variant knows which other macro to refer to after itself. A bit of macro hackery, but it should work. > The main challenge is how to do the math in the BPF_PROG macros. > Currently it's doing: > #define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0] > #define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1] > #define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2] > #define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3] > > The ctx[index] is one-to-one with argument position. > That 'index' needs to be calculated. > Maybe something like UINTTYPE() that applies to previous arguments? > #define REG_CNT(arg) \ > __builtin_choose_expr(sizeof(arg) == 1, 1, \ > __builtin_choose_expr(sizeof(arg) == 2, 1, \ > __builtin_choose_expr(sizeof(arg) == 4, 1, \ > __builtin_choose_expr(sizeof(arg) == 8, 1, \ > __builtin_choose_expr(sizeof(arg) == 16, 2, \ > (void)0))))) > > #define ___bpf_reg_cnt0() 0 > #define ___bpf_reg_cnt1(x) ___bpf_reg_cnt0() + REG_CNT(x) > #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x) > #define ___bpf_reg_cnt(args...) ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args) > > This way the macro will calculate the index inside ctx[] array. > > and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt. > Instead of: > ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2] > it will be > ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), \ > __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)], > *(typeof(x) *) &ctx[___bpf_reg_cnt(args)]) > > x - is one of the arguments. > args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index. > *(typeof(x) *)& should type cast to struct of 16 bytes. > > Rough idea, of course. > > Another alternative is instead of: > #define BPF_PROG(name, args...) > name(unsigned long long *ctx); > do: > #define BPF_PROG(name, args...) > struct XX { > macro inserts all 'args' here separated by ; so it becomes a proper struct > }; > name(struct XX *ctx); > > and then instead of doing ___bpf_ctx_castN for each argument > do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'. > The problem with this approach that small args like char, short, int needs to > be declared in struct XX with __align__(8). > > Both approaches may be workable?