On Thu, Jan 26, 2023 at 3:41 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote: > > On Wed, 2023-01-25 at 16:26 -0800, Andrii Nakryiko wrote: > > On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > > wrote: > > > > > > Loading programs that use bpf_usdt_arg() on s390x fails with: > > > > > > ; switch (arg_spec->arg_type) { > > > 139: (61) r1 = *(u32 *)(r2 +8) > > > R2 unbounded memory access, make sure to bounds check any such > > > access > > > > can you show a bit longer log? we shouldn't just use > > bpf_probe_read_kernel for this. I suspect strategically placed > > barrier_var() calls will solve this. This is usually an issue with > > compiler reordering operations and doing actual check after it > > already > > speculatively adjusted pointer (which is technically safe and ok if > > we > > never deref that pointer, but verifier doesn't recognize such > > pattern) > > The full log is here: > > https://gist.github.com/iii-i/b6149ee99b37078ec920ab1d3bb45134 > > The relevant part seems to be: > > ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt) > 128: (79) r1 = *(u64 *)(r10 -24) ; frame1: > R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 > 129: (25) if r1 > 0xb goto pc+83 ; frame1: > R1_w=scalar(umax=11,var_off=(0x0; 0xf)) > ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt) > 130: (69) r1 = *(u16 *)(r8 +200) ; frame1: > R1_w=scalar(umax=65535,var_off=(0x0; 0xffff)) > R8_w=map_value(off=0,ks=4,vs=208,imm=0) > 131: (67) r1 <<= 48 ; frame1: > R1_w=scalar(smax=9223090561878065152,umax=18446462598732840960,var_off= > (0x0; 0xffff000000000000),s32_min=0,s32_max=0,u32_max=0) > 132: (c7) r1 s>>= 48 ; frame1: R1_w=scalar(smin=- > 32768,smax=32767) > ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt) > 133: (79) r2 = *(u64 *)(r10 -24) ; frame1: > R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 > 134: (bd) if r1 <= r2 goto pc+78 ; frame1: R1=scalar(smin=- > 32768,smax=32767) R2=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) > ; arg_spec = &spec->args[arg_num]; > 135: (79) r1 = *(u64 *)(r10 -24) ; frame1: > R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 > 136: (67) r1 <<= 4 ; frame1: > R1_w=scalar(umax=68719476720,var_off=(0x0; > 0xffffffff0),s32_max=2147483632,u32_max=-16) > 137: (bf) r2 = r8 ; frame1: > R2_w=map_value(off=0,ks=4,vs=208,imm=0) > R8=map_value(off=0,ks=4,vs=208,imm=0) > 138: (0f) r2 += r1 ; frame1: > R1_w=scalar(umax=68719476720,var_off=(0x0; > 0xffffffff0),s32_max=2147483632,u32_max=-16) > R2_w=map_value(off=0,ks=4,vs=208,umax=68719476720,var_off=(0x0; > 0xffffffff0),s32_max=2147483632,u32_max=-16) > ; switch (arg_spec->arg_type) { > 139: (61) r1 = *(u32 *)(r2 +8) > > #128-#129 make sure that *(u64 *)(r10 -24) <= 11, but when #133 > loads it again, this constraint is not there. I guess we need to > force flushing r1 to stack? The following helps: > > --- a/tools/lib/bpf/usdt.bpf.h > +++ b/tools/lib/bpf/usdt.bpf.h > @@ -130,7 +130,10 @@ int bpf_usdt_arg(struct pt_regs *ctx, __u64 > arg_num, long *res) > if (!spec) > return -ESRCH; > > - if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec- > >arg_cnt) > + if (arg_num >= BPF_USDT_MAX_ARG_CNT) > + return -ENOENT; > + barrier_var(arg_num); > + if (arg_num >= spec->arg_cnt) > return -ENOENT; > > arg_spec = &spec->args[arg_num]; > > I can use this in v2 if it looks good. arg_num -> spec->arg_cnt is "real" check, arg_num >= BPF_USDT_MAX_ARG_CNT is more to satisfy verifier (we know that spec->arg_cnt won't be >= BPF_USDT_MAX_ARG_CNT). Let's swap two checks in order and keep BPF_USDT_MAX_ARG_CNT close to spec->args[arg_num] use? And if barrier_var() is necessary, then so be it. > > > > Btw, I looked at the barrier_var() definition: > > #define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var)) > > and I'm curious why it's not defined like this: > > #define barrier_var(var) asm volatile("" : "+r"(var)) > > which is a bit simpler? > > no reason, just unfamiliarity with embedded asm back then, we can update it we they are equivalent