> On Jul 21, 2020, at 12:10 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Jul 16, 2020 at 03:59:32PM -0700, Song Liu wrote:
>> +
>> +BPF_CALL_3(bpf_get_stackid_pe, struct bpf_perf_event_data_kern *, ctx,
>> +           struct bpf_map *, map, u64, flags)
>> +{
>> +        struct perf_event *event = ctx->event;
>> +        struct perf_callchain_entry *trace;
>> +        bool has_kernel, has_user;
>> +        bool kernel, user;
>> +
>> +        /* perf_sample_data doesn't have callchain, use bpf_get_stackid */
>> +        if (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
>
> what if event was not created with PERF_SAMPLE_CALLCHAIN ?
> Calling the helper will still cause crashes, no?

Yeah, it may still crash. Somehow I messed up this logic...

>
>> +                return bpf_get_stackid((unsigned long)(ctx->regs),
>> +                                       (unsigned long) map, flags, 0, 0);
>> +
>> +        if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
>> +                               BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
>> +                return -EINVAL;
>> +
>> +        user = flags & BPF_F_USER_STACK;
>> +        kernel = !user;
>> +
>> +        has_kernel = !event->attr.exclude_callchain_kernel;
>> +        has_user = !event->attr.exclude_callchain_user;
>> +
>> +        if ((kernel && !has_kernel) || (user && !has_user))
>> +                return -EINVAL;
>
> this will break existing users in a way that will be very hard for them to debug.
> If they happen to set exclude_callchain_* flags during perf_event_open
> the helpers will be failing at run-time.
> One can argue that when precise_ip=1 the bpf_get_stack is broken, but
> this is a change in behavior.
> It also seems to be broken when PERF_SAMPLE_CALLCHAIN was not set at event
> creation time, but precise_ip=1 was.
>
>> +
>> +        trace = ctx->data->callchain;
>> +        if (unlikely(!trace))
>> +                return -EFAULT;
>> +
>> +        if (has_kernel && has_user) {
>
> shouldn't it be || ?

It should be &&. We only need to adjust the attached calltrace when it has
both kernel and user stack.
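
To illustrate the adjustment, here is a minimal user-space sketch, not the
patch itself. It assumes count_kernel_ip() counts entries up to a marker that
separates kernel frames from user frames; the body of that function is not
quoted in this thread, so that behavior is an assumption. Every sketch_* name
and SKETCH_CONTEXT_USER are hypothetical stand-ins, and the marker value only
mirrors what I believe PERF_CONTEXT_USER to be.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: marker separating kernel entries from user entries. */
#define SKETCH_CONTEXT_USER ((uint64_t)-512)

/* Hypothetical stand-in for struct perf_callchain_entry. */
struct sketch_callchain {
    uint64_t nr;
    uint64_t ip[16];
};

/* Half-open range [start, start + len) of entries to hand to the caller. */
struct sketch_range {
    uint64_t start;
    uint64_t len;
};

/* Assumed behavior of count_kernel_ip(): entries before the user marker. */
static uint64_t sketch_count_kernel_ip(const struct sketch_callchain *trace)
{
    uint64_t n = 0;

    while (n < trace->nr && trace->ip[n] != SKETCH_CONTEXT_USER)
        n++;
    return n;
}

/*
 * Mirror the arithmetic in the quoted patch for a combined callchain:
 *   kernel request: cap the chain at nr_kernel, keep skip as given;
 *   user request:   keep the full chain, add nr_kernel to skip.
 */
static struct sketch_range sketch_select(const struct sketch_callchain *trace,
                                         bool want_user, uint64_t skip)
{
    uint64_t nr_kernel = sketch_count_kernel_ip(trace);
    uint64_t start = want_user ? nr_kernel + skip : skip;
    uint64_t end = want_user ? trace->nr : nr_kernel;
    struct sketch_range r = { start, 0 };

    if (start < end)
        r.len = end - start;
    return r;
}
```

Under these assumptions the user range starts at the marker entry itself,
since skip only grows by nr_kernel. When the callchain holds only one of the
two stacks, no such split is needed, which is why the condition is &&.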
>
>> +                __u64 nr_kernel = count_kernel_ip(trace);
>> +                int ret;
>> +
>> +                if (kernel) {
>> +                        __u64 nr = trace->nr;
>> +
>> +                        trace->nr = nr_kernel;
>> +                        ret = __bpf_get_stackid(map, trace, flags);
>> +
>> +                        /* restore nr */
>> +                        trace->nr = nr;
>> +                } else { /* user */
>> +                        u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
>> +
>> +                        skip += nr_kernel;
>> +                        if (skip > BPF_F_SKIP_FIELD_MASK)
>> +                                return -EFAULT;
>> +
>> +                        flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
>> +                        ret = __bpf_get_stackid(map, trace, flags);
>> +                }
>> +                return ret;
>> +        }
>> +        return __bpf_get_stackid(map, trace, flags);
> ...
>> +        if (has_kernel && has_user) {
>> +                __u64 nr_kernel = count_kernel_ip(trace);
>> +                int ret;
>> +
>> +                if (kernel) {
>> +                        __u64 nr = trace->nr;
>> +
>> +                        trace->nr = nr_kernel;
>> +                        ret = __bpf_get_stack(ctx->regs, NULL, trace, buf,
>> +                                              size, flags);
>> +
>> +                        /* restore nr */
>> +                        trace->nr = nr;
>> +                } else { /* user */
>> +                        u64 skip = flags & BPF_F_SKIP_FIELD_MASK;
>> +
>> +                        skip += nr_kernel;
>> +                        if (skip > BPF_F_SKIP_FIELD_MASK)
>> +                                goto clear;
>> +
>> +                        flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip;
>> +                        ret = __bpf_get_stack(ctx->regs, NULL, trace, buf,
>> +                                              size, flags);
>> +                }
>
> Looks like copy-paste. I think there should be a way to make it
> into common helper.

I thought about moving this logic to a helper. But we are calling
__bpf_get_stackid() above, and __bpf_get_stack() here. So we can't easily put
all the logic in a big helper. Multiple small helpers look messy (to me).

>
> I think the main issue is wrong interaction with event attr flags.
> I think the verifier should detect that bpf_get_stack/bpf_get_stackid
> were used and prevent attaching to perf_event with attr.precise_ip=1
> and PERF_SAMPLE_CALLCHAIN is not specified.
> I was thinking whether attaching bpf to event can force setting of
> PERF_SAMPLE_CALLCHAIN, but that would be a surprising behavior,
> so not a good idea.
> So the only thing left is to reject attach when bpf_get_stack is used
> in two cases:
> if attr.precise_ip=1 and PERF_SAMPLE_CALLCHAIN is not set.
> (since it will lead to crashes)

We only need to block precise_ip >= 2. precise_ip == 1 is OK.

> if attr.precise_ip=1 and PERF_SAMPLE_CALLCHAIN is set,
> but exclude_callchain_[user|kernel]=1 is set too.
> (since it will lead to surprising behavior of bpf_get_stack)
>
> Other ideas?

Yes, this sounds good.

Thanks,
Song
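
P.S. A rough user-space sketch of the attach-time rule discussed above, only
to make the two cases concrete. It is not kernel code: struct sketch_attr,
SKETCH_SAMPLE_CALLCHAIN and sketch_check_attach() are hypothetical stand-ins,
the -EINVAL return value is an arbitrary choice, and applying the
precise_ip >= 2 threshold to both cases is my assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit standing in for PERF_SAMPLE_CALLCHAIN. */
#define SKETCH_SAMPLE_CALLCHAIN (1ULL << 0)

/* Hypothetical subset of the perf_event_attr fields named in this thread. */
struct sketch_attr {
    uint64_t sample_type;
    unsigned int precise_ip;
    bool exclude_callchain_kernel;
    bool exclude_callchain_user;
};

/*
 * Return 0 if attaching is allowed, or a negative errno if the program uses
 * bpf_get_stack/bpf_get_stackid on an event where that is unsafe or would
 * behave surprisingly.
 */
static int sketch_check_attach(const struct sketch_attr *attr,
                               bool prog_calls_get_stack)
{
    if (!prog_calls_get_stack)
        return 0;

    /* Assumption: only precise_ip >= 2 needs blocking. */
    if (attr->precise_ip < 2)
        return 0;

    /* Case 1: no callchain attached to the sample, may crash. */
    if (!(attr->sample_type & SKETCH_SAMPLE_CALLCHAIN))
        return -EINVAL;

    /* Case 2: callchain attached but partial, surprising results. */
    if (attr->exclude_callchain_kernel || attr->exclude_callchain_user)
        return -EINVAL;

    return 0;
}
```

Rejecting at attach time keeps the failure next to perf_event_open and the
attach call, instead of having the helpers fail at run-time.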