Re: [RFC] A couple of issues on BPF callstack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

On Sat, Mar 5, 2022 at 4:28 PM Yonghong Song <yhs@xxxxxx> wrote:
> On 3/4/22 3:28 PM, Namhyung Kim wrote:
> > More important thing to me is the content of the (perf) callchain.  If
> > the event has __PERF_SAMPLE_CALLCHAIN_EARLY, it will have context info
> > like PERF_CONTEXT_KERNEL.  So user might or might not see it depending
> > on whether the perf_event set with precise_ip and SAMPLE_CALLCHAIN.
> > This doesn't look good.
>
> Patch 7b04d6d60fcf ("bpf: Separate bpf_get_[stack|stackid] for
> perf events BPF") tried to fix __PERF_SAMPLE_CALLCHAIN_EARLY issue
> for bpf_get_stack[id]() helpers.

Right.

> The helpers will check whether event->attr.sample_type has
> __PERF_SAMPLE_CALLCHAIN_EARLY encoded or not, based on which
> the stacks will be retrieved accordingly.
> Did you any issue here?

It changes stack trace results by adding perf contexts like
PERF_CONTEXT_KERNEL and PERF_CONTEXT_USER.
Without __PERF_SAMPLE_CALLCHAIN_EARLY, I don't see those.

> >
> > After all, I think it'd be really great if we can skip those
> > uninteresting info easily.  Maybe we could add a flag to skip BPF code
>
> We cannot just skip those callchains with __PERF_SAMPLE_CALLCHAIN_EARLY.
> There are real use cases for it.

I'm not saying that I want to skip all the callchains.
What I want is a way to avoid those perf context info
in the callchains so that I can make sure to have the
same stack traces in a known code path regardless
of the event attribute and cpu vendors - as far as I know
__PERF_SAMPLE_CALLCHAIN_EARLY is enabled on Intel cpus only.

>
> > perf context, and even some scheduler code from the trace respectively
> > like in stack_trace_consume_entry_nosched().
>
> A flag for the bpf_get_stack[id]() helpers? It is possible. It would be
> great if you can detail your use case here and how a flag could help
> you.

Yep, something like BPF_F_SKIP_BPF_STACK.

In my case, I collect a callchain in a tracepoint to find its caller.
And I want to have a short call stack depth for a performance reason.
But the every 3 or 4 entries are already filled by BPF code and
I want to skip them.  I know that I can set it with skip mask but
having a hard coded value can be annoying since it might be
changed by different compilers, kernel version or configurations.

Similarly, I think it'd be useful to skip some scheduler functions
like __schedule when collecting stack traces in sched_switch.

Thanks,
Namhyung



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux