On Thu, Dec 22, 2022 at 12:16 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 22, 2022 at 09:34:42AM -0800, Namhyung Kim wrote:
>
> > Sorry about that.  Let me rephrase it like below:
> >
> > With bpf_cast_to_kern_ctx(), BPF programs attached to a perf event
> > can access perf sample data directly from the ctx.
>
> This is the bpf_prog_run() in bpf_overflow_handler(), right?

Yes.

>
> > But the perf sample
> > data is not fully prepared at this point, and some fields can have invalid
> > uninitialized values.  So it needs to call perf_prepare_sample() before
> > calling the BPF overflow handler.
>
> It never was, why is it a problem now?

BPF used to allow selected fields only like period and addr, and they
are initialized always by perf_sample_data_init().  This is relaxed
by the bpf_cast_to_kern_ctx() and it can easily access arbitrary
fields of perf_sample_data now.

The background of this change is to use BPF as a filter for perf
event samples.  The code is there already and returning 0 from BPF
can drop perf samples.  With access to more sample data, it'd make
more educated decisions.

For example, I got some requests to limit perf samples in a selected
region of address (code or data).  Or it can collect samples only if
some hardware specific information is set in the raw data like in
AMD IBS.  We can easily extend it to other sample info based on
users' needs.

>
> > But just calling perf_prepare_sample() can be costly when the BPF
>
> So you potentially call it twice now, how's that useful?

Right.  I think we can check data->sample_flags in
perf_prepare_sample() to minimize the duplicate work.  It already
does it for some fields, but misses others.

Thanks,
Namhyung
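
P.S. To illustrate the data->sample_flags idea, here is a rough userspace sketch. It is not the kernel code: every sk_* name, the flag bit values, and the simplified struct are made up for this example. It only shows the pattern of preparing the sample before the filter runs and skipping already-filled fields when prepare is called again on the output path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; the kernel uses PERF_SAMPLE_*. */
#define SK_SAMPLE_ADDR      (1ull << 0)
#define SK_SAMPLE_CALLCHAIN (1ull << 1)

/* Simplified stand-in for struct perf_sample_data (assumption). */
struct sk_sample_data {
	uint64_t sample_flags;   /* bits for fields that are already valid */
	uint64_t addr;
	int callchain_walks;     /* counts how often the costly step ran */
};

/* Fill only the requested fields that are not populated yet. */
static void sk_prepare_sample(struct sk_sample_data *data, uint64_t sample_type)
{
	uint64_t todo;

	assert(data);
	todo = sample_type & ~data->sample_flags;

	if (todo & SK_SAMPLE_CALLCHAIN) {
		data->callchain_walks++;          /* stands in for the stack walk */
		data->sample_flags |= SK_SAMPLE_CALLCHAIN;
	}
	if (todo & SK_SAMPLE_ADDR) {
		data->addr = 0;                   /* no address recorded: use 0 */
		data->sample_flags |= SK_SAMPLE_ADDR;
	}
}

/* Hypothetical filter: keep (return 1) only samples with addr in [lo, hi). */
static int sk_addr_filter(const struct sk_sample_data *data,
			  uint64_t lo, uint64_t hi)
{
	return data->addr >= lo && data->addr < hi;
}

/* Prepare, run the filter, then prepare again as the output path would.
 * Returns 1 if the sample is kept, 0 if the filter dropped it. */
static int sk_overflow(struct sk_sample_data *data, uint64_t sample_type,
		       uint64_t lo, uint64_t hi)
{
	sk_prepare_sample(data, sample_type);   /* before the filter runs */
	if (!sk_addr_filter(data, lo, hi))
		return 0;                       /* filter returned 0: drop */
	sk_prepare_sample(data, sample_type);   /* second call has nothing to do */
	return 1;
}
```

With a sample whose addr was already set by the (modelled) PMU driver, sk_overflow() runs the costly callchain step once even though sk_prepare_sample() is called twice, which is the point of checking the flags.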