Hi Jiri,

On Wed, May 29, 2024 at 1:31 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Tue, May 28, 2024 at 11:53:11PM -0700, Namhyung Kim wrote:
> > It was reported that accessing perf_event map entry caused pretty high
> > LLC misses in get_map_perf_counter(). As reading perf_event is allowed
> > for the local CPU only, I think we can use the target CPU of the event
> > as hint for the allocation like in perf_event_alloc() so that the event
> > and the entry can be in the same node at least.
>
> looks good, is there any profile to prove the gain?

No, at this point. I'm not sure if it'd help LLC hit ratio but
I think it should improve the memory latency.

Thanks,
Namhyung

>
> >
> > Reported-by: Aleksei Shchekotikhin <alekseis@xxxxxxxxxx>
> > Reported-by: Nilay Vaish <nilayvaish@xxxxxxxxxx>
> > Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
>
> > ---
> > v2) fix build errors
> >
> >  kernel/bpf/arraymap.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> > index feabc0193852..067f7cf27042 100644
> > --- a/kernel/bpf/arraymap.c
> > +++ b/kernel/bpf/arraymap.c
> > @@ -1194,10 +1194,17 @@ static struct bpf_event_entry *bpf_event_entry_gen(struct file *perf_file,
> >  						   struct file *map_file)
> >  {
> >  	struct bpf_event_entry *ee;
> > +	struct perf_event *event = perf_file->private_data;
> > +	int node = -1;
> >
> > -	ee = kzalloc(sizeof(*ee), GFP_KERNEL);
> > +#ifdef CONFIG_PERF_EVENTS
> > +	if (event->cpu >= 0)
> > +		node = cpu_to_node(event->cpu);
> > +#endif
> > +
> > +	ee = kzalloc_node(sizeof(*ee), GFP_KERNEL, node);
> >  	if (ee) {
> > -		ee->event = perf_file->private_data;
> > +		ee->event = event;
> >  		ee->perf_file = perf_file;
> >  		ee->map_file = map_file;
> >  	}
> > --
> > 2.45.1.288.g0e0cd299f1-goog
> >
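
For anyone following the thread, the node selection in the quoted patch
can be sketched in userspace as below. This is only an illustration, not
the kernel code: mock_cpu_to_node(), pick_alloc_node(), MOCK_NR_CPUS and
the two-node topology are made-up stand-ins for cpu_to_node() and a real
machine. The -1 value matches the kernel's NUMA_NO_NODE, which lets the
allocator choose any node.

```c
#define NUMA_NO_NODE	(-1)	/* "no preference", same value as in the kernel */
#define MOCK_NR_CPUS	8	/* hypothetical machine size */

/* Hypothetical topology: CPUs 0-3 sit on node 0, CPUs 4-7 on node 1. */
static int mock_cpu_to_node(int cpu)
{
	return cpu / 4;
}

/*
 * Mirror of the decision in the patch: a CPU-bound event (cpu >= 0)
 * gives a node hint, a task-bound event (cpu == -1) gives none.
 * The upper bound check exists only to keep the mock table in range.
 */
static int pick_alloc_node(int event_cpu)
{
	if (event_cpu >= 0 && event_cpu < MOCK_NR_CPUS)
		return mock_cpu_to_node(event_cpu);

	return NUMA_NO_NODE;
}
```

With this mock layout, an event opened on CPU 5 would ask for memory on
node 1, while an event with cpu == -1 falls back to NUMA_NO_NODE and
behaves like the plain kzalloc() call did before the change.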