It was reported that accessing a perf_event map entry caused a high
number of LLC misses in get_map_perf_counter().  As reading a perf_event
is allowed for the local CPU only, I think we can use the target CPU of
the event as a hint for the allocation, like in perf_event_alloc(), so
that the event and the entry can be in the same node at least.

Reported-by: Aleksei Shchekotikhin <alekseis@xxxxxxxxxx>
Reported-by: Nilay Vaish <nilayvaish@xxxxxxxxxx>
Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
---
 kernel/bpf/arraymap.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index feabc0193852..3f7718c261d7 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -1194,10 +1194,15 @@ static struct bpf_event_entry *bpf_event_entry_gen(struct file *perf_file,
 						   struct file *map_file)
 {
 	struct bpf_event_entry *ee;
+	struct perf_event *event = perf_file->private_data;
+	int node = -1;
 
-	ee = kzalloc(sizeof(*ee), GFP_KERNEL);
+	if (event->cpu >= 0)
+		node = cpu_to_node(event->cpu);
+
+	ee = kzalloc_node(sizeof(*ee), GFP_KERNEL, node);
 	if (ee) {
-		ee->event = perf_file->private_data;
+		ee->event = event;
 		ee->perf_file = perf_file;
 		ee->map_file = map_file;
 	}
-- 
2.45.1.288.g0e0cd299f1-goog