This is a note to let you know that I've just added the patch titled perf trace: Fix BPF loading failure (-E2BIG) to the 6.13-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: perf-trace-fix-bpf-loading-failure-e2big.patch and it can be found in the queue-6.13 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit e7b0823078aaf27e59f8c5ccc0b2b13abafee722 Author: Howard Chu <howardchu95@xxxxxxxxx> Date: Thu Dec 12 18:30:47 2024 -0800 perf trace: Fix BPF loading failure (-E2BIG) [ Upstream commit 013eb043f37bd87c4d60d51034401a5a6d105bcf ] As reported by Namhyung Kim and acknowledged by Qiao Zhao (link: https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@xxxxxxxxxx/), on certain machines, perf trace failed to load the BPF program into the kernel. The verifier runs perf trace's BPF program for up to 1 million instructions, returning an E2BIG error, whereas the perf trace BPF program should be much less complex than that. This patch aims to fix the issue described above. The E2BIG problem from clang-15 to clang-16 is cause by this line: } else if (size < 0 && size >= -6) { /* buffer */ Specifically this check: size < 0. seems like clang generates a cool optimization to this sign check that breaks things. Making 'size' s64, and use } else if ((int)size < 0 && size >= -6) { /* buffer */ Solves the problem. This is some Hogwarts magic. And the unbounded access of clang-12 and clang-14 (clang-13 works this time) is fixed by making variable 'aug_size' s64. As for this: -if (aug_size > TRACE_AUG_MAX_BUF) - aug_size = TRACE_AUG_MAX_BUF; +aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index]; This makes the BPF skel generated by clang-18 work. Yes, new clangs introduce problems too. Sorry, I only know that it works, but I don't know how it works. I'm not an expert in the BPF verifier. I really hope this is not a kernel version issue, as that would make the test case (kernel_nr) * (clang_nr), a true horror story. I will test it on more kernel versions in the future. Fixes: 395d38419f18: ("perf trace augmented_raw_syscalls: Add more check s to pass the verifier") Reported-by: Namhyung Kim <namhyung@xxxxxxxxxx> Signed-off-by: Howard Chu <howardchu95@xxxxxxxxx> Tested-by: Namhyung Kim <namhyung@xxxxxxxxxx> Link: https://lore.kernel.org/r/20241213023047.541218-1-howardchu95@xxxxxxxxx Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c index 4a62ed593e84e..e4352881e3faa 100644 --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c @@ -431,9 +431,9 @@ static bool pid_filter__has(struct pids_filtered *pids, pid_t pid) static int augment_sys_enter(void *ctx, struct syscall_enter_args *args) { bool augmented, do_output = false; - int zero = 0, size, aug_size, index, - value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value); + int zero = 0, index, value_size = sizeof(struct augmented_arg) - offsetof(struct augmented_arg, value); u64 output = 0; /* has to be u64, otherwise it won't pass the verifier */ + s64 aug_size, size; unsigned int nr, *beauty_map; struct beauty_payload_enter *payload; void *arg, *payload_offset; @@ -484,14 +484,11 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args) } else if (size > 0 && size <= value_size) { /* struct */ if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg)) augmented = true; - } else if (size < 0 && size >= -6) { /* buffer */ + } else if ((int)size < 0 && size >= -6) { /* buffer */ index = -(size + 1); barrier_var(index); // Prevent clang (noticed with v18) from removing the &= 7 trick. index &= 7; // Satisfy the bounds checking with the verifier in some kernels. - aug_size = args->args[index]; - - if (aug_size > TRACE_AUG_MAX_BUF) - aug_size = TRACE_AUG_MAX_BUF; + aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index]; if (aug_size > 0) { if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, aug_size, arg))