On 12/17/22 7:02 AM, xiangxia.m.yue@xxxxxxxxx wrote:
From: Tonghao Zhang <xiangxia.m.yue@xxxxxxxxx>

This testing show how to reproduce deadlock in special case.
We update htab map in Task and NMI context. Task can be interrupted by
NMI, if the same map bucket was locked, there will be a deadlock.

* map max_entries is 2.
* NMI using key 4 and Task context using key 20.
* so same bucket index but map_locked index is different.

The selftest use perf to produce the NMI and fentry nmi_handle.
Note that bpf_overflow_handler checks bpf_prog_active, but in bpf update
map syscall increase this counter in bpf_disable_instrumentation.
Then fentry nmi_handle and update hash map will reproduce the issue.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@xxxxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Yonghong Song <yhs@xxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: KP Singh <kpsingh@xxxxxxxxxx>
Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
Cc: Hao Luo <haoluo@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Hou Tao <houtao1@xxxxxxxxxx>

Ack with a small nit below.

Acked-by: Yonghong Song <yhs@xxxxxx>

---
 tools/testing/selftests/bpf/DENYLIST.aarch64  |  1 +
 tools/testing/selftests/bpf/DENYLIST.s390x    |  1 +
 .../selftests/bpf/prog_tests/htab_deadlock.c  | 75 +++++++++++++++++++
 .../selftests/bpf/progs/htab_deadlock.c       | 30 ++++++++
 4 files changed, 107 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/htab_deadlock.c
 create mode 100644 tools/testing/selftests/bpf/progs/htab_deadlock.c
[...]
diff --git a/tools/testing/selftests/bpf/progs/htab_deadlock.c b/tools/testing/selftests/bpf/progs/htab_deadlock.c
new file mode 100644
index 000000000000..72178f073667
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/htab_deadlock.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 DiDi Global Inc. */
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 2);
+	__uint(map_flags, BPF_F_ZERO_SEED);
+	__type(key, unsigned int);
+	__type(value, unsigned int);
+} htab SEC(".maps");
+
+SEC("fentry/nmi_handle")

nmi_handle() is a static function. In my setup it is not inlined, but if
it were inlined, the test would succeed regardless of the previous fix.
We currently have no mechanism to detect such situations, so I am okay
with the test. It would be good, though, to add a short comment here
explaining this caveat.

+int bpf_nmi_handle(struct pt_regs *regs)
+{
+	unsigned int val = 0, key = 4;
+
+	bpf_map_update_elem(&htab, &key, &val, BPF_ANY);
+	return 0;
+}
+
+SEC("perf_event")
+int bpf_empty(struct pt_regs *regs)
+{
+	return 0;
+}