Hi, On 1/4/2023 3:09 PM, Yonghong Song wrote: > > > On 1/2/23 6:40 PM, Tonghao Zhang wrote: >> a >> >> On Thu, Dec 29, 2022 at 2:29 PM Yonghong Song <yhs@xxxxxxxx> wrote: >>> >>> >>> >>> On 12/28/22 2:24 PM, Alexei Starovoitov wrote: >>>> On Tue, Dec 27, 2022 at 8:43 PM Yonghong Song <yhs@xxxxxxxx> wrote: >>>>> >>>>> >>>>> >>>>> On 12/18/22 8:15 PM, xiangxia.m.yue@xxxxxxxxx wrote: >>>>>> From: Tonghao Zhang <xiangxia.m.yue@xxxxxxxxx> >>>>>> >>>>>> This testing show how to reproduce deadlock in special case. >>>>>> We update htab map in Task and NMI context. Task can be interrupted by >>>>>> NMI, if the same map bucket was locked, there will be a deadlock. >>>>>> >>>>>> * map max_entries is 2. >>>>>> * NMI using key 4 and Task context using key 20. >>>>>> * so same bucket index but map_locked index is different. >>>>>> >>>>>> The selftest use perf to produce the NMI and fentry nmi_handle. >>>>>> Note that bpf_overflow_handler checks bpf_prog_active, but in bpf update >>>>>> map syscall increase this counter in bpf_disable_instrumentation. >>>>>> Then fentry nmi_handle and update hash map will reproduce the issue. SNIP >>>>>> diff --git a/tools/testing/selftests/bpf/progs/htab_deadlock.c >>>>>> b/tools/testing/selftests/bpf/progs/htab_deadlock.c >>>>>> new file mode 100644 >>>>>> index 000000000000..d394f95e97c3 >>>>>> --- /dev/null >>>>>> +++ b/tools/testing/selftests/bpf/progs/htab_deadlock.c >>>>>> @@ -0,0 +1,32 @@ >>>>>> +// SPDX-License-Identifier: GPL-2.0 >>>>>> +/* Copyright (c) 2022 DiDi Global Inc. */ >>>>>> +#include <linux/bpf.h> >>>>>> +#include <bpf/bpf_helpers.h> >>>>>> +#include <bpf/bpf_tracing.h> >>>>>> + >>>>>> +char _license[] SEC("license") = "GPL"; >>>>>> + >>>>>> +struct { >>>>>> + __uint(type, BPF_MAP_TYPE_HASH); >>>>>> + __uint(max_entries, 2); >>>>>> + __uint(map_flags, BPF_F_ZERO_SEED); >>>>>> + __type(key, unsigned int); >>>>>> + __type(value, unsigned int); >>>>>> +} htab SEC(".maps"); >>>>>> + >>>>>> +/* nmi_handle on x86 platform. If changing keyword >>>>>> + * "static" to "inline", this prog load failed. */ >>>>>> +SEC("fentry/nmi_handle") >>>>> >>>>> The above comment is not what I mean. In arch/x86/kernel/nmi.c, >>>>> we have >>>>> static int nmi_handle(unsigned int type, struct pt_regs *regs) >>>>> { >>>>> ... >>>>> } >>>>> ... >>>>> static noinstr void default_do_nmi(struct pt_regs *regs) >>>>> { >>>>> ... >>>>> handled = nmi_handle(NMI_LOCAL, regs); >>>>> ... >>>>> } >>>>> >>>>> Since nmi_handle is a static function, it is possible that >>>>> the function might be inlined in default_do_nmi by the >>>>> compiler. If this happens, fentry/nmi_handle will not >>>>> be triggered and the test will pass. >>>>> >>>>> So I suggest to change the comment to >>>>> nmi_handle() is a static function and might be >>>>> inlined into its caller. If this happens, the >>>>> test can still pass without previous kernel fix. >>>> >>>> It's worse than this. >>>> fentry is buggy. >>>> We shouldn't allow attaching fentry to: >>>> NOKPROBE_SYMBOL(nmi_handle); >>> >>> Okay, I see. Looks we should prevent fentry from >>> attaching any NOKPROBE_SYMBOL functions. >>> >>> BTW, I think fentry/nmi_handle can be replaced with >>> tracepoint nmi/nmi_handler. it is more reliable >> The tracepoint will not reproduce the deadlock(we have discussed v2). >> If it's not easy to complete a test for this case, should we drop this >> testcase patch? or fentry the nmi_handle and update the comments. > > could we use a softirq perf event (timer), e.g., > > struct perf_event_attr attr = { > .sample_period = 1, > .type = PERF_TYPE_SOFTWARE, > .config = PERF_COUNT_SW_CPU_CLOCK, > }; > > then you can attach function hrtimer_run_softirq (not tested) or > similar functions? The context will be a hard-irq context, right ? Because htab_lock_bucket() has already disabled hard-irq on current CPU, so the dead-lock will be impossible. > > I suspect most (if not all) functions in nmi path cannot > be kprobe'd. It seems that perf_event_nmi_handler() is also nokprobe function. However I think we could try its callees (e.g., x86_pmu_handle_irq or perf_event_overflow). > >>> and won't be impacted by potential NOKPROBE_SYMBOL >>> issues. >> >> >> > .