> On Sep 23, 2020, at 7:26 PM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > Hi Song, > > the latest bpf-next has 100% reproducible splat: > test_progs -t stacktrace_build_id_nmi After a couple hours of building kernel and rebooting (this cannot run on qemu...), I realized I cannot repro this reliably. :( It happened once on one kernel, but not anymore, event on the same kernel. >From the trace, it looks like an A-A deadlock. But I think it is a false alert. > [ 18.984806] > [ 18.984807] ================================ > [ 18.984807] WARNING: inconsistent lock state > [ 18.984808] 5.9.0-rc6-01771-g1466de1330e1 #2967 Not tainted > [ 18.984809] -------------------------------- > [ 18.984809] inconsistent {INITIAL USE} -> {IN-NMI} usage. > [ 18.984810] test_progs/1990 [HC2[2]:SC0[0]:HE0:SE1] takes: > [ 18.984810] ffffe8ffffc219c0 (&head->lock){....}-{2:2}, at: > __pcpu_freelist_pop+0xe3/0x180 > [ 18.984813] {INITIAL USE} state was registered at: > [ 18.984814] lock_acquire+0x175/0x7c0 > [ 18.984814] _raw_spin_lock+0x2c/0x40 > [ 18.984815] __pcpu_freelist_pop+0xe3/0x180 > [ 18.984815] pcpu_freelist_pop+0x31/0x40 > [ 18.984816] htab_map_alloc+0xbbf/0xf40 Here, we access head->lock in htab_map_alloc. > [ 18.984816] __do_sys_bpf+0x5aa/0x3ed0 > [ 18.984817] do_syscall_64+0x2d/0x40 > [ 18.984818] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 18.984818] irq event stamp: 12 > [ 18.984819] hardirqs last enabled at (11): [<ffffffff816953d4>] > get_page_from_freelist+0x1314/0x6190 > [ 18.984820] hardirqs last disabled at (12): [<ffffffff837e527d>] > irqentry_enter+0x1d/0x50 > [ 18.984821] softirqs last enabled at (0): [<ffffffff8111011c>] > copy_process+0x147c/0x5c10 > [ 18.984821] softirqs last disabled at (0): [<0000000000000000>] 0x0 > [ 18.984822] > [ 18.984822] other info that might help us debug this: > [ 18.984823] Possible unsafe locking scenario: > [ 18.984823] > [ 18.984824] CPU0 > [ 18.984824] ---- > [ 18.984824] lock(&head->lock); > [ 18.984826] <Interrupt> > [ 18.984826] lock(&head->lock); > [ 18.984827] > [ 18.984828] *** DEADLOCK *** > [ 18.984828] > [ 18.984829] 2 locks held by test_progs/1990: > [ 18.984829] #0: ffff8881f52958e8 (&mm->mmap_lock#2){++++}-{3:3}, > at: do_user_addr_fault+0x1cd/0x821 > [ 18.984832] #1: ffff8881f6e39e20 (&cpuctx_lock){-...}-{2:2}, at: > perf_event_task_tick+0x12b/0xc90 > [ 18.984835] > [ 18.984835] stack backtrace: > [ 18.984836] CPU: 0 PID: 1990 Comm: test_progs Not tainted > 5.9.0-rc6-01771-g1466de1330e1 #2967 > [ 18.984837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS 1.11.0-2.el7 04/01/2014 > [ 18.984837] Call Trace: > [ 18.984838] <NMI> > [ 18.984838] dump_stack+0x9a/0xd0 > [ 18.984839] lock_acquire+0x5c9/0x7c0 > [ 18.984839] ? lock_release+0x6f0/0x6f0 > [ 18.984840] ? __pcpu_freelist_pop+0xe3/0x180 > [ 18.984840] _raw_spin_lock+0x2c/0x40 > [ 18.984841] ? __pcpu_freelist_pop+0xe3/0x180 > [ 18.984841] __pcpu_freelist_pop+0xe3/0x180 > [ 18.984842] pcpu_freelist_pop+0x17/0x40 > [ 18.984842] ? lock_release+0x6f0/0x6f0 > [ 18.984843] __bpf_get_stackid+0x534/0xaf0 Then in NMI, we do __bpf_get_stackid(). So the deadlock happens when __bpf_get_stackid() is called when NMI interrupts htab_map_alloc() of the same map. This should never happen, because we only use __bpf_get_stackid() after the map is allocated. Without a reliable repro, it will be really hard for me to pin point the commit that introduced this. I guess I will need your .config and/or test machine for further debugging. Thanks, Song [...]