On Sun, 28 Apr 2024 13:01:19 -0700 Linus Torvalds wrote:
> On Sat, 27 Apr 2024 at 16:13, Hillf Danton <hdanton@xxxxxxxx> wrote:
> >
> > > -> #0 (&sighand->siglock){....}-{2:2}:
> > >        check_prev_add kernel/locking/lockdep.c:3134 [inline]
> > >        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
> > >        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
> > >        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> > >        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> > >        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> > >        _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
> > >        force_sig_info_to_task+0x68/0x580 kernel/signal.c:1334
> > >        force_sig_fault_to_task kernel/signal.c:1733 [inline]
> > >        force_sig_fault+0x12c/0x1d0 kernel/signal.c:1738
> > >        __bad_area_nosemaphore+0x127/0x780 arch/x86/mm/fault.c:814
> > >        handle_page_fault arch/x86/mm/fault.c:1505 [inline]
> >
> > Given page fault with runqueue locked, bpf makes trouble instead of
> > helping anything in this case.
>
> That's not the odd thing here.
>
> Look, the callchain is:
>
> > >        exc_page_fault+0x612/0x8e0 arch/x86/mm/fault.c:1563
> > >        asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> > >        rep_movs_alternative+0x22/0x70 arch/x86/lib/copy_user_64.S:48
> > >        copy_user_generic arch/x86/include/asm/uaccess_64.h:110 [inline]
> > >        raw_copy_from_user arch/x86/include/asm/uaccess_64.h:125 [inline]
> > >        __copy_from_user_inatomic include/linux/uaccess.h:87 [inline]
> > >        copy_from_user_nofault+0xbc/0x150 mm/maccess.c:125
>
> IOW, this is all doing a copy from user with page faults disabled, and
> it shouldn't have caused a signal to be sent, so the whole
> __bad_area_nosemaphore -> force_sig_fault path is bad.
>
So is a game like copying from or putting to user space with the runqueue
locked in the first place. Plus, as per another syzbot report [1], bpf can
also make trouble with the workqueue pool locked.

[1] https://lore.kernel.org/lkml/00000000000051348606171f61a1@xxxxxxxxxx/
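
To make the expectation quoted above concrete (a copy from user with page
faults disabled should fail with an error and never send a signal), here is
a small user-space model in C. It is only a sketch and not kernel code: the
model_* names, the src_mapped flag, and the signals_sent counter are all
hypothetical stand-ins invented for illustration. They do not reproduce the
real pagefault_disable(), copy_from_user_nofault(), or force_sig_fault()
implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model state; the kernel keeps the real counter per task. */
static int pagefault_disabled_depth;
static int signals_sent;	/* stands in for force_sig_fault() calls */

static void model_pagefault_disable(void)
{
	pagefault_disabled_depth++;
}

static void model_pagefault_enable(void)
{
	assert(pagefault_disabled_depth > 0);
	pagefault_disabled_depth--;
}

/*
 * Model of the fault path: with page faults disabled, the fault must be
 * resolved by failing the access (returns true), not by signalling.
 */
static bool model_handle_fault(void)
{
	if (pagefault_disabled_depth > 0)
		return true;	/* fixup only, no signal */
	signals_sent++;		/* the path that should not be reached here */
	return false;
}

/*
 * Model of a nofault copy: src_mapped == false simulates a user address
 * that would fault. Returns 0 on success, -EFAULT on a simulated fault.
 */
static long model_copy_from_user_nofault(void *dst, const void *src,
					 size_t n, bool src_mapped)
{
	long ret = 0;

	model_pagefault_disable();
	if (src_mapped) {
		memcpy(dst, src, n);
	} else {
		model_handle_fault();
		ret = -EFAULT;
	}
	model_pagefault_enable();
	return ret;
}
```

In this model a faulting copy returns -EFAULT and leaves signals_sent at
zero. The lockdep report above shows the opposite: the signal path was
reached from under copy_from_user_nofault().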