> > One more data point. I'm seeing this 100% of the time when trying the > suspend my system (on 5.10): > > [ 466.608970] BUG: sleeping function called from invalid context at > net/core/sock.c:3074 > [ 466.608975] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: > 5614, name: kworker/u4:4 > [ 466.608980] CPU: 1 PID: 5614 Comm: kworker/u4:4 Tainted: G W > 5.10.43 #64 > [ 466.608983] Hardware name: HP Grunt/Grunt, BIOS > Google_Grunt.11031.104.0 09/05/2019 > [ 466.608991] Workqueue: events_unbound async_run_entry_fn > [ 466.608995] Call Trace: > [ 466.609003] dump_stack+0x9c/0xe7 > [ 466.609009] ___might_sleep+0x148/0x15e > [ 466.609013] lock_sock_nested+0x22/0x5d > [ 466.609033] hci_sock_dev_event+0x15a/0x1f0 [bluetooth] > [ 466.609043] hci_unregister_dev+0x15c/0x303 [bluetooth] > [ 466.609049] btusb_disconnect+0x77/0x127 [btusb] > [ 466.609054] usb_unbind_interface+0xa6/0x22e > [ 466.609059] ? usb_dev_suspend+0x14/0x14 > [ 466.609063] device_release_driver_internal+0x100/0x1a1 > [ 466.609067] unbind_marked_interfaces+0x4b/0x66 > [ 466.609071] usb_resume+0x59/0x66 > [ 466.609075] dpm_run_callback+0x8c/0x126 > [ 466.609078] device_resume+0x1f1/0x25b > [ 466.609082] async_resume+0x1d/0x42 > [ 466.609085] async_run_entry_fn+0x3d/0xd1 > [ 466.609089] process_one_work+0x1b9/0x363 > [ 466.609093] worker_thread+0x213/0x372 > [ 466.609097] kthread+0x150/0x15f > [ 466.609100] ? pr_cont_work+0x58/0x58 > [ 466.609103] ? kthread_blkcg+0x31/0x31 > [ 466.609106] ret_from_fork+0x22/0x30 > Oh my god, I didn't turn the CONFIG_DEBUG_ATOMIC_SLEEP on as you did when testing this patch. I was puzzled at that time why my userfaultfd process can keep the lock and totally stuck the device removal routine without any kernel WARNING. My bad, it seems that this patch is not a very good one. I can also get following logs when executing the POC code. [ 8.234583] BUG: sleeping function called from invalid context at net/core/sock.c:3048 [ 8.235336] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 125, name: exp [ 8.236038] CPU: 0 PID: 125 Comm: exp Not tainted 5.11.11+ #13 [ 8.236542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 8.237330] Call Trace: [ 8.237605] dump_stack+0x1b9/0x22e [ 8.237946] ? log_buf_vmcoreinfo_setup+0x45d/0x45d [ 8.238453] ? tty_ldisc_hangup+0x4d7/0x6d0 [ 8.238912] ? show_regs_print_info+0x12/0x12 [ 8.239383] ? task_work_run+0x16c/0x210 [ 8.239807] ? syscall_exit_to_user_mode+0x20/0x40 [ 8.240324] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8.240897] ? _raw_spin_lock+0xa1/0x170 [ 8.241326] ___might_sleep+0x32d/0x420 [ 8.241749] ? stack_trace_snprint+0xe0/0xe0 [ 8.242204] ? __might_sleep+0x100/0x100 [ 8.242636] ? deactivate_slab+0x1ca/0x560 [ 8.243080] lock_sock_nested+0x96/0x360 [ 8.243523] ? hci_sock_dev_event+0xfe/0x5b0 [ 8.244007] ? sock_def_destruct+0x10/0x10 [ 8.244372] ? kasan_set_free_info+0x1f/0x40 [ 8.244738] ? kmem_cache_free+0xca/0x220 [ 8.245093] hci_sock_dev_event+0x2fa/0x5b0 [ 8.245454] hci_unregister_dev+0x3fa/0x1700 [ 8.245820] ? rcu_sync_exit+0xe0/0x1e0 [ 8.246149] hci_uart_tty_close+0x19f/0x220 [ 8.246511] ? hci_uart_tty_open+0x2d0/0x2d0 [ 8.246878] tty_ldisc_hangup+0x4d7/0x6d0 [ 8.247224] __tty_hangup+0x6c2/0x980 [ 8.247543] ? pty_close+0x382/0x460 [ 8.247852] ? pty_open+0x280/0x280 [ 8.248153] tty_release+0x408/0x10f0 [ 8.248469] ? rcu_read_unlock_strict+0x10/0x10 [ 8.248863] ? tty_release_struct+0xd0/0xd0 [ 8.249222] __fput+0x342/0x7b0 [ 8.249498] task_work_run+0x16c/0x210 [ 8.249821] exit_to_user_mode_prepare+0xeb/0x110 [ 8.250223] syscall_exit_to_user_mode+0x20/0x40 [ 8.250618] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8.251047] RIP: 0033:0x7f171e4c1beb As Anand has already pointed out, the code read_lock(&hci_sk_list.lock) is not going to allow the sleep of lock_sock(sk) --- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -755,7 +755,7 @@ void hci_sock_dev_event(struct hci_dev * /* Detach sockets from device */ read_lock(&hci_sk_list.lock); sk_for_each(sk, &hci_sk_list.head) { - bh_lock_sock_nested(sk); + lock_sock(sk); if (hci_pi(sk)->hdev == hdev) { hci_pi(sk)->hdev = NULL; sk->sk_err = EPIPE; @@ -764,7 +764,7 @@ void hci_sock_dev_event(struct hci_dev * hci_dev_put(hdev); } - bh_unlock_sock(sk); + release_sock(sk); } read_unlock(&hci_sk_list.lock); } The original bug details is already presented: https://www.openwall.com/lists/oss-security/2021/06/08/2 In short, the hci_sock_dev_event() function is supposed to wait for other bound ioctl functions (like hci_sock_bound_ioctl) to leave before releasing the hdev using hci_dev_put(hdev). I replace the lock from bh_lock_sock_nested to lock_sock() for this. However, it seems that this patch breaks the rule and we have to figure out a better one. T^T (I just hope this patch won't introduce any security impacts but just this warning BUG, at least it will help with the previous UAF one) My direct idea is to replace the hci_sk_list.lock to another sleep-able lock too. Or we have to craft the logic to allow the HCI_DEV_UNREG event to signal other functions to abandon the lock. I'm going to working on this, and hope to get some suggestions just like before. And Greg, really sorry to submit this not properly tested patch. Please pardon me for this unintended mistake. :( Regards Lin Ma