Re: [Bug] [Deadlock] Kernel thread deadlock in rfcomm socket release when connect interrupted

Dear Pete,


Thank you for your email with a reproducer.

On 29.05.22 at 13:42, Peter Sutton wrote:

Compile the attached C program (gcc -lbluetooth bug.c) and execute:

$ ./a.out

Interrupt (^C/SIGINT) during the connect. The process should hang and
the Bluetooth socket will now be in deadlock.
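
The attached bug.c is not part of this archive. As a rough illustration only, and not the original reproducer, a blocking RFCOMM connect could be sketched as below. Everything prefixed with sketch_ is a hypothetical name. The structure layouts and the protocol number 3 are assumed to mirror BlueZ's <bluetooth/bluetooth.h> and <bluetooth/rfcomm.h>, and are declared locally so that the sketch builds without libbluetooth. The resulting 10-byte address length agrees with RDX (0xa) in the register dump further down.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: these values mirror BlueZ's <bluetooth/bluetooth.h>. */
#ifndef AF_BLUETOOTH
#define AF_BLUETOOTH 31
#endif
#define SKETCH_BTPROTO_RFCOMM 3

/* Hypothetical local mirrors of bdaddr_t and struct sockaddr_rc. */
typedef struct { uint8_t b[6]; } __attribute__((packed)) sketch_bdaddr_t;

struct sketch_sockaddr_rc {
    sa_family_t     rc_family;
    sketch_bdaddr_t rc_bdaddr;
    uint8_t         rc_channel;
};

_Static_assert(sizeof(struct sketch_sockaddr_rc) == 10,
               "expected the 10-byte layout seen as RDX=0xa in the trace");

/*
 * Parse "XX:XX:XX:XX:XX:XX" into the reversed byte order that
 * libbluetooth's str2ba() produces. Returns 0 on success, -1 otherwise.
 */
int sketch_parse_bdaddr(const char *str, sketch_bdaddr_t *ba)
{
    unsigned int v[6];

    if (sscanf(str, "%2x:%2x:%2x:%2x:%2x:%2x",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return -1;
    for (int i = 0; i < 6; i++)
        ba->b[5 - i] = (uint8_t)v[i];
    return 0;
}

/*
 * Open an RFCOMM socket and issue a blocking connect(2).
 * Returns the connected fd, or -1 with errno set.
 */
int sketch_rfcomm_connect(const char *peer, uint8_t channel)
{
    struct sketch_sockaddr_rc addr;
    int fd, saved;

    memset(&addr, 0, sizeof(addr));
    addr.rc_family = AF_BLUETOOTH;
    addr.rc_channel = channel;
    if (sketch_parse_bdaddr(peer, &addr.rc_bdaddr) < 0) {
        errno = EINVAL;
        return -1;
    }

    fd = socket(AF_BLUETOOTH, SOCK_STREAM, SKETCH_BTPROTO_RFCOMM);
    if (fd < 0)
        return -1;

    /* The call blocks here; this is the window for ^C/SIGINT. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Calling sketch_rfcomm_connect() from main() with the address of a reachable peer and a channel number (both placeholders to be chosen by the tester), and interrupting the process while the call blocks, would correspond to the steps described above.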

Kernel thread stack:

Google Mail’s composer wraps lines after 72 characters, which makes the traces harder to read.

[May29 12:23] INFO: task krfcommd:902 blocked for more than 122 seconds.
[  +0.000009]       Tainted: P           OE     5.18.0-arch1-1 #1
[  +0.000004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000002] task:krfcommd        state:D stack:    0 pid:  902 ppid:      2 flags:0x00004000
[  +0.000010] Call Trace:
[  +0.000003]  <TASK>
[  +0.000007]  __schedule+0x37c/0x11f0
[  +0.000013]  ? __schedule+0x384/0x11f0
[  +0.000012]  ? l2cap_chan_create+0x138/0x180 [bluetooth da0a812fd33c72f9c94149bd973bd9835fc8aa63]
[  +0.000104]  schedule+0x4f/0xb0
[  +0.000008]  schedule_preempt_disabled+0x15/0x20
[  +0.000009]  __mutex_lock.constprop.0+0x2d0/0x480
[  +0.000012]  rfcomm_run+0x152/0x1900 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000018]  ? ttwu_do_wakeup+0x17/0x160
[  +0.000011]  ? _raw_spin_rq_lock_irqsave+0x20/0x20
[  +0.000010]  ? rfcomm_check_accept+0xa0/0xa0 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000015]  kthread+0xde/0x110
[  +0.000011]  ? kthread_complete_and_exit+0x20/0x20
[  +0.000010]  ret_from_fork+0x22/0x30
[  +0.000012]  </TASK>

Task stack:

[  +0.000003] INFO: task a.out:1035 blocked for more than 122 seconds.
[  +0.000004]       Tainted: P           OE     5.18.0-arch1-1 #1
[  +0.000003] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000001] task:a.out           state:D stack:    0 pid: 1035 ppid:    817 flags:0x00004006
[  +0.000008] Call Trace:
[  +0.000002]  <TASK>
[  +0.000003]  __schedule+0x37c/0x11f0
[  +0.000009]  ? __mod_memcg_state+0x2f/0x70
[  +0.000008]  schedule+0x4f/0xb0
[  +0.000007]  __lock_sock+0x7d/0xc0
[  +0.000010]  ? cpuacct_percpu_seq_show+0x20/0x20
[  +0.000009]  lock_sock_nested+0x48/0x50
[  +0.000009]  rfcomm_sk_state_change+0x2b/0x120 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000018]  __rfcomm_dlc_close+0x99/0x210 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000015]  rfcomm_dlc_close+0x6e/0xb0 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000015]  __rfcomm_sock_close+0x2e/0xe0 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000017]  rfcomm_sock_shutdown+0x65/0xa0 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000016]  rfcomm_sock_release+0x32/0xb0 [rfcomm 70c711e71e4c70ddabda45ec756f02d9606ec257]
[  +0.000016]  __sock_release+0x3d/0xa0
[  +0.000010]  sock_close+0x15/0x20
[  +0.000009]  __fput+0x89/0x240
[  +0.000011]  task_work_run+0x60/0x90
[  +0.000007]  do_exit+0x337/0xac0
[  +0.000010]  ? del_timer_sync+0x73/0xb0
[  +0.000006]  do_group_exit+0x31/0xa0
[  +0.000009]  get_signal+0x986/0x990
[  +0.000007]  ? bt_sock_wait_state+0x124/0x1a0 [bluetooth da0a812fd33c72f9c94149bd973bd9835fc8aa63]
[  +0.000060]  ? wake_up_q+0x90/0x90
[  +0.000010]  arch_do_signal_or_restart+0x48/0x760
[  +0.000012]  exit_to_user_mode_prepare+0xd3/0x140
[  +0.000008]  syscall_exit_to_user_mode+0x26/0x50
[  +0.000006]  do_syscall_64+0x6b/0x90
[  +0.000009]  ? exc_page_fault+0x74/0x170
[  +0.000009]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.000007] RIP: 0033:0x7f4ab4f13557
[  +0.000006] RSP: 002b:00007fff5b37cc38 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
[  +0.000007] RAX: fffffffffffffffc RBX: 00007fff5b37cd78 RCX: 00007f4ab4f13557
[  +0.000004] RDX: 000000000000000a RSI: 00007fff5b37cc4e RDI: 0000000000000003
[  +0.000004] RBP: 00007fff5b37cc60 R08: 0fffffffffffffff R09: 0000000000000000
[  +0.000003] R10: 00007f4ab4e075e0 R11: 0000000000000246 R12: 0000000000000000
[  +0.000003] R13: 00007fff5b37cd88 R14: 0000562da1cefde0 R15: 00007f4ab5214000
[  +0.000007]  </TASK>
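
The register dump above can be decoded by hand: on x86-64, ORIG_RAX 0x2a is syscall number 42, which is connect(2), and RAX fffffffffffffffc is -4, which is -EINTR. A small sketch of that decoding follows. The sketch_ names are hypothetical, and the x86-64 assumption is taken from the entry_SYSCALL_64 frames in the trace.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: the trace is from x86-64, where connect(2) is syscall 42. */
enum { SKETCH_X86_64_NR_CONNECT = 42 };

/*
 * Interpret a raw RAX value as a kernel error return.
 * The kernel encodes errors as small negative numbers (-1 .. -4095).
 * Returns the positive errno code, or 0 if RAX is not in that range.
 */
int sketch_rax_to_errno(uint64_t rax)
{
    int64_t v = (int64_t)rax;

    if (v < 0 && v >= -4095)
        return (int)-v;
    return 0;
}

/* Returns non-zero if ORIG_RAX names connect(2) on x86-64. */
int sketch_is_connect(uint64_t orig_rax)
{
    return orig_rax == SKETCH_X86_64_NR_CONNECT;
}
```

With the values from the dump, sketch_is_connect(0x2a) is true and sketch_rax_to_errno(0xfffffffffffffffc) yields EINTR, which fits a connect that was interrupted by the signal.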

Process stack:

[<0>] __lock_sock+0x7d/0xc0
[<0>] lock_sock_nested+0x48/0x50
[<0>] rfcomm_sk_state_change+0x2b/0x120 [rfcomm]
[<0>] __rfcomm_dlc_close+0x99/0x210 [rfcomm]
[<0>] rfcomm_dlc_close+0x6e/0xb0 [rfcomm]
[<0>] __rfcomm_sock_close+0x2e/0xe0 [rfcomm]
[<0>] rfcomm_sock_shutdown+0x65/0xa0 [rfcomm]
[<0>] rfcomm_sock_release+0x32/0xb0 [rfcomm]
[<0>] __sock_release+0x3d/0xa0
[<0>] sock_close+0x15/0x20
[<0>] __fput+0x89/0x240
[<0>] task_work_run+0x60/0x90
[<0>] do_exit+0x337/0xac0
[<0>] do_group_exit+0x31/0xa0
[<0>] get_signal+0x986/0x990
[<0>] arch_do_signal_or_restart+0x48/0x760
[<0>] exit_to_user_mode_prepare+0xd3/0x140
[<0>] syscall_exit_to_user_mode+0x26/0x50
[<0>] do_syscall_64+0x6b/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Replicated by Matt (CC'ed, running 5.15.39) on different hardware and
by Lloyd (CC'ed) on the same hardware, with the same stack trace.
Tested on up-to-date Arch Linux (5.18.0).

What hardware is that?

Let me know if you need anything else.

As many patches are also applied to the stable series, do you know whether this is a regression? Does it work with Linux 5.15(.0) or 5.10?


Kind regards,

Paul


--
Pete.

Only if you care: the standard signature delimiter has a trailing space, `-- ` [1].


[1]: https://en.wikipedia.org/wiki/Signature_block#Standard_delimiter


