答复: [PATCH] kdb: Fix the deadlock issue in KDB debugging.

Liuye <liu.yeC@xxxxxxx> · Fri, 1 Mar 2024 03:30:25 +0000

>On Wed, Feb 28, 2024 at 10:56:02AM +0800, LiuYe wrote:
>> master cpu : After executing the go command, a deadlock occurs.
>> slave cpu: may be performing thread migration,
>>         acquiring the running queue lock of master CPU.
>>         Then it was interrupted by kdb NMI and entered the nmi_handler process.
>>         (nmi_handle-> kgdb_nmicallback-> kgdb_cpu_enter
>>         while(1){ touch wathcdog}.)
>
>I think this description is a little short and doesn't clearly explain the cause. How about:
>
>Currently, if kgdboc includes 'kdb', then kgdboc will attempt to use schedule_work() to provoke a keyboard reset when transitioning out of the debugger and back to normal operation. This can cause deadlock because schedule_work() is not NMI-safe.
>
>The stack trace below shows an example of the problem. In this case the master cpu is not running from NMI but it has parked the slace CPUs using an NMI and the parked CPUs is holding spinlocks needed by schedule_work().
>
>

Due to the brevity of the description, there may be some misunderstanding, so a detailed description is provided as follows:

before KDB command “go”：

When a specific key is detected by the serial port, it will trigger kgdb_breakpoint, and the master CPU0 will enter the kdb_main_loop to process user commands in a loop.

kgdb_breakpoint
int3
do_int3
notify_die
atomic_notifier_call_chain
__atomic_notifier_call_chain
notifier_call_chain
kgdb_notify
__kgdb_notify
kgdb_handle_exception
kgdb_cpu_enter (kgdb_roundup_cpus send IPI to other slave CPU)
kdb_stub
kdb_main_loop

slave CPU1, CPU2, CPU3 ... and other CPUs:
Using CPU1 as an example:
Currently holding the running queue lock of the master CPU0 due to load_balance or other reasons, responding to the NMI sent by master CPU0 through kgdb_roundup_cpus. Enter the following stack:
nmi_handle
kgdb_nmicallback
kgdb_cpu_enter (The slave CPU1 will loop touch watchdog and wait for the master CPU0 to exit.)

The above is the state before executing the KDB command "go".

When the user executes the KDB command "go", it will trigger a deadlock.
master CPU0 :
kdb_main_loop return
kdb_stub return
kgdb_cpu_enter
kgdboc_post_exp_handler
queue_work_on
__queue_work
insert_work
wake_up_process
try_to_wake_up
_raw_spin_lock （Acquire the spin lock of master CPU0 rq->lock, but at this time the spin lock of master CPU0 is held by CPU1）

As a result, a deadlock has occurred.

Therefore, when the master CPU0 exits, if the rq->lock of CPU0 is locked, it should not wake up the worker on the system_wq.

>> example:
>>  BUG: spinlock lockup suspected on CPU#0, namex/10450
>>  lock: 0xffff881ffe823980, .magic: dead4ead, .owner: namexx/21888,
>> .owner_cpu: 1
>>  ffff881741d00000 ffff881741c01000 0000000000000000 0000000000000000
>>  ffff881740f58e78 ffff881741cffdd0 ffffffff8147a7fc ffff881740f58f20
>> Call Trace:
>>  [<ffffffff81479e6d>] ? __schedule+0x16d/0xac0  [<ffffffff8147a7fc>] ?
>> schedule+0x3c/0x90  [<ffffffff8147e71a>] ?
>> schedule_hrtimeout_range_clock+0x10a/0x120
>>  [<ffffffff8147d22e>] ? mutex_unlock+0xe/0x10  [<ffffffff811c839b>] ?
>> ep_scan_ready_list+0x1db/0x1e0  [<ffffffff8147e743>] ?
>> schedule_hrtimeout_range+0x13/0x20
>>  [<ffffffff811c864a>] ? ep_poll+0x27a/0x3b0  [<ffffffff8108c540>] ?
>> wake_up_q+0x70/0x70  [<ffffffff811c99a8>] ? SyS_epoll_wait+0xb8/0xd0
>> [<ffffffff8147f296>] ? entry_SYSCALL_64_fastpath+0x12/0x75
>>  CPU: 0 PID: 10450 Comm: namex Tainted: G           O    4.4.65 #1
>>  Hardware name: Insyde Purley/Type2 - Board Product Name1, BIOS 05.21.51.0036 07/19/2019
>>   0000000000000000 ffff881ffe813c10 ffffffff8124e883 ffff881741c01000
>>   ffff881ffe823980 ffff881ffe813c38 ffffffff810a7f7f ffff881ffe823980
>>   000000007d2b7cd0 0000000000000001 ffff881ffe813c68 ffffffff810a80e0
>>   Call Trace:
>>   <#DB>  [<ffffffff8124e883>] dump_stack+0x85/0xc2
>>   [<ffffffff810a7f7f>] spin_dump+0x7f/0x100
>>   [<ffffffff810a80e0>] do_raw_spin_lock+0xa0/0x150
>>   [<ffffffff8147eb55>] _raw_spin_lock+0x15/0x20
>>   [<ffffffff8108c256>] try_to_wake_up+0x176/0x3d0
>>   [<ffffffff8108c4c5>] wake_up_process+0x15/0x20
>>   [<ffffffff8107b371>] insert_work+0x81/0xc0
>>   [<ffffffff8107b4e5>] __queue_work+0x135/0x390
>>   [<ffffffff8107b786>] queue_work_on+0x46/0x90
>>   [<ffffffff81313d28>] kgdboc_post_exp_handler+0x48/0x70
>>   [<ffffffff810ed488>] kgdb_cpu_enter+0x598/0x610
>>   [<ffffffff810ed6e2>] kgdb_handle_exception+0xf2/0x1f0
>>   [<ffffffff81054e21>] __kgdb_notify+0x71/0xd0
>>   [<ffffffff81054eb5>] kgdb_notify+0x35/0x70
>>   [<ffffffff81082e6a>] notifier_call_chain+0x4a/0x70
>>   [<ffffffff8108304d>] notify_die+0x3d/0x50
>>   [<ffffffff81017219>] do_int3+0x89/0x120
>>   [<ffffffff81480fb4>] int3+0x44/0x80
>>
>> Signed-off-by: LiuYe <liu.yeC@xxxxxxx>
>> ---
>>  drivers/tty/serial/kgdboc.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c
>> index 7ce7bb164..945318ef1 100644
>> --- a/drivers/tty/serial/kgdboc.c
>> +++ b/drivers/tty/serial/kgdboc.c
>> @@ -22,6 +22,9 @@
>>  #include <linux/module.h>
>>  #include <linux/platform_device.h>
>>  #include <linux/serial_core.h>
>> +#include <linux/smp.h>
>> +
>> +#include "../kernel/sched/sched.h"
>>
>>  #define MAX_CONFIG_LEN         40
>>
>> @@ -399,7 +402,8 @@ static void kgdboc_post_exp_handler(void)
>>                 dbg_restore_graphics = 0;
>>                 con_debug_leave();
>>         }
>> -       kgdboc_restore_input();
>> +       if (!raw_spin_is_locked(&(cpu_rq(smp_processor_id())->lock)))
>> +               kgdboc_restore_input();
>
>I don't think solving this by access internal scheduler state is the right approach .
>
>The description I wrote above perhaps already suggests why. The deadlock occurs because it is unsafe to call schedule_work() from the debug trap handler. The debug trap handler in your stack trace is not running from an NMI but it certainly has NMI-like properties. Therefore a better fix is not to call schedule_work() at all from the debug trap handler.
>
>Instead we need to use an NMI-safe API such as irq_work_queue() and that irq_work can call schedule_work() and trigger the keyboard reset.
>
>
>Daniel.

-------------------------------------------------------------------------------------------------------------------------------------
本邮件及其附件含有新华三集团的保密信息，仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用（包括但不限于全部或部分地泄露、复制、
或散发）本邮件中的信息。如果您错收了本邮件，请您立即电话或邮件通知发件人并删除本
邮件！
This e-mail and its attachments contain confidential information from New H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!