RE: [PATCH] usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>
>On 21-05-24 10:56:23, Pawel Laszczak wrote:
>> >
>> >
>> >On 21-05-20 11:42:24, Pawel Laszczak wrote:
>> >> From: Pawel Laszczak <pawell@xxxxxxxxxxx>
>> >>
>> >> Patch fixes the critical issue caused by deadlock which has been detected
>> >> during testing NCM class.
>> >>
>> >> The root cause of issue is spin_lock/spin_unlock instruction instead
>> >> spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler
>> >> function.
>> >
>> >Pawel, would you please explain more about why the deadlock occurs with
>> >current code, and why this patch could fix it?
>> >
>>
>> I'm going to add such description to commit message:
>>
>> Lack of spin_lock_irqsave causes that during handling threaded
>> interrupt inside spin_lock/spin_unlock section the ethernet
>> subsystem invokes eth_start_xmit function from interrupt context
>> which in turn starts queueing free requests in cdnsp driver.
>> Consequently, the deadlock occurs inside cdnsp_gadget_ep_queue
>> on spin_lock_irqsave instruction. Replacing spin_lock/spin_unlock
>> with spin_lock_irqsave/spin_lock_irqrestor causes disableing
>
>%s/disableing/disabling
>
>> interrupts and blocks queuing requests by ethernet subsystem until
>> cdnsp_thread_irq_handler ends..
>>
>> I hope this description is sufficient.
>
>A call stack graph may be better, like [1]
>
>[1]: https://urldefense.com/v3/__https://www.spinics.net/lists/linux-
>usb/msg211931.html__;!!EHscmS1ygiU1lA!WCxrsHL4OWKd3iPdyymp-dQlVCoiHM9QzjIZPUC6-Dm6cnpU5CRLLOHrgiYSRA$

After putting online extra cpus, I've manage to catch call trace showing the deadlock issue:

[  223.051673] smp: csd: Detected non-responsive CSD lock (#1) on CPU#0, waiting 5000000015 ns for CPU#02 do_sync_core+0x0/0x30(0x0).
[  223.051690] smp:     csd: CSD lock (#1) unresponsive.
[  223.051693] Sending NMI from CPU 0 to CPUs 2:
[  223.052690] NMI backtrace for cpu 2
[  223.052692] CPU: 2 PID: 3053 Comm: irq/16-0000:01: Tainted: G           OE     5.11.0-rc1+ #5
[  223.052693] Hardware name: ASUS All Series/Q87T, BIOS 0908 07/22/2014
[  223.052693] RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0
[  223.052695] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 b8 00 02 00 00 81
[  223.052696] RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002
[  223.052698] RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000
[  223.052699] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658
[  223.052700] RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000
[  223.052701] R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
[  223.052702] R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658
[  223.052702] FS:  0000000000000000(0000) GS:ffff9ee89b400000(0000) knlGS:0000000000000000
[  223.052703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  223.052704] CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0
[  223.052705] Call Trace:
[  223.052706]  <IRQ>
[  223.052706]  do_raw_spin_lock+0xc0/0xd0
[  223.052707]  _raw_spin_lock_irqsave+0x95/0xa0
[  223.052708]  cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci]
[  223.052708]  usb_ep_queue+0x35/0x110
[  223.052709]  eth_start_xmit+0x220/0x3d0 [u_ether]
[  223.052710]  ncm_tx_timeout+0x34/0x40 [usb_f_ncm]
[  223.052711]  ? ncm_free_inst+0x50/0x50 [usb_f_ncm]
[  223.052711]  __hrtimer_run_queues+0xac/0x440
[  223.052712]  hrtimer_run_softirq+0x8c/0xb0
[  223.052713]  __do_softirq+0xcf/0x428
[  223.052713]  asm_call_irq_on_stack+0x12/0x20
[  223.052714]  </IRQ>
[  223.052715]  do_softirq_own_stack+0x61/0x70
[  223.052715]  irq_exit_rcu+0xc1/0xd0
[  223.052716]  sysvec_apic_timer_interrupt+0x52/0xb0
[  223.052717]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  223.052718] RIP: 0010:do_raw_spin_trylock+0x18/0x40
[  223.052719] Code: ff ff 66 90 eb 86 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 55 8b 07 48 89 e5 85 c0 75 29 ba 01 00 00 00 f0 0f b1 17 <75> 1e 65 8b 05 2f 77 ee 70 89 47 08 5d 65 48 8b 04 25 40 7e 01 00
[  223.052720] RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246
[  223.052721] RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000
[  223.052722] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658
[  223.052723] RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000
[  223.052724] R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
[  223.052725] R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000
[  223.052726]  _raw_spin_lock+0x3d/0x70
[  223.052726]  ? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
[  223.052727]  cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
[  223.052728]  ? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci]
[  223.052729]  ? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci]
[  223.052730]  ? irq_thread+0xa0/0x1c0
[  223.052730]  irq_thread_fn+0x28/0x60
[  223.052731]  irq_thread+0x105/0x1c0
[  223.052732]  ? __kthread_parkme+0x42/0x90
[  223.052732]  ? irq_forced_thread_fn+0x90/0x90
[  223.052733]  ? wake_threads_waitq+0x30/0x30
[  223.052734]  ? irq_thread_check_affinity+0xe0/0xe0
[  223.052735]  kthread+0x12a/0x160
[  223.052735]  ? kthread_park+0x90/0x90
[  223.052736]  ret_from_fork+0x22/0x30

Pawel

>
>Peter
>
>>
>> Thanks,
>> Pawel
>>
>> >>
>> >> Cc: stable@xxxxxxxxxxxxxxx
>> >> Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
>> >> Signed-off-by: Pawel Laszczak <pawell@xxxxxxxxxxx>
>> >> ---
>> >>  drivers/usb/cdns3/cdnsp-ring.c | 7 ++++---
>> >>  1 file changed, 4 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/drivers/usb/cdns3/cdnsp-ring.c b/drivers/usb/cdns3/cdnsp-ring.c
>> >> index 5f0513c96c04..68972746e363 100644
>> >> --- a/drivers/usb/cdns3/cdnsp-ring.c
>> >> +++ b/drivers/usb/cdns3/cdnsp-ring.c
>> >> @@ -1517,13 +1517,14 @@ irqreturn_t cdnsp_thread_irq_handler(int irq, void *data)
>> >>  {
>> >>  	struct cdnsp_device *pdev = (struct cdnsp_device *)data;
>> >>  	union cdnsp_trb *event_ring_deq;
>> >> +	unsigned long flags;
>> >>  	int counter = 0;
>> >>
>> >> -	spin_lock(&pdev->lock);
>> >> +	spin_lock_irqsave(&pdev->lock, flags);
>> >>
>> >>  	if (pdev->cdnsp_state & (CDNSP_STATE_HALTED | CDNSP_STATE_DYING)) {
>> >>  		cdnsp_died(pdev);
>> >> -		spin_unlock(&pdev->lock);
>> >> +		spin_unlock_irqrestore(&pdev->lock, flags);
>> >>  		return IRQ_HANDLED;
>> >>  	}
>> >>
>> >> @@ -1539,7 +1540,7 @@ irqreturn_t cdnsp_thread_irq_handler(int irq, void *data)
>> >>
>> >>  	cdnsp_update_erst_dequeue(pdev, event_ring_deq, 1);
>> >>
>> >> -	spin_unlock(&pdev->lock);
>> >> +	spin_unlock_irqrestore(&pdev->lock, flags);
>> >>
>> >>  	return IRQ_HANDLED;
>> >>  }
>> >> --
>> >> 2.25.1
>> >>
>> >
>> >--
>> >
>> >Thanks,
>> >Peter Chen
>>
>
>--
>
>Thanks,
>Peter Chen





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux