RE: USB: add check to detect host controller hardware removal

> -----Original Message-----
> From: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Sent: Tuesday, October 17, 2023 10:06 PM
> To: Li, Meng <Meng.Li@xxxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>; Ingo Molnar
> <mingo@xxxxxxxxxx>; USB mailing list <linux-usb@xxxxxxxxxxxxxxx>;
> Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>;
> linux-rt-users <linux-rt-users@xxxxxxxxxxxxxxx>
> Subject: Re: USB: add check to detect host controller hardware removal
> 
> On Tue, Oct 17, 2023 at 02:23:05AM +0000, Li, Meng wrote:
> > I did some debugs on my side.
> > Firstly, the local_irq_disable_nort() function had been removed from latest
> > RT kernel.
> 
> What's in the RT kernel doesn't matter here, because the code you're patching
> belongs to the vanilla kernel.
> 
> > Second, because of creating xhci-pci.c, the commit c548795abe0d("USB:
> > add check to detect host controller hardware removal") is no longer useful.
> > Because the function usb_remove_hcd() is invoked from xhci_pci_remove()
> > of file xhci-pci.c in advance.
> 
> What about for non-xHCI controllers?
> 

I will test non-xHCI controllers later if I can find one on my side.

> > I am trying to fix this issue with getting register status directly without
> > local_irq_disable().
> 
> Were you able to locate the original bug report?
> 
This is the original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=579093

My latest debug information is below.
When I removed the PCIe-USB card, the following exception call trace appeared on the next access to a host controller register:
Internal error: synchronous external abort: 0000000096000210 [#1] PREEMPT_RT SMP
Modules linked in:
CPU: 3 PID: 329 Comm: usb-storage Tainted: G        W          6.1.53-rt10-yocto-preempt-rt #1
Hardware name: LS1043A RDB Board (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : xhci_ring_ep_doorbell+0x78/0x120
lr : xhci_queue_bulk_tx+0x3b0/0x8a4
sp : ffff80000b0e3960
x29: ffff80000b0e3960 x28: ffff000004ce2290 x27: ffff000008e71100
x26: ffff000005718a80 x25: 0000000000000421 x24: 000000000000001f
x23: ffff000008e71108 x22: 0000000000000000 x21: ffff8000099e5100
x20: 0000000000000002 x19: 0000000000000004 x18: 0000000000000000
x17: 0000000000000008 x16: ffff00007b5cfb00 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
x11: 0000000000000001 x10: 0000000000000a40 x9 : ffff8000089b0b50
x8 : ffff0000057c9a00 x7 : 000000000000001f x6 : ffff0000056c8000
x5 : ffff800009714ca0 x4 : 0000000000000004 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff8000099e5108 x0 : ffff000004ce2290
Call trace:
 xhci_ring_ep_doorbell+0x78/0x120
 xhci_queue_bulk_tx+0x3b0/0x8a4
 xhci_urb_enqueue+0x274/0x510
 usb_hcd_submit_urb+0xc0/0x8b0
 usb_submit_urb+0x29c/0x5c0
 usb_stor_msg_common+0x9c/0x190
 usb_stor_bulk_transfer_buf+0x58/0x110
 usb_stor_Bulk_transport+0xdc/0x380
 usb_stor_invoke_transport+0x40/0x530
 usb_stor_transparent_scsi_command+0x18/0x24
 usb_stor_control_thread+0x20c/0x2a0
 kthread+0x12c/0x130
 ret_from_fork+0x10/0x20
Code: 540001cc 8b140aa1 d5033ebf b9000033 (b9400021) 
---[ end trace 0000000000000000 ]---
Because of the exception, the xhci->lock taken in xhci_urb_enqueue() is never released.
In this situation, if I remove the PCIe device with the command below
# echo 1 > /sys/bus/pci/devices/<PCIe ID>/remove
the code hangs on xhci->lock in the xhci_urb_dequeue() function.
Even if I follow commit c548795abe0d and move usb_hcd_irq(0, hcd) into xhci_pci_remove(),
the same exception ("Internal error: synchronous external abort") occurs when executing readl(&xhci->op_regs->status).
Although the code doesn't hang in that case, the exception aborts xhci_pci_remove() and the code that follows is not executed.
Because usb_hcd_irq(0, hcd) causes the exception, I think it should be removed.
In addition, suddenly removing a PCIe card is not a reasonable operation; it may damage the hardware.
So I think we don't need to add usb_hcd_irq(0, hcd) on the logical path of unbinding the PCIe driver.
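
For illustration only, here is a rough userspace sketch of the kind of removal check that the faked interrupt is meant to trigger. It assumes the interrupt handler treats an all-ones status value as "hardware gone", which is the usual convention for reads from a missing PCI device. The names fake_op_regs, fake_readl() and hc_looks_removed() are made up for this sketch and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; only the status word matters here. */
struct fake_op_regs {
    uint32_t command;
    uint32_t status;
};

/* Stand-in for readl(): a plain volatile load from ordinary memory. */
static uint32_t fake_readl(const volatile uint32_t *addr)
{
    assert(addr != NULL);
    return *addr;
}

/* True when the status read comes back as all ones (assumed "removed"). */
static bool hc_looks_removed(const struct fake_op_regs *regs)
{
    return fake_readl(&regs->status) == UINT32_MAX;
}
```

The sketch only works when the read itself completes. In the trace above the read raises a synchronous external abort instead of returning a value, so a check of this form is never reached on this board.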

Thanks,
Limeng

> Alan Stern




