Re: USB transaction errors causing RCU stalls and kernel panics

On Tue, Mar 03, 2020 at 03:05:50PM +0000, Jonas Karlsson wrote:
> Hi,
> 
> We have a board with an NXP i.MX8 SoC. We are running Linux 4.19.35 from NXP on the SoC.
> 
> There is a modem connected to the SoC via USB through a USB hub. 
> The modem presents itself as a cdc-acm device with 4 ttys.
> 
> Sometimes we end up in a situation where all transfers over USB generate "USB transaction errors".
> It is likely that the modem is misbehaving. When this happens we get a lot of "xhci-cdns3: ERROR unknown event type 37"
> in the terminal, indicating that the xhci event ring is full. This often leads to RCU stalls and sometimes kernel panics.
> 
> If I enable dynamic debug on xhci_hcd and cdc-acm I can see that all transfers have error code -71
> (-EPROTO, which in xhci translates to "USB transaction error"). When this happens it seems
> like xhci resets the ep, sets TR Deq Ptr to unstall the ep and then a new transfer is started,
> which also fails. This behavior generates a lot of events on the event ring, which causes
> "ERROR unknown event type 37". This loop of failing transfers seems to continue until we either unbind
> the USB driver or get a kernel panic. The SoC becomes almost unresponsive since it spends most of the
> time executing USB interrupts.
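
For readers decoding the status value quoted above: a minimal sketch, in Python rather than kernel C, of mapping a negative kernel-style status such as -71 to its errno name. The helper name describe_urb_status is hypothetical and is not part of any driver; the numeric value 71 for EPROTO is the Linux value and may differ on other systems.

```python
import errno
import os


def describe_urb_status(status: int) -> str:
    """Map a kernel-style status (0 or a negative errno) to a readable name."""
    if status == 0:
        return "OK"
    code = -status  # kernel code reports errors as negative errno values
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# On Linux, errno 71 is EPROTO, which matches the -71 seen in the debug output.
print(describe_urb_status(-71))
```
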
> 
> If I pull the reset pin of the USB hub and keep it in reset state at this point, the event loop of failing
> transfers continues even though there is nothing on the USB bus any longer. The only way to get out of
> that loop is to either unbind the USB driver or power cycle the board.
> 
> Is this the expected behavior when USB transaction errors happen for all transfers when using the cdc-acm class driver?
> Or could there be something wrong in the low-level USB driver (Cadence in our case)? We need to figure out why we
> get all the transaction errors, but we also need to make sure the kernel does not die on us when we have a misbehaving USB device.
> Does anyone have a suggestion on what we could do to improve the stability of the kernel in this situation?
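
One general way to keep a failing endpoint from flooding the system is to stop resubmitting immediately after every error, and instead back off and eventually give up. The following is only a sketch of that idea, written in Python for readability; it is not taken from cdc-acm, xhci or the Cadence driver, and the class name ResubmitPolicy, its thresholds and its delays are all assumptions chosen for illustration.

```python
import errno

EPROTO_STATUS = -errno.EPROTO  # -71 on Linux, the status seen in the report


class ResubmitPolicy:
    """Hypothetical policy: exponential backoff with a cap on consecutive errors."""

    def __init__(self, max_consecutive_errors=5, base_delay_ms=10, max_delay_ms=1000):
        self.max_consecutive_errors = max_consecutive_errors
        self.base_delay_ms = base_delay_ms
        self.max_delay_ms = max_delay_ms
        self.consecutive_errors = 0

    def on_completion(self, status):
        """Return the delay in ms before the next submission, or None to stop."""
        if status == 0:
            # A successful transfer clears the error streak.
            self.consecutive_errors = 0
            return 0
        self.consecutive_errors += 1
        if self.consecutive_errors > self.max_consecutive_errors:
            return None  # give up instead of looping forever
        delay = self.base_delay_ms * 2 ** (self.consecutive_errors - 1)
        return min(delay, self.max_delay_ms)


# Simulate a device for which every transfer fails with -EPROTO.
policy = ResubmitPolicy()
delays = []
while True:
    delay = policy.on_completion(EPROTO_STATUS)
    if delay is None:
        break
    delays.append(delay)
print(delays)  # [10, 20, 40, 80, 160]
```

With these assumed defaults the simulated device gets five retries at growing intervals and is then left alone, so the number of completion events stays bounded regardless of how long the device keeps misbehaving.
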

I would blame the xhci-cdns driver as it is the one controlling all of
this.

I don't see this driver in the 4.19 tree, so I think you are going to
have to get support from the company that provided you with that driver
as you are already paying for that support from them :)

good luck!

greg k-h
