Re: Debugging usb core/xhci issue

On 19-11-22 10:42:26, Bryan Gillespie wrote:
> Hello,
> 
> My name is Bryan Gillespie (RPGillespie6 on GitHub). I'm emailing
> because I'm completely stumped on how to approach debugging a
> USB-related issue on an embedded linux setup, and I'm hoping someone
> here might at least be able to give some high-level ideas on how to
> approach debugging it. Also, I've never used mailing lists before,
> so let me know if this is completely out of line.
> 
> Basically, I have a marvell a3700 soc running embedded linux (linux
> version 4.4) connected to a Qualcomm modem (linux version 3.18) via
> USB 3.0 traces on a PCB. The Qualcomm modem enumerates as a device on
> the a3700 with 6 interfaces and 14 endpoints. There are various
> drivers that are applied to the usb interfaces, from qcserial to
> qmi_wwan, to adb (userspace), to ipcrtr (normally not a usb driver but
> has usb xprt added).
> 
> Everything seems to work perfectly fine until I start putting the
> system under higher load for longer periods of time. For example, if I
> run iperf traffic through the qmi_wwan/usbnet interface (20 MB up, 200
> MB down) and send control traffic periodically through ipc router
> interface, eventually (~1-3 hours) there is some kind of breakage and
> nothing usb-related works anymore for that device. Not even adb works
> even though it has its own dedicated interface (adb shell just hangs
> indefinitely, for example).
> 
> **This leads me to believe something in linux's usbcore or xhci
> somehow got foobared by an interface driver since those are the common
> layers shared by all usb interfaces?**
> 
> I don't understand these layers well enough to know what that could
> possibly be. I should also mention that sometimes (not always) there
> is a single dmesg trace that happens at the time of breakage in the
> a3700:
> 
> [ 3771.097658] ipcrtr_read_cb Connection Reset 7 urb status -71
> 
> ipcrtr_read_cb is the urb complete callback and -71 is the feared
> -EPROTO urb code.
> 

This is usually a hardware error.
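
For reference, the number in that log line can be mapped back to its
errno name with a few lines of script. This is only a sketch: the
helper name describe_urb_status is made up for illustration, and it
assumes the machine running it uses the same errno numbering as the
target (on most Linux architectures 71 is EPROTO).

```python
import errno
import os


def describe_urb_status(status: int) -> str:
    """Map a negative kernel-style URB status to 'NAME (message)'.

    Hypothetical helper for illustration; relies on the local errno table.
    """
    code = -status
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# The status from the dmesg line in the report.
print(describe_urb_status(-71))
```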

> If I issue USBDEVFS_RESET to the device with ioctl inside the a3700,
> everything starts magically working again (presumably because all the
> data structures/buffers/etc. in xhci and above are reset and all the
> interfaces are re-probed?). I am pretty sure (but not positive) it is
> not the modem's fault since qualcomm's provided reference processor
> seems to be able to run iperf traffic indefinitely.
> 
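
A rough sketch of that reset as a script, in case it is useful as a
stopgap while debugging. It assumes the usual usbfs node layout under
/dev/bus/usb/BBB/DDD; the function names are made up for illustration,
and the bus and device numbers have to be read from lsusb or sysfs on
the board. The ioctl number is _IO('U', 20) from
<linux/usbdevice_fs.h>.

```python
import fcntl
import os

# USBDEVFS_RESET is _IO('U', 20) in <linux/usbdevice_fs.h>.
USBDEVFS_RESET = (ord("U") << 8) | 20


def usbfs_path(bus: int, dev: int) -> str:
    """Build the usbfs node path for a bus/device pair (assumed layout)."""
    return f"/dev/bus/usb/{bus:03d}/{dev:03d}"


def reset_usb_device(bus: int, dev: int) -> None:
    """Issue USBDEVFS_RESET on the device node; needs write permission."""
    fd = os.open(usbfs_path(bus, dev), os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)


# Example (bus and device numbers are placeholders):
#   reset_usb_device(2, 3)
```

Note that this only recovers the link; it does not explain why the
transfers stopped in the first place.
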
> I should mention that the a3700 processor is very limited on memory;
> it only has about 160 MB of total memory (DRAM) available to linux
> compared to Qualcomm's reference processor which has 4 GB memory (and
> is running linux version 3.10).
> 
> If you've made it this far in my email, my question is - how would you
> approach debugging this? Are there some key things you would check?
> Are there any known gotchas with linux 4.X as host and linux 3.X as
> device? It is not easily reproducible (at least not without waiting a
> long time - currently exploring if it is possible to cause the issue
> faster somehow). I have ftrace enabled, but so far I haven't been able
> to get a trace that captures the exact window of breakage. I tried
> turning on all usb-related debug with dynamic debug as well, but this
> appears to cause the kernel to consume 100% cpu as soon as I start
> iperf so currently I'm trying to identify some key files to turn on
> traces for that hopefully won't overwhelm the cpu with logging.
> 

Hi Bryan,

Your kernels on both the host and the device are too old. The xHCI
driver has improved a lot in recent years. Would you please try the
newest kernel your hardware supports and see if anything changes? It
is easier for the driver maintainers to give hints against a newer
kernel.
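
On the logging overhead and the missed trace window, one possible
approach is to enable dynamic debug per file instead of for all of
usb, and to stop the ftrace ring buffer as soon as the error line
shows up in the kernel log. The sketch below assumes debugfs is
mounted at /sys/kernel/debug; the helper names are made up, and
xhci-ring.c is only an example starting point, not a known culprit.

```python
DYNDBG_CONTROL = "/sys/kernel/debug/dynamic_debug/control"
TRACING_ON = "/sys/kernel/debug/tracing/tracing_on"


def dyndbg_queries(files, enable=True):
    """Build one dynamic debug query per source file name."""
    flag = "+p" if enable else "-p"
    return [f"file {name} {flag}" for name in files]


def apply_dyndbg(queries, control=DYNDBG_CONTROL):
    """Write each query to the dynamic debug control file (needs root)."""
    for query in queries:
        with open(control, "w") as f:
            f.write(query + "\n")


def freeze_trace_on(lines, pattern, tracing_on=TRACING_ON):
    """Stop ftrace recording at the first log line containing pattern.

    `lines` is any iterable of kernel log lines, for example an open
    handle on /dev/kmsg. Returns the matching line, or None if the
    input ends without a match.
    """
    for line in lines:
        if pattern in line:
            with open(tracing_on, "w") as f:
                f.write("0\n")
            return line
    return None


# Example (run as root on the a3700):
#   apply_dyndbg(dyndbg_queries(["xhci-ring.c"]))
#   with open("/dev/kmsg", errors="replace") as log:
#       freeze_trace_on(log, "urb status -71")
```

Once tracing_on is 0, the ring buffer keeps the events leading up to
the failure until you read them out.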

-- 

Thanks,
Peter Chen


