Re: usb: dwc3: HC dies under high I/O load on Exynos5422

Hi all,

I've done a few more tests. I'm also adding the information requested
by the DWC3 documentation, which I had missed previously.

> The OS is then unable to recover (I have rootfs on that SSD too) and
> the board must be manually restarted.

I have worked around this by creating a fresh Ubuntu 20.04 rootfs on an
SD card. The system now survives the controller crash. The xHC can also
be brought up again by unbinding the dwc3 driver and then binding it back.

> Dmesg contains the following output:
> < ... >

It turns out that this was not the full relevant output. I was
collecting the logs from a serial console and I had not properly
enabled verbose printing. Hopefully the full dmesg is now linked below.

> The crash is happening when the USB-SATA bridge is controlled by the
> uas driver. I have not tested the usb-storage driver yet.

I have tested this now. With usb-storage the controller is stable, but
the achievable throughput is lower (75 MB/s with BOT vs. 300 MB/s with
UAS).

---

With the rootfs on the SD card, I was able to capture a DWC3 event
trace and register dumps. I am running a clean 6.4-rc6 kernel with a
config similar to multi_v7_defconfig (see below for details).

To capture the trace, I followed these steps:

 1. Unbind the DWC3 driver from the controller (12000000.usb).
 2. Enable DWC3 tracing.
 3. Bind the DWC3 driver back.
 4. Save the DWC3 register dump to "regdump-before-fio.txt".
 5. Run the FIO stress test from the first email. Once FIO stops
    printing IOPS, dump registers again to "regdump-during-freeze.txt".
 6. Once FIO exits and the kernel prints the "HC died" message,
    dump registers once more to "regdump-after-hc-died.txt".
 7. Save the current trace buffer to "trace.txt".
 8. Save the current kernel log to "dmesg.txt".

I had to do the DWC3 unbind-bind dance because I have no way of
unplugging the onboard JMS578 bridge from the main Exynos chip.
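
For reference, the steps above can be sketched roughly as the shell
helpers below. This is only an illustration, not the exact commands I
ran: the driver directory, the tracefs mount point and the regdump
location are assumptions about a typical setup and may differ on other
kernels or configurations, and the helper names are made up for this
sketch. Only the device name and the output file names come from the
steps above.

```shell
#!/bin/sh
# Hypothetical helpers for the capture steps; every path below is an assumption.
DEV=${DEV:-12000000.usb}                            # controller name from step 1
DRV=${DRV:-/sys/bus/platform/drivers/dwc3}          # assumed dwc3 driver directory
TRACEFS=${TRACEFS:-/sys/kernel/tracing}             # assumed tracefs mount point
REGDUMP=${REGDUMP:-/sys/kernel/debug/usb/$DEV/regdump}  # assumed debugfs regdump file
OUT=${OUT:-.}                                       # where the output files go

dwc3_unbind()  { echo "$DEV" > "$DRV/unbind"; }               # step 1
trace_enable() { echo 1 > "$TRACEFS/events/dwc3/enable"; }    # step 2
dwc3_bind()    { echo "$DEV" > "$DRV/bind"; }                 # step 3
save_regdump() { cat "$REGDUMP" > "$OUT/$1"; }                # steps 4, 5, 6
save_trace()   { cat "$TRACEFS/trace" > "$OUT/trace.txt"; }   # step 7
save_dmesg()   { dmesg > "$OUT/dmesg.txt"; }                  # step 8
```

Run as root, the sequence would then be: dwc3_unbind, trace_enable,
dwc3_bind, save_regdump regdump-before-fio.txt, start FIO,
save_regdump regdump-during-freeze.txt once the IOPS output stops,
save_regdump regdump-after-hc-died.txt after the "HC died" message,
and finally save_trace and save_dmesg.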

The resulting files, including the kernel config, are in the attached
tarball (I kept ARM_EXYNOS_BUS_DEVFREQ enabled this time).
dmesg.txt is also available at https://pastebin.com/EkfXKMih .

I am not 100% sure this is not a hardware fault. However, there are a
few Exynos5422-based Odroid users experiencing a similar issue. Most of
them mention kernel 5.4, which does contain the bisected bad commit.
 - https://forum.odroid.com/viewtopic.php?t=42630 (my report,
   but other people there have the same issue)
 - https://forum.odroid.com/viewtopic.php?t=46409
 - https://forum.armbian.com/topic/20582-odroid-xu4-usb-sata-ssd-drive-random-disconnect/

Please let me know if you need more information.

Thank you,
Jakub Vanek


Attachment: dwc3-logs.tar.gz
Description: application/compressed-tar

