Hi Nathan,

Thanks for your fix.

On Thu, Nov 2, 2023 at 5:24 AM Nathan Rossi <nathan@xxxxxxxxxxxxxxx> wrote:
>
> From: Nathan Rossi <nathan.rossi@xxxxxxxx>
>
> The i.MX8MP and i.MX8MQ devices both use the same DWC3 controller and
> are both affected by a known issue with the controller due to specific
> behaviour when park mode is enabled in SuperSpeed host mode operation.
>
> Under heavy USB traffic from multiple endpoints the controller will
> sometimes incorrectly process transactions such that some transactions
> are lost, or the controller may hang when processing transactions. When
> the controller hangs it does not recover.
>
> This issue is documented partially within the linux-imx vendor kernel
> which references a Synopsys STAR number 9001415732 in commits [1] and
> additional details in [2]. Those commits provide some additional
> controller internal implementation specifics around the incorrect
> behaviour of the SuperSpeed host controller operation when park mode is
> enabled.
>
> The summary of this issue is that the host controller can incorrectly
> enter/exit park mode such that part of the controller is in a state
> which behaves as if in park mode even though it is not. In this state
> the controller incorrectly calculates the number of TRBs available which
> results in incorrect access of the internal caches causing the overwrite
> of pending requests in the cache which should have been processed but
> are ignored. This can cause the controller to drop the requests or hang
> waiting for the pending state of the dropped requests.
>
> The workaround for this issue is to disable park mode for SuperSpeed
> operation of the controller through the GUCTL1[17] bit. This is already
> available as a quirk for the DWC3 controller and can be enabled via the
> 'snps,parkmode-disable-ss-quirk' device tree property.
>
> It is possible to replicate this failure on an i.MX8MP EVK with a USB
> Hub connecting 4 SuperSpeed USB flash drives. Performing continuous
> small read operations (dd if=/dev/sd... of=/dev/null bs=16) on the block
> devices will result in device errors initially and will eventually
> result in the controller hanging.
>
> [13240.896936] xhci-hcd xhci-hcd.0.auto: WARN Event TRB for slot 4 ep 2 with no TDs queued?
> [13240.990708] usb 2-1.3: reset SuperSpeed USB device number 5 using xhci-hcd
> [13241.015582] sd 2:0:0:0: [sdc] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
> [13241.025198] sd 2:0:0:0: [sdc] tag#0 CDB: opcode=0x28 28 00 00 00 03 e0 00 01 00 00
> [13241.032949] I/O error, dev sdc, sector 992 op 0x0:(READ) flags 0x80700 phys_seg 25 prio class 2
> [13272.150710] usb 2-1.2: reset SuperSpeed USB device number 4 using xhci-hcd
> [13272.175469] sd 1:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x03 driverbyte=DRIVER_OK cmd_age=31s
> [13272.185365] sd 1:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 00 00 03 e0 00 01 00 00
> [13272.193385] I/O error, dev sdb, sector 992 op 0x0:(READ) flags 0x80700 phys_seg 18 prio class 2
> [13434.846556] xhci-hcd xhci-hcd.0.auto: xHCI host not responding to stop endpoint command
> [13434.854592] xhci-hcd xhci-hcd.0.auto: xHCI host controller not responding, assume dead
> [13434.862553] xhci-hcd xhci-hcd.0.auto: HC died; cleaning up
>
> [1] https://github.com/nxp-imx/linux-imx/commit/97a5349d936b08cf301730b59e4e8855283f815c
> [2] https://github.com/nxp-imx/linux-imx/commit/b4b5cbc5a12d7c3b920d1d7cba0ada3379e4e42b

It is a shame that NXP fixed it only in their vendor tree and kept
mainline with the issue.

This deserves a Fixes tag so that it can be backported to stable kernels.

Reviewed-by: Fabio Estevam <festevam@xxxxxxxxx>

Thanks
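
For reference, the workaround described in the quoted commit message comes down to setting bit 17 of GUCTL1. Below is a rough, standalone sketch of that bit operation only. The macro and function names are made up for illustration and are not taken from the dwc3 driver. It operates on a plain integer rather than on real hardware, so it only shows which bit the 'snps,parkmode-disable-ss-quirk' property is said to control.

```c
#include <stdint.h>

/*
 * Bit position taken from the commit message above: GUCTL1[17].
 * The names below are hypothetical, chosen for this sketch only.
 */
#define SKETCH_GUCTL1_PARKMODE_DISABLE_SS (UINT32_C(1) << 17)

/* Return the given GUCTL1 value with SuperSpeed park mode disabled. */
static uint32_t sketch_guctl1_disable_ss_parkmode(uint32_t guctl1)
{
    /* Read-modify-write style: keep all other bits, set bit 17. */
    return guctl1 | SKETCH_GUCTL1_PARKMODE_DISABLE_SS;
}

/* Report whether the SuperSpeed park mode disable bit is set. */
static int sketch_guctl1_ss_parkmode_disabled(uint32_t guctl1)
{
    return (guctl1 & SKETCH_GUCTL1_PARKMODE_DISABLE_SS) != 0;
}
```

In a real driver the value would be read from and written back to the controller register; that part is deliberately left out here.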