On Sat, Jul 4, 2020 at 7:38 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
> John Stultz <john.stultz@xxxxxxxxxx> writes:
> > On Fri, Jul 3, 2020 at 2:54 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
> >> John Stultz <john.stultz@xxxxxxxxxx> writes:
> >> > I was curious if you or anyone else had any thoughts on how to debug
> >> > this further?
> >>
> >> Try enabling dwc3 tracepoints and collecting working and failing
> >> cases. If I were to guess, I would say there's a small race condition
> >> between setting pullup and the transceiver sending the VBUS_VALID signal
> >> to dwc3.
> >
> > Trace logs attached. Let me know if you have any further ideas!
>
> You can see from failure case that we never got a Reset event. This
> happens, for instance, when dwc3 doesn't know that VBUS is above
> VBUS_VALID threshold (4.4V). When the problem happens, I'm assuming USB
> is completely dead, meaning that keeping the cable connected for longer
> won't change anything, right?

Correct. The only way to get it working is to unplug and replug the
cable (sometimes more than once).

> In that case, could you dump DWC3 registers (there's a debugfs interface
> for that)? I'm mostly interested in the PHY registers, both USB2 and
> USB3. Check if the PHYs are suspended in the error case.

Here's a diff of the regdump in bad and good cases:

--- regdump.bad 2020-07-07 03:44:46.799514793 +0000
+++ regdump.good 2020-07-07 03:44:44.723534198 +0000
@@ -24,7 +24,7 @@
 GHWPARAMS7 = 0x04881e8d
 GDBGFIFOSPACE = 0x00420000
 GDBGLTSSM = 0x41090440
-GDBGBMU = 0xa0b08000
+GDBGBMU = 0x20300000
 GPRTBIMAP_HS0 = 0x00000000
 GPRTBIMAP_HS1 = 0x00000000
 GPRTBIMAP_FS0 = 0x00000000
@@ -162,29 +162,29 @@
 GEVNTSIZ(0) = 0x00001000
 GEVNTCOUNT(0) = 0x00000000
 GHWPARAMS8 = 0x00000fea
-DCFG = 0x00120804
-DCTL = 0x80f00000
+DCFG = 0x0052082c
+DCTL = 0x8cf00a00
 DEVTEN = 0x00001217
-DSTS = 0x00000000
+DSTS = 0x00820000
 DGCMDPAR = 0x00000000
 DGCMD = 0x00000000
-DALEPENA = 0x00000003
+DALEPENA = 0x0000000f
 DEPCMDPAR2(0) = 0x00000000
-DEPCMDPAR1(0) = 0x17a8e000
+DEPCMDPAR1(0) = 0x15935000
 DEPCMDPAR0(0) = 0x00000002
 DEPCMD(0) = 0x00000006
 DEPCMDPAR2(1) = 0x00000000
-DEPCMDPAR1(1) = 0x02000500
-DEPCMDPAR0(1) = 0x00001000
-DEPCMD(1) = 0x00000001
+DEPCMDPAR1(1) = 0x15935000
+DEPCMDPAR0(1) = 0x00000002
+DEPCMD(1) = 0x00010006
 DEPCMDPAR2(2) = 0x00000000
 DEPCMDPAR1(2) = 0x00000000
-DEPCMDPAR0(2) = 0x00000001
-DEPCMD(2) = 0x00030002
+DEPCMDPAR0(2) = 0x00000000
+DEPCMD(2) = 0x00020007
 DEPCMDPAR2(3) = 0x00000000
 DEPCMDPAR1(3) = 0x00000000
-DEPCMDPAR0(3) = 0x00000001
-DEPCMD(3) = 0x00040002
+DEPCMDPAR0(3) = 0x00000000
+DEPCMD(3) = 0x00030007
 DEPCMDPAR2(4) = 0x00000000
 DEPCMDPAR1(4) = 0x00000000
 DEPCMDPAR0(4) = 0x00000001

> If they are, try enabling the quirk flags that disable suspend for the
> PHYs (check binding documentation). If that helps, then discuss with
> your Silicon Validation guys what are the requirements when it comes to
> suspend. Some PHYs are inherently quirky and need some of the quirky
> flags dwc3 provides.
>
> Note that disabling suspend completely is a pretty large hammer that
> should only be used if nothing else helps.
> Some PHYs are happy with a
> simple delay of U1/U2/U3 entry but, again, check with your Silicon
> Validation folks, likely they have already gone through this during chip
> characterization.

Unfortunately I don't have any access to silicon validation folks.
There is already a number of the quirk bindings in use, but I'll
tinker around with them a bit to see if it causes any behavior change.

Thanks so much for the ideas and feedback! Much appreciated!
-john
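
For anyone repeating the "are the PHYs suspended?" check on their own
dumps, here is a rough sketch of how it could be scripted. It assumes the
regdump text uses the "NAME = 0xVALUE" layout seen in the diff in this
thread, that the PHY registers show up as GUSB2PHYCFG(n) and
GUSB3PIPECTL(n), and that the suspend bits are bit 6 and bit 17
respectively, as in the Linux dwc3 driver's core.h. The function names
are made up for illustration; please verify the bit positions against
your kernel tree and the controller databook before trusting the result.

```python
import re

# Suspend-enable bit positions, assumed from the Linux dwc3 driver's core.h
# (DWC3_GUSB2PHYCFG_SUSPHY and DWC3_GUSB3PIPECTL_SUSPHY). Verify locally.
GUSB2PHYCFG_SUSPHY = 1 << 6
GUSB3PIPECTL_SUSPHY = 1 << 17

# Matches lines such as "DSTS = 0x00820000" or "GUSB2PHYCFG(0) = 0x00002400".
LINE_RE = re.compile(r"^\s*([A-Z0-9_]+(?:\(\d+\))?)\s*=\s*0x([0-9a-fA-F]+)\s*$")


def parse_regdump(text):
    """Turn regdump text into a {register_name: value} dict."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def phy_suspend_state(regs):
    """Report, per PHY config register, whether its suspend bit is set."""
    state = {}
    for name, value in regs.items():
        if name.startswith("GUSB2PHYCFG"):
            state[name] = bool(value & GUSB2PHYCFG_SUSPHY)
        elif name.startswith("GUSB3PIPECTL"):
            state[name] = bool(value & GUSB3PIPECTL_SUSPHY)
    return state


def changed_registers(bad, good):
    """Return {name: (bad_value, good_value)} for registers that differ."""
    return {n: (bad[n], good[n]) for n in bad if n in good and bad[n] != good[n]}
```

Feeding the contents of regdump.bad and regdump.good through
parse_regdump() and then phy_suspend_state() gives a quick yes/no per PHY
register, and changed_registers() reproduces the diff above in a form
that is easier to post-process.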