Ferry Toth wrote: > Hi > > Op 21-04-2021 om 21:01 schreef Thinh Nguyen: >> Ferry Toth wrote: >>> Hi >>> >>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen: >>>> Ferry Toth wrote: >>>>> Hi >>>>> >>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen: >>>>>> Ferry Toth wrote: >>>>>>> Hi >>>>>>> >>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth: >>>>>>>> Hi >>>>>>>> >>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen: >>>>>>>>> Ferry Toth wrote: >>>>>>>>>> Hi >>>>>>>>>> >>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen: >>>>>>>>>>> Thinh Nguyen wrote: >>>>>>>>>>>> From: Yu Chen <chenyu56@xxxxxxxxxx> >>>>>>>>>>>> From: John Stultz <john.stultz@xxxxxxxxxx> >>>>>>>>>>>> >>>>>>>>>>>> According to the programming guide, to switch mode for DRD >>>>>>>>>>>> controller, >>>>>>>>>>>> the driver needs to do the following. >>>>>>>>>>>> >>>>>>>>>>>> To switch from device to host: >>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset >>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode) >>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET >>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence >>>>>>>>>>>> >>>>>>>>>>>> To switch from host to device: >>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset >>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode) >>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst >>>>>>>>>>>> 4. Then follow up with the initializing registers sequence >>>>>>>>>>>> >>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and >>>>>>>>>>>> step >>>>>>>>>>>> 3) of >>>>>>>>>>>> switching from host to device. John Stult reported a lockup >>>>>>>>>>>> issue >>>>>>>>>>>> seen >>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is >>>>>>>>>>>> observed >>>>>>>>>>>> with Ferry's testing platform[2]. >>>>>>>>>>>> >>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's >>>>>>>>>>>> and John >>>>>>>>>>>> Stultz's version. The main fixes to their versions are the >>>>>>>>>>>> missing >>>>>>>>>>>> wait >>>>>>>>>>>> for clocks synchronization before clearing >>>>>>>>>>>> GCTL.CoreSoftReset and >>>>>>>>>>>> only >>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device. >>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@xxxxxxxxxx/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> [2] >>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@xxxxxxxxx/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$ >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@xxxxxxxxx> >>>>>>>>>>>> Cc: Ferry Toth <fntoth@xxxxxxxxx> >>>>>>>>>>>> Cc: Wesley Cheng <wcheng@xxxxxxxxxxxxxx> >>>>>>>>>>>> Cc: <stable@xxxxxxxxxxxxxxx> >>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() >>>>>>>>>>>> work >>>>>>>>>>>> properly") >>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@xxxxxxxxxx> >>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx> >>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> >>>>>>>>>>>> --- >>>>>>>>>>>> Changes in v3: >>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow >>>>>>>>>>>> - Remove condition for OTG support only since the device can >>>>>>>>>>>> still be >>>>>>>>>>>> configured DRD host/device mode only >>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only >>>>>>>>>>>> applies >>>>>>>>>>>> when >>>>>>>>>>>> hw_mode is DRD >>>>>>>>>>>> Changes in v2: >>>>>>>>>>>> - Initialize mutex per device and not as global mutex. >>>>>>>>>>>> - Add additional checks for DRD only mode >>>>>>>>>>>> >>>>>>>>>>>> drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++ >>>>>>>>>>>> drivers/usb/dwc3/core.h | 5 +++++ >>>>>>>>>>>> 2 files changed, 32 insertions(+) >>>>>>>>>>>> >>>>>>>>>>> Hi John, >>>>>>>>>>> >>>>>>>>>>> If possible, can you run a test with this version on your >>>>>>>>>>> platform? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Thinh >>>>>>>>>>> >>>>>>>>>> I tested this on edison-arduino with this patch on top of >>>>>>>>>> usb-next >>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent >>>>>>>>>> throttling"). >>>>>>>>>> >>>>>>>>>> On this platform there is a physical switch to switch roles. With >>>>>>>>>> this >>>>>>>>>> patch I find: >>>>>>>>>> >>>>>>>>>> - switch to host mode always works fine >>>>>>>>>> >>>>>>>>>> - switch to gadget mode I need to flip the switch 3x >>>>>>>>>> (gadget-host-gadget). >>>>>>>>>> >>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto: >>>>>>>>>> timed >>>>>>>>>> out >>>>>>>>>> waiting for SETUP phase" appears, but then the device connects >>>>>>>>>> to my >>>>>>>>>> PC, >>>>>>>>>> no throttling. >>>>>>>>>> >>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug >>>>>>>>>> the >>>>>>>>>> cable. >>>>>>>>>> >>>>>>>>>> No error message and connects fine. >>>>>>>>>> >>>>>>>>>> - if I flip the switch only once, on the PC side I get: >>>>>>>>>> >>>>>>>>>> kernel: usb 1-5: new high-speed USB device number 18 >>>>>>>>>> usingxhci_hcd >>>>>>>>>> kernel: usb 1-5: New USB device found, idVendor=1d6b, >>>>>>>>>> idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB >>>>>>>>>> device >>>>>>>>>> strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5: >>>>>>>>>> Product: >>>>>>>>>> USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory >>>>>>>>>> kernel: >>>>>>>>>> usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5: >>>>>>>>>> can't >>>>>>>>>> set >>>>>>>>>> config #1, error -110 >>>>>>>>> The device failed at set_configuration() request and timed out. It >>>>>>>>> probably timed out from the status stage looking at the device err >>>>>>>>> print. >>>>>>>>> >>>>>>>>>> Then if I wait long enough on the gadget side I get: >>>>>>>>>> >>>>>>>>>> root@yuna:~# ifconfig >>>>>>>>>> >>>>>>>>>> usb0: >>>>>>>>>> flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu >>>>>>>>>> 1500 >>>>>>>>>> inet 169.254.119.239 netmask 255.255.0.0 broadcast >>>>>>>>>> 169.254.255.255 >>>>>>>>>> inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid >>>>>>>>>> 0x20<link> >>>>>>>>>> ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX >>>>>>>>>> packets >>>>>>>>>> 490424 >>>>>>>>>> bytes 735146578 (701.0 MiB) RX errors 0 dropped191 >>>>>>>>>> overruns 0 >>>>>>>>>> frame >>>>>>>>>> 0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0 >>>>>>>>>> dropped 0 >>>>>>>>>> overruns 0 carrier 0 collisions 0 >>>>>>>>>> >>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0 >>>>>>>>>> broadcast >>>>>>>>>> 10.42.0.255) >>>>>>>>>> >>>>>>>>>> So much improved now, but it seems I am still missing >>>>>>>>>> something on >>>>>>>>>> plug. >>>>>>>>>> >>>>>>>>> That's great! We can look at it further. Can you capture the >>>>>>>>> tracepoints >>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see >>>>>>>>> if the >>>>>>>>> result is the same? >>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem >>>>>>>> fails, >>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of >>>>>>>> popups when they appear). >>>>>>>> >>>>>>>> So either all works, or all fails. >>>>>>>> >>>>>>>> I'll trace this later today. >>>>>>> Trace capturing switch from host-> gadget here >>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> (Issue history: >>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$ >>>>>>> >>>>>>> >>>>>>> >>>>>>> ) >>>>>>> >>>>>>> On the PC side this resulted to: >>>>>>> >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device >>>>>>> number 12 using xhci_hcd >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found, >>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00 >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings: >>>>>>> Mfr=1, >>>>>>> Product=2, SerialNumber=3 >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory >>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber: >>>>>>> 0123456789abcdef >>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error >>>>>>> -110 >>>>>>> >>>>>>> >>>>>>> Thanks for all your help! >>>>>>> >>>>>> Looks like it's LPM related again. To confirm, try this: >>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable" >>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk") >>>>> Yes, I confirm this helps. >>>>> >>>>> Note: on startup I was in host mode, with gadget cable plugged. The >>>>> first switch to gadget didn't work, all subsequent switches did >>>>> work, as >>>>> well as unplug/plug the cable. >>>>> >>>>>> Make sure that your testing kernel has this patch [1] >>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk") >>>>>> >>>>>> [1] >>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> The failure you saw was probably due the gadget function attempting >>>>>> to start a delayed status stage of the SET_CONFIGURATION request. >>>>>> By this time, the host already put the device in low power. >>>>>> >>>>>> The START_TRANSFER command needs to be executed while the device >>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state >>>>>> to check for link state because we only enable link state change >>>>>> interrupt for some controller versions. >>>>>> >>>>>> Once you confirms disabling LPM works, try this fix: >>>>>> >>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>>> index 6227641f2d31..06cdec79244e 100644 >>>>>> --- a/drivers/usb/dwc3/gadget.c >>>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep >>>>>> *dep, >>>>>> unsigned int cmd, >>>>>> if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) { >>>>>> int needs_wakeup; >>>>>> + u8 link_state; >>>>>> - needs_wakeup = (dwc->link_state == >>>>>> DWC3_LINK_STATE_U1 || >>>>>> - dwc->link_state == >>>>>> DWC3_LINK_STATE_U2|| >>>>>> - dwc->link_state == >>>>>> DWC3_LINK_STATE_U3); >>>>>> + reg = dwc3_readl(dwc->regs, DWC3_DSTS); >>>>>> + link_state = DWC3_DSTS_USBLNKST(reg); >>>>>> + >>>>>> + needs_wakeup = (link_state == DWC3_LINK_STATE_U1 || >>>>>> + link_state == DWC3_LINK_STATE_U2 || >>>>>> + link_state == DWC3_LINK_STATE_U3); >>>>>> if (unlikely(needs_wakeup)) { >>>>>> ret = __dwc3_gadget_wakeup(dwc); >>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3 >>>>>> *dwc) >>>>>> case DWC3_LINK_STATE_RESET: >>>>>> case DWC3_LINK_STATE_RX_DET: /* in HS, means Early >>>>>> Suspend */ >>>>>> case DWC3_LINK_STATE_U3: /* in HS, means SUSPEND */ >>>>>> + case DWC3_LINK_STATE_U2: /* in HS, means Sleep (L1) */ >>>>>> + case DWC3_LINK_STATE_U1: >>>>>> case DWC3_LINK_STATE_RESUME: >>>>>> break; >>>>>> default: >>>>>> >>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including >>>>> first switch from host->gadget not working. >>>>> >>>> Great! Not sure why the first switch is not working, but it seems like >>>> we were able to eliminate quite a few issues. If you have more dwc3 >>>> tracepoints, we can take a look further. >>> I traced but the file is empty. I captured the registers as well. The >>> zip file is here: >>> >>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$ >>> <snip> >>> >>> Maybe this issue is due to extcon missing the event? >> From the info here, it doesn't look like the host platform device was >> removed on the first switch. Also, as you pointed it out, the extcon >> event was not shown on the first switch either. Without a notification >> to switch mode, the dwc3 driver won't do anything. You need to check why >> that's the case as I can't help much here. >> >>>>> After a 2 - 4 minutes the connection is dropped and reconnected. >>>> Does this occur with LPM disabled also? We can review this issue >>>> further >>>> with more dwc3 tracepoints. >>> I captured connection dropping and reconnecting in this fairly long >>> trace near the end of the file: >>> >>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$ >>> >>> >> Nothing obvious stands out as a problem from the dwc3 driver or the >> controller. I see a (port) reset after 30 seconds of inactivity, which >> is a typical timeout and recovery mechanism in the upperlayer from host. >> >> * Is this a new issue or was it always there? >> * Does turning off LPM help? > > I reverted my "usb: gadget: increase BESL baseline to 6" and picked > "usb: dwc3: pci: Enable dis_enblslpm_quirk for Intel Merrifield". > > So this is again the big hammer you suggested earlier to turn off LPM. > > > After 15 min (at least 4x longer then normal to get a port reset) I have > still not seen a port reset. > > > So for now I conclude, yes turning off LPM helps. > Ok. Thanks for the updates. Looks like we may have to use the hammer afterall. Btw, earlier I mistakenly say "dis_enblslpm_quirk" will disable LPM completely, it only disables the device going into "sleep" state. If you want to completely disable LPM, use "usb2-gadget-lpm-disable" Thanks, Thinh