Re: [PATCH v3] usb: dwc3: core: Do core softreset when switch mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ferry Toth wrote:
> Hi
> 
> Op 21-04-2021 om 21:01 schreef Thinh Nguyen:
>> Ferry Toth wrote:
>>> Hi
>>>
>>> Op 19-04-2021 om 23:23 schreef Thinh Nguyen:
>>>> Ferry Toth wrote:
>>>>> Hi
>>>>>
>>>>> Op 19-04-2021 om 01:03 schreef Thinh Nguyen:
>>>>>> Ferry Toth wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Op 17-04-2021 om 16:22 schreef Ferry Toth:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Op 17-04-2021 om 04:27 schreef Thinh Nguyen:
>>>>>>>>> Ferry Toth wrote:
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> Op 16-04-2021 om 00:23 schreef Thinh Nguyen:
>>>>>>>>>>> Thinh Nguyen wrote:
>>>>>>>>>>>> From: Yu Chen <chenyu56@xxxxxxxxxx>
>>>>>>>>>>>> From: John Stultz <john.stultz@xxxxxxxxxx>
>>>>>>>>>>>>
>>>>>>>>>>>> According to the programming guide, to switch mode for DRD
>>>>>>>>>>>> controller,
>>>>>>>>>>>> the driver needs to do the following.
>>>>>>>>>>>>
>>>>>>>>>>>> To switch from device to host:
>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(host mode)
>>>>>>>>>>>> 3. Reset the host with USBCMD.HCRESET
>>>>>>>>>>>> 4. Then follow up with the initializing host registers sequence
>>>>>>>>>>>>
>>>>>>>>>>>> To switch from host to device:
>>>>>>>>>>>> 1. Reset controller with GCTL.CoreSoftReset
>>>>>>>>>>>> 2. Set GCTL.PrtCapDir(device mode)
>>>>>>>>>>>> 3. Reset the device with DCTL.CSftRst
>>>>>>>>>>>> 4. Then follow up with the initializing registers sequence
>>>>>>>>>>>>
>>>>>>>>>>>> Currently we're missing step 1) to do GCTL.CoreSoftReset and
>>>>>>>>>>>> step
>>>>>>>>>>>> 3) of
>>>>>>>>>>>> switching from host to device. John Stult reported a lockup
>>>>>>>>>>>> issue
>>>>>>>>>>>> seen
>>>>>>>>>>>> with HiKey960 platform without these steps[1]. Similar issue is
>>>>>>>>>>>> observed
>>>>>>>>>>>> with Ferry's testing platform[2].
>>>>>>>>>>>>
>>>>>>>>>>>> So, apply the required steps along with some fixes to Yu Chen's
>>>>>>>>>>>> and John
>>>>>>>>>>>> Stultz's version. The main fixes to their versions are the
>>>>>>>>>>>> missing
>>>>>>>>>>>> wait
>>>>>>>>>>>> for clocks synchronization before clearing
>>>>>>>>>>>> GCTL.CoreSoftReset and
>>>>>>>>>>>> only
>>>>>>>>>>>> apply DCTL.CSftRst when switching from host to device.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20210108015115.27920-1-john.stultz@xxxxxxxxxx/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqLhzN4i3$
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [2]
>>>>>>>>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/0ba7a6ba-e6a7-9cd4-0695-64fc927e01f1@xxxxxxxxx/__;!!A4F2R9G_pg!PW9Jbs4wv4a_zKGgZHN0FYrIpfecPX0Ouq9V3d16Yz-9-GSHqZWsfBAF-WkeqGeZStt4$
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Cc: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>
>>>>>>>>>>>> Cc: Ferry Toth <fntoth@xxxxxxxxx>
>>>>>>>>>>>> Cc: Wesley Cheng <wcheng@xxxxxxxxxxxxxx>
>>>>>>>>>>>> Cc: <stable@xxxxxxxxxxxxxxx>
>>>>>>>>>>>> Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode()
>>>>>>>>>>>> work
>>>>>>>>>>>> properly")
>>>>>>>>>>>> Signed-off-by: Yu Chen <chenyu56@xxxxxxxxxx>
>>>>>>>>>>>> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
>>>>>>>>>>>> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Changes in v3:
>>>>>>>>>>>> - Check if the desired mode is OTG, then keep the old flow
>>>>>>>>>>>> - Remove condition for OTG support only since the device can
>>>>>>>>>>>> still be
>>>>>>>>>>>>        configured DRD host/device mode only
>>>>>>>>>>>> - Remove redundant hw_mode check since __dwc3_set_mode() only
>>>>>>>>>>>> applies
>>>>>>>>>>>> when
>>>>>>>>>>>>        hw_mode is DRD
>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>> - Initialize mutex per device and not as global mutex.
>>>>>>>>>>>> - Add additional checks for DRD only mode
>>>>>>>>>>>>
>>>>>>>>>>>>       drivers/usb/dwc3/core.c | 27 +++++++++++++++++++++++++++
>>>>>>>>>>>>       drivers/usb/dwc3/core.h |  5 +++++
>>>>>>>>>>>>       2 files changed, 32 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>> Hi John,
>>>>>>>>>>>
>>>>>>>>>>> If possible, can you run a test with this version on your
>>>>>>>>>>> platform?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Thinh
>>>>>>>>>>>
>>>>>>>>>> I tested this on edison-arduino with this patch on top of
>>>>>>>>>> usb-next
>>>>>>>>>> (5.12-rc7 + "increase BESL baseline to 6" to prevent
>>>>>>>>>> throttling").
>>>>>>>>>>
>>>>>>>>>> On this platform there is a physical switch to switch roles. With
>>>>>>>>>> this
>>>>>>>>>> patch I find:
>>>>>>>>>>
>>>>>>>>>> - switch to host mode always works fine
>>>>>>>>>>
>>>>>>>>>> - switch to gadget mode I need to flip the switch 3x
>>>>>>>>>> (gadget-host-gadget).
>>>>>>>>>>
>>>>>>>>>> An error message appears on the gadget side "dwc3 dwc3.0.auto:
>>>>>>>>>> timed
>>>>>>>>>> out
>>>>>>>>>> waiting for SETUP phase" appears, but then the device connects
>>>>>>>>>> to my
>>>>>>>>>> PC,
>>>>>>>>>> no throttling.
>>>>>>>>>>
>>>>>>>>>> - alternatively I can switch to gadget 1x and then unplug/replug
>>>>>>>>>> the
>>>>>>>>>> cable.
>>>>>>>>>>
>>>>>>>>>> No error message and connects fine.
>>>>>>>>>>
>>>>>>>>>> - if I flip the switch only once, on the PC side I get:
>>>>>>>>>>
>>>>>>>>>>       kernel: usb 1-5: new high-speed USB device number 18
>>>>>>>>>> usingxhci_hcd
>>>>>>>>>>       kernel: usb 1-5: New USB device found, idVendor=1d6b,
>>>>>>>>>>       idProduct=0104, bcdDevice= 1.00 kernel: usb1-5: New USB
>>>>>>>>>> device
>>>>>>>>>>       strings: Mfr=1, Product=2, SerialNumber=3kernel:usb 1-5:
>>>>>>>>>> Product:
>>>>>>>>>>       USBArmory Gadget kernel: usb 1-5: Manufacturer:USBArmory
>>>>>>>>>> kernel:
>>>>>>>>>>       usb 1-5: SerialNumber: 0123456789abcdef kernel:usb 1-5:
>>>>>>>>>> can't
>>>>>>>>>> set
>>>>>>>>>>       config #1, error -110
>>>>>>>>> The device failed at set_configuration() request and timed out. It
>>>>>>>>> probably timed out from the status stage looking at the device err
>>>>>>>>> print.
>>>>>>>>>
>>>>>>>>>> Then if I wait long enough on the gadget side I get:
>>>>>>>>>>
>>>>>>>>>>       root@yuna:~# ifconfig
>>>>>>>>>>
>>>>>>>>>>       usb0:
>>>>>>>>>> flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC> mtu
>>>>>>>>>> 1500
>>>>>>>>>>       inet 169.254.119.239 netmask 255.255.0.0 broadcast
>>>>>>>>>> 169.254.255.255
>>>>>>>>>>       inet6 fe80::a8bb:ccff:fedd:eef1 prefixlen 64 scopeid
>>>>>>>>>> 0x20<link>
>>>>>>>>>>       ether aa:bb:cc:dd:ee:f1 txqueuelen 1000 (Ethernet) RX
>>>>>>>>>> packets
>>>>>>>>>> 490424
>>>>>>>>>>       bytes 735146578 (701.0 MiB) RX errors 0 dropped191
>>>>>>>>>> overruns 0
>>>>>>>>>> frame
>>>>>>>>>>       0 TX packets 35279 bytes 2532746 (2.4 MiB) TX errors 0
>>>>>>>>>> dropped 0
>>>>>>>>>>       overruns 0 carrier 0 collisions 0
>>>>>>>>>>
>>>>>>>>>> (correct would be: inet 10.42.0.221 netmask 255.255.255.0
>>>>>>>>>> broadcast
>>>>>>>>>> 10.42.0.255)
>>>>>>>>>>
>>>>>>>>>> So much improved now, but it seems I am still missing
>>>>>>>>>> something on
>>>>>>>>>> plug.
>>>>>>>>>>
>>>>>>>>> That's great! We can look at it further. Can you capture the
>>>>>>>>> tracepoints
>>>>>>>>> of the issue. Also, can you try with mass_storage gadget to see
>>>>>>>>> if the
>>>>>>>>> result is the same?
>>>>>>>> I have already gser, eem, mass_storage and uac2 combo. When eem
>>>>>>>> fails,
>>>>>>>> the mass_storage and uac2 don't appear (on KDE you get all kind of
>>>>>>>> popups when they appear).
>>>>>>>>
>>>>>>>> So either all works, or all fails.
>>>>>>>>
>>>>>>>> I'll trace this later today.
>>>>>>> Trace capturing switch from host-> gadget  here
>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6329600/5.12-rc7*2Busb-next.zip__;JQ!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXLOiNwGT$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (Issue history:
>>>>>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/issues/31__;!!A4F2R9G_pg!Oa6XGH3IqY3wwG5KK4FwPuNA0m3q5bRj7N6vdP-y4sAY6mya-96J90NJ0tJnXNc7KgAw$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> )
>>>>>>>
>>>>>>> On the PC side this resulted to:
>>>>>>>
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: new high-speed USB device
>>>>>>> number 12 using xhci_hcd
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device found,
>>>>>>> idVendor=1d6b, idProduct=0104, bcdDevice= 1.00
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: New USB device strings:
>>>>>>> Mfr=1,
>>>>>>> Product=2, SerialNumber=3
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Product: USBArmory Gadget
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: Manufacturer: USBArmory
>>>>>>> apr 17 18:17:44 delfion kernel: usb 1-5: SerialNumber:
>>>>>>> 0123456789abcdef
>>>>>>> apr 17 18:17:49 delfion kernel: usb 1-5: can't set config #1, error
>>>>>>> -110
>>>>>>>
>>>>>>>
>>>>>>> Thanks for all your help!
>>>>>>>
>>>>>> Looks like it's LPM related again. To confirm, try this:
>>>>>> Disable LPM with this property "snps,usb2-gadget-lpm-disable"
>>>>>> (Note that it's not the same as "snps,dis_enblslpm_quirk")
>>>>> Yes, I confirm this helps.
>>>>>
>>>>> Note: on startup I was in host mode, with gadget cable plugged. The
>>>>> first switch to gadget didn't work, all subsequent switches did
>>>>> work, as
>>>>> well as unplug/plug the cable.
>>>>>
>>>>>> Make sure that your testing kernel has this patch [1]
>>>>>> 475e8be53d04 ("usb: dwc3: gadget: Check for disabled LPM quirk")
>>>>>>
>>>>>> [1]
>>>>>> https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-next&id=475e8be53d0496f9bc6159f4abb3ff5f9b90e8de__;!!A4F2R9G_pg!Mvz1Am6Ka_pOBfD0TmsA3821I05Ti8stMgh5r4XzMwZ9dy1Wan-il-DB4h50DmbaU4Zw$
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> The failure you saw was probably due the gadget function attempting
>>>>>> to start a delayed status stage of the SET_CONFIGURATION request.
>>>>>> By this time, the host already put the device in low power.
>>>>>>
>>>>>> The START_TRANSFER command needs to be executed while the device
>>>>>> is on "ON" state (or U0 if eSS). We shouldn't use dwc->link_state
>>>>>> to check for link state because we only enable link state change
>>>>>> interrupt for some controller versions.
>>>>>>
>>>>>> Once you confirms disabling LPM works, try this fix:
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>> index 6227641f2d31..06cdec79244e 100644
>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>> @@ -309,10 +309,14 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep
>>>>>> *dep,
>>>>>> unsigned int cmd,
>>>>>>              if (DWC3_DEPCMD_CMD(cmd) == DWC3_DEPCMD_STARTTRANSFER) {
>>>>>>                    int             needs_wakeup;
>>>>>> +               u8              link_state;
>>>>>>     -               needs_wakeup = (dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U1 ||
>>>>>> -                               dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U2||
>>>>>> -                               dwc->link_state ==
>>>>>> DWC3_LINK_STATE_U3);
>>>>>> +               reg = dwc3_readl(dwc->regs, DWC3_DSTS);
>>>>>> +               link_state = DWC3_DSTS_USBLNKST(reg);
>>>>>> +
>>>>>> +               needs_wakeup = (link_state == DWC3_LINK_STATE_U1 ||
>>>>>> +                               link_state == DWC3_LINK_STATE_U2 ||
>>>>>> +                               link_state == DWC3_LINK_STATE_U3);
>>>>>>                      if (unlikely(needs_wakeup)) {
>>>>>>                           ret = __dwc3_gadget_wakeup(dwc);
>>>>>> @@ -1989,6 +1993,8 @@ static int __dwc3_gadget_wakeup(struct dwc3
>>>>>> *dwc)
>>>>>>            case DWC3_LINK_STATE_RESET:
>>>>>>            case DWC3_LINK_STATE_RX_DET:    /* in HS, means Early
>>>>>> Suspend */
>>>>>>            case DWC3_LINK_STATE_U3:        /* in HS, means SUSPEND */
>>>>>> +       case DWC3_LINK_STATE_U2:        /* in HS, means Sleep (L1) */
>>>>>> +       case DWC3_LINK_STATE_U1:
>>>>>>            case DWC3_LINK_STATE_RESUME:
>>>>>>                    break;
>>>>>>            default:
>>>>>>
>>>>> Same (good) result as with "snps,usb2-gadget-lpm-disable". Including
>>>>> first switch from host->gadget not working.
>>>>>
>>>> Great! Not sure why the first switch is not working, but it seems like
>>>> we were able to eliminate quite a few issues. If you have more dwc3
>>>> tracepoints, we can take a look further.
>>> I traced but the file is empty. I captured the registers as well. The
>>> zip file is here:
>>>
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346271/first-switch.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSDx89HFE$
>>>

<snip>

>>>
>>> Maybe this issue is due to extcon missing the event?
>>  From the info here, it doesn't look like the host platform device was
>> removed on the first switch. Also, as you pointed it out, the extcon
>> event was not shown on the first switch either. Without a notification
>> to switch mode, the dwc3 driver won't do anything. You need to check why
>> that's the case as I can't help much here.
>>
>>>>> After a 2 - 4 minutes the connection is dropped and reconnected.
>>>> Does this occur with LPM disabled also? We can review this issue
>>>> further
>>>> with more dwc3 tracepoints.
>>> I captured connection dropping and reconnecting in this fairly long
>>> trace near the end of the file:
>>>
>>> https://urldefense.com/v3/__https://github.com/andy-shev/linux/files/6346323/lost-connection.zip__;!!A4F2R9G_pg!KAmPA0Vw1WUiSxdc5-BKNPyD0klvdr5ucZI3E_C2ojho2rNT9wzMs8HG4qCYSIuUHRz4$
>>>
>>>
>> Nothing obvious stands out as a problem from the dwc3 driver or the
>> controller. I see a (port) reset after 30 seconds of inactivity, which
>> is a typical timeout and recovery mechanism in the upperlayer from host.
>>
>> * Is this a new issue or was it always there?
>> * Does turning off LPM help?
> 
> I reverted my "usb: gadget: increase BESL baseline to 6" and picked
> "usb: dwc3: pci: Enable dis_enblslpm_quirk for Intel Merrifield".
> 
> So this is again the big hammer you suggested earlier to turn off LPM.
> 
> 
> After 15 min (at least 4x longer then normal to get a port reset) I have
> still not seen a port reset.
> 
> 
> So for now I conclude, yes turning off LPM helps.
> 

Ok. Thanks for the updates. Looks like we may have to use the hammer
afterall.

Btw, earlier I mistakenly say "dis_enblslpm_quirk" will disable LPM
completely, it only disables the device going into "sleep" state. If you
want to completely disable LPM, use "usb2-gadget-lpm-disable"

Thanks,
Thinh




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux