On Tue, Aug 29, 2023 at 04:57:54PM +0300, Mathias Nyman wrote: > On 21.8.2023 12.55, Hardik Gajjar wrote: > > On Fri, Aug 18, 2023 at 04:18:30PM +0300, Mathias Nyman wrote: > > > On 18.8.2023 12.23, Hardik Gajjar wrote: > > > > xHCI driver starts the response timer after sending each > > > > command to the device. The default value of this timer is > > > > 5 seconds (XHCI_CMD_DEFAULT_TIMEOUT = HZ*5). This seems > > > > too high in time crtical use case. > > > > > > > > This patch provides an API to change the default value of > > > > the timer from the vendor USB driver. > > > > > > > > The default value will be XHCI_CMD_DEFAULT_TIMEOUT (5 sec) > > > > > > > > Use case: > > > > According to the Smartphone integration certification > > > > requirement in the automotive, the phone connected via USB > > > > should complete enumeration and user space handshake > > > > within 3 sec. > > > > > > The above incorrectly makes it sound as if the command timeout > > > timer causes the delay. > > > > > Thank you, Mathias, for your prompt response. I will enhance the message > > to provide more specificity in the subsequent patch. > > > > > > > > Reducing the response waiting time by setting the smaller > > > > command timer delay helps to speed up overall re-enumeration > > > > process of the USB device in case of device is not responding > > > > properly in first enumeration iteration. > > > > > > So is this a case where addressing a usb device behind xHC always > > > fail on the first attempt, i.e. address device command in xhci > > > never completes. Solution proposed here is to fail faster and > > > retry? > > > > > > Is the rootcause known why first enumeration fails? > > > > > > Does setting old_scheme_first module parameter help? > > > > > Yes, you are correct. The problem occurs when setting the address, > > and in such cases, this patch helps to fail faster and retry. > > > > Unfortunately, the root cause is unknown. This problem is mainly > > observed with Android phones. Upon analyzing the USB analyzer logs, > > it seems that the device is not responding to the SET_ADDRESS request. > > Is this only seen when connecting the device to a Linux xHCI host at USB 3 speeds? > How about connecting to a windows machine? or USB 2 Linux machine with a EHCI host? I could not able to reproduce the issue when connecting the phone to x86 Ubuntu20.04 machine. In our setup, we have a arm64 target with a DWC3 USB3 IP core, connected through a USB hub (specifically designed for automotive use) to the Android phone. Interestingly,when replicating this setup with the x86 Ubuntu machine, we encountered a discrepancy. The primary difference between the two setups lies in the utilization of the DWC3 USB IP core. Working Case: 0,HS,202,0:17.672.512,1.333 us,8 B,I,00,00,SETUP txn (SPLIT),80 06 00 01 00 00 40 00 1,HS,203,0:17.672.512,83 ns,4 B,,00,00, SPLIT packet,78 0D 03 70 1,HS,204,0:17.672.513,66 ns,3 B,,00,00, SETUP packet,2D 00 10 1,HS,205,0:17.672.513,200 ns,11 B,,00,00, DATA0 packet,C3 80 06 00 01 00 00 40 00 DD 94 1,HS,206,0:17.672.514,33 ns,1 B,,00,00, ACK packet,D2 0,HS,207,0:17.672.535,5.625.983 ms,,I,00,00,[256 CSPLIT-SETUP-NYET], Non-Working Case: 0,HS,164,0:10.982.303,3.116 us,8 B,I,00,00,SETUP txn (SPLIT-STALL),00 05 04 00 00 00 00 00 1,HS,165,0:10.982.303,83 ns,4 B,,00,00, SPLIT packet,78 01 02 A0 1,HS,166,0:10.982.303,66 ns,3 B,,00,00, SETUP packet,2D 00 10 1,HS,167,0:10.982.304,200 ns,11 B,,00,00, DATA0 packet,C3 00 05 04 00 00 00 00 00 EB 70 1,HS,168,0:10.982.304,50 ns,1 B,,00,00, ACK packet,D2 1,HS,169,0:10.982.305,866 ns,,,00,00, CSPLIT-SETUP-STALL, 2,HS,170,0:10.982.305,83 ns,4 B,,00,00, SPLIT packet,78 81 02 78 2,HS,171,0:10.982.306,66 ns,3 B,,00,00, SETUP packet,2D 00 10 2,HS,172,0:10.982.306,50 ns,1 B,,00,00, STALL packet,1E Upon close inspection of the USB analyzer data for both cases, it becomes evident that a STALL packet consistently appears in the non-working scenario. This stands in contrast to the working case, where the channel remains open, allowing for the successful execution of the following high-speed enumeration commands. In the event of a stall, we require an interruption from the host to resume communication. Given that there is a fixed 5-second command delay, this interruption can only occur after this timeout period. Therefore, I would like to propose a solution: providing an additional API to control this command timeout or as you suggested, adding a timeout parameter to struct xhci_command, and use that when calling xhci_mod_cmd_timer() without the need to address any potential bugs in the xHCI driver. > > > > > I tried using "old_scheme_first=Y," but that did not help. Below are > > the short logs of the first iteration enumeration failure. > > > > > > The xhci command timeout is more of a xhci internal thing, not sure it's a good > > > idea to add this to hcd. > > > > > > Would it make sense to add a timeout parameter to hcd->driver->address_device(hcd, udev) > > > instead? > > > > > > First priority should of course be finding out why the first enumeration fails, > > > and solve that. > > > > > > Thanks > > > Mathias > > Thanks for the suggestion to modify hcd->driver->address_device. > > I will definitely implement it.However to confirm. > > > > So, if I understand correctly, the idea is to avoid exposing any > > API from the xhci driver, but instead, create an interface in hcd.c (such as sysfs or API) > > and incorporate the delay in address_device as an additional parameter. > > On second thought it only makes sense to do this if we can identify the device in advance > and make a quirk for it. But at this stage we don't know anything about the device. > > So I guess this depends on the width of the problem. > > If this works on windows then we need to figure out what we do differently. > > If this fails with all hosts (well xHCI and EHCI) then hcd level change is probably needed. > For xhci we would pass timeout parameter when calling address_device, for other hosts > the timeout for the SET_ADDRESS control transfer would be adjusted. > > If this only fails when connected behind a xHCI host then a local xHCI change should do. > > > However, in that case, modifying xhci is still necessary as the timer is controlled from there. > > Yes, xhci changes will be needed. > > I suggest adding a timeout parameter to struct xhci_command, and use that when calling > xhci_mod_cmd_timer(). This way we can tailor different timeouts for different commands. > > -Mathias Thank you, Mathias. I plan to proceed with the recommended approach and prepare the second version of the upstream patch for review Hardik