Re: Mass Storage Gadget Device Falls from SuperSpeed to High Speed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Rob,

Rob Weber <rob@xxxxxxxxxxx> writes:
> On Tue, Apr 02, 2019 at 08:46:16AM +0300, Felipe Balbi wrote:
>> Felipe Balbi <felipe.balbi@xxxxxxxxxxxxxxx> writes:
>> >>> modphy is the USB PHY integrated in your SoC. There's no control for
>> >>> that from OS side, only BIOS unfortunately. There is, however, one thing
>> >>> we can try. DWC3 has several quirk flags for known quirky PHYs; perhaps
>> >>> CHT needs one of those. Can you try with this patch and let me know
>> >>> whether it helps?
>> >>
>> >> Sure thing, I will try tomorrow. Could you possibly explain what a quirk
>> >> is as it relates to the kernel? I see this all over the source tree but
>> >> never knew how it was used. Does the dwc3 also know about "quirks" and
>> >> these particular flags? or are these flags just specific to the kernel
>> >> and its functionality?
>> >>
>> >>> modified   drivers/usb/dwc3/dwc3-pci.c
>> >>> @@ -105,6 +105,8 @@ static int dwc3_byt_enable_ulpi_refclock(struct pci_dev *pci)
>> >>>  static const struct property_entry dwc3_pci_intel_properties[] = {
>> >>>  	PROPERTY_ENTRY_STRING("dr_mode", "peripheral"),
>> >>>  	PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),
>> >>> +	PROPERTY_ENTRY_BOOL("snps,dis_u3_susphy_quirk"),
>> >>> +	PROPERTY_ENTRY_BOOL("snps,dis_u2_susphy_quirk"),
>> >>>  	{}
>> >>>  };
>> >>>  
>> >>> These two quirks will PHY suspend. There are other relevant quirk flags
>> >                     ^^^^^^^^
>> >                     will DISABLE phy suspend :-)
>> >
>> >> What do you mean by PHY suspend? Will it disable U2/U3 for the dwc3? I
>> >
>> > No, no. DWC3 can still enter U1/U2/U3, but the PHY will not enter the
>> > matching P1/P2/P3 state.
>> >
>> >> see it modified the DWC3_GUSB2PHYCFG_SUSPHY bit in the configuration
>> >> register, but I don't have access to the dwc3 databook to dig deeper
>> >> into this.
>> >
>> > The second quirk flag (snps,dis_u2_susphy_quirk) will tell dwc3 to *NOT*
>> > request USB2 PHY to enter low power state. While the first flag
>> > (snps,dis_u3_susphy_quirk) will tell dwc3 to *NOT* request USB3 PHY to
>> > enter low power state.
>> >
>> > In reality, they are a single block of HW but they _can_ be different
>> > and even if they are a single block, they _can_ have separate clock
>> > domains and we _may_ be able to control them separately. I haven't read
>> > documentation for modphy because the OS doesn't touch that. If
>> > necessary, I can try to find out more details about it, but I will
>> > probably take some time to find the correct documentation.
>> >
>> >>> which we can try in case these two don't help. I'd like to figure out
>> >>> exactly which quirk flag helps (if any). After that, we would need to
>> >>> check if a similar problem happens on any CHT system or just your
>> >>> design.
>> >>> 
>> >>> If it happens on any other system, then I can make sure we add a quirk
>> >>> flag to all CHTs.
>> >>
>> >> Sounds good!
>> >>
>> >> Thanks for taking the time to answer my questions! It's definitely
>> >> helpful for my understanding of USB. I'm learning quite a bit of
>> >> new information with each email and it's pretty awesome.
>> >
>> > No problems at all, happy to help.
>> 
>> Any updates here? Hopefully the quirk flags above helped.
>
> Thanks for following up. My team and I were doing a bunch of testing

no problem.

> today as well as this past weekend. We are still seeing mixed results
> with the suggestion you provided to us earlier. I unfortunately do not
> have any logs, traces, or analyzer captures to back this up yet, but I
> hope to gather more concrete data tomorrow.

cool, thanks

> There were two main issues we observed with the quirk flags above:
>
> * Host mode no longer worked with these quirks applied. Please keep in

Interesting. This is rather unexpected.

>   mind that we backported part of a patchset [1] to support switching
>   the internal SS mux from the xHCI controller to the xDCI controller.
>   At a quick glance, do you see anything in this patchset that you think
>   might cause a problem with the role switch driver communicating with
>   the xHCI controller when the quirks are applied?

nope, that shouldn't cause the behavior you're seeing.

> * We still saw the drop from SS to HS on a Windows laptop. Linux and Mac
>   seemed to be fine, but there's definitely a chance we did not test
>   with enough samples to confidently say this quirk is stable with those
>   hosts.

okay, so still not good enough. Let's try to prevent the device side
from ever initiating U1/U2 entry and from ever accepting LGO_U1 and
LGO_U2 from the host. Here's a (hack) patch for that

modified   drivers/usb/dwc3/ep0.c
@@ -382,7 +382,7 @@ static int dwc3_ep0_handle_u1(struct dwc3 *dwc, enum usb_device_state state,
 
 	reg = dwc3_readl(dwc->regs, DWC3_DCTL);
 	if (set)
-		reg |= DWC3_DCTL_INITU1ENA;
+		return -EINVAL;
 	else
 		reg &= ~DWC3_DCTL_INITU1ENA;
 	dwc3_writel(dwc->regs, DWC3_DCTL, reg);
@@ -404,7 +404,7 @@ static int dwc3_ep0_handle_u2(struct dwc3 *dwc, enum usb_device_state state,
 
 	reg = dwc3_readl(dwc->regs, DWC3_DCTL);
 	if (set)
-		reg |= DWC3_DCTL_INITU2ENA;
+		return -EINVAL;
 	else
 		reg &= ~DWC3_DCTL_INITU2ENA;
 	dwc3_writel(dwc->regs, DWC3_DCTL, reg);
@@ -625,9 +625,9 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
 			 * Enable transition to U1/U2 state when
 			 * nothing is pending from application.
 			 */
-			reg = dwc3_readl(dwc->regs, DWC3_DCTL);
-			reg |= (DWC3_DCTL_ACCEPTU1ENA | DWC3_DCTL_ACCEPTU2ENA);
-			dwc3_writel(dwc->regs, DWC3_DCTL, reg);
+			/* reg = dwc3_readl(dwc->regs, DWC3_DCTL); */
+			/* reg |= (DWC3_DCTL_ACCEPTU1ENA | DWC3_DCTL_ACCEPTU2ENA); */
+			/* dwc3_writel(dwc->regs, DWC3_DCTL, reg); */
 		}
 		break;


I don't think it will apply cleanly to your kernel, let me know if you
need help porting it.
 
> So tomorrow I hope to gather more concrete data to back up these
> findings. Our USB protocol analyzer is also out-of-order at the moment,
> so we might not be able to get analyzer data for a day or two either.
>
> Are there some other quirks you would like us to try out?

See above, not a quirk flag, but just a complete removal of U1/U2 entry
from the device side.

> Over the weekend, one of our team members was hacking in the dwc3 driver
> to try and disable LPM at various points in the chain of events. I've
> attached a patch that we believe is pretty stable. We have never seen it
> drop from SS to HS. USB type A to type C connections from any host to
> our device seem pretty stable. I've attached an archive called
> "gnarbox-success-a-to-c.tar.xz" That contains the trace events for a
> successful run where we connect the device to a Windows host and it
> enumerates at SuperSpeed.

Okay, cool. I'll go over your tarballs after this email.

> This patch isn't entirely stable as I haven't been able to get a
> reliable USB C-to-C connection on any host yet. I do not have any logs
> or traces for this particular case and I hope to get them to you soon.
>
> Do you remember the "failed to enable ep0out" error I briefly
> mentioned earlier? I was able to capture some trace events while this
> occurs. This issue occaisonally occurs when modprobing g_mass_storage.
> The exact steps are as follows:
>
> 1. switch from host mode to device mode (echo device >
> /sys/class/usb_role/.../role)
> 2. modprobe g_mass_storage ...
> 3. Try and rest device mode with a host computer, but you'll observe
> that device mode is not working.
> 4. modprobe -r g_mass_storage to remove the gadget driver.
> 5. switch back to host mode (echo host > /sys/class/usb_role/.../role)
> 6. switch back to device mode
> 7. modprobe g_mass_storage ...
> 8. Observe the error

Hmm, odd. When we remove a gadget driver, every endpoint (including ep0)
should be disabled.

> I've attached the traces and dmesg output in an archive titled
> "gnarbox-ep0out-failure.tar.xz". Please let me know if you need more
> information about this particular issue.

I'll go over the traces, this could be something that has been fixed in
the past.

> Lastly, I started digging through the errata available for our SoC
> series [2] and there are a few that stand out to me, including CHT40,
> CHT47, CHT48, and CHT50. CHT47 is the most concerning because I recall
> seeing an "Unknown" LFPS status coming from the device as it is trying

Right, I saw that one too.

> to leave U2. I asked the support specialists on our IPS ticket about
> these errata and I'm hoping to hear back from them within the day. Do
> any of these errata stand out to you as possible culprits?

I've read them but they're related to the host port, not the peripheral
port. The problem you're facing is on the peripheral side. I'd say
they're unrelated.

> Thanks for taking a look at all of this again!

No problem at all. Happy to help

-- 
balbi

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux