Re: dwc3 stuck in U3 state on USB3-only link

Hi Jerry,

On Wed, Feb 08, 2023 at 07:27:04PM -0800, Jerry Zhang wrote:
> We have a custom board with two linux systems connected by USB 3 wires
> only, vbus and USB2 are omitted for space savings. This has pretty
> much worked as the controllers are independent, except for the
> following bug:
> 
> - When the host system (tegra xhci host driver) reboots, the device
> (msm-dwc3) enters the U3 state and never leaves it, even after the
> host powers back up.
> - Also if the device system happens to finish booting before the host,
> the same thing happens, dwc3 gets stuck in U3 and never enumerates.

In both of these scenarios, when the host is momentarily offline, what
is the state of the SuperSpeed signal lines?  Specifically, does the
host remove terminations from its SSTX lines?
 
> I'm able to get these messages from the dwc3 driver when the host reboots
> 
> [   34.549834] msm-dwc3 a600000.ssusb: msm_dwc3_pwr_irq received
> [   34.555749] msm-dwc3 a600000.ssusb: dwc3_pwr_event_handler irq_stat=28100C
> [   34.562902] msm-dwc3 a600000.ssusb: dwc3_pwr_event_handler link
> state = 0x0006
> [   34.570319] msm-dwc3 a600000.ssusb: dwc3_pwr_event_handler:
> unexpected PWR_EVNT, irq_stat=281000
> [   34.663734] msm-dwc3 a600000.ssusb: msm_dwc3_pwr_irq received
> [   34.669644] msm-dwc3 a600000.ssusb: dwc3_pwr_event_handler irq_stat=2C1004
> [   34.676698] msm-dwc3 a600000.ssusb: dwc3_pwr_event_handler:
> unexpected PWR_EVNT, irq_stat=2C1000
> [   34.686082] dwc3 a600000.dwc3: dwc3_gadget_suspend_interrupt Entry to 3
> [   34.692919] dwc3 a600000.dwc3: Notify controller from
> dwc3_gadget_vbus_draw. mA = 2
> [   34.700777] msm-dwc3 a600000.ssusb:
> DWC3_CONTROLLER_SET_CURRENT_DRAW_EVENT received
> [   34.708648] dwc3 a600000.dwc3: Notify OTG from dwc3_gadget_suspend_interrupt
> [   34.715888] msm-dwc3 a600000.ssusb: DWC3_CONTROLLER_NOTIFY_OTG_EVENT received

(BTW I notice from these msm-dwc3 logs you must be using a Qualcomm SoC
with a downstream kernel.  Though I think the issue is generic enough to
debug with the upstream dwc3 gadget, if it does turn out to be some
vendor-specific behavior then I would ask you to contact us directly for
more focused support.)

If possible please enable dwc3 tracing events as we might be able to see
more info about the specific events that occur when the host reboots.

> I think the main thing I'm looking for is validating my existing
> understanding and confirming a few things I suspect, but am not sure
> of due to unfamiliarity with the details of the USB3 spec:
> 
> - iiuc USB3 power management and states should actually be independent
> from both vbus and usb2 lines as all the negotiation happens with LFPS
> over the USB3 wires.

Yes, but in the corner case above with the host going offline, you
might be in a situation in which the device abruptly exits its U0 state
due to a perceived disconnection or lack of communication on the SS
pins.  The LTSSM may then have transitioned to the eSS.Disabled state,
in which case one of the only ways out of that state is, to quote
from the USB3.2 spec (7.5.1.1.2 Exit from eSS.Disabled):

  "An upstream port shall transition to Rx.Detect only when VBUS
   transitions to valid or a USB 2.0 bus reset is detected."

But since you don't have VBUS nor usb2 lines connected, it's possible
the controller could have gotten stuck here and not have a way out.
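
To make the consequence concrete, here is a minimal sketch in C that
models only the sentence quoted above.  The struct, enum and function
names are hypothetical (they are not from the dwc3 driver or the spec),
and the rest of the LTSSM is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what an upstream port can observe. */
struct upstream_port_inputs {
	bool vbus_became_valid;   /* VBUS transitioned to valid */
	bool usb2_bus_reset_seen; /* USB 2.0 bus reset detected on D+/D- */
	bool lfps_seen;           /* LFPS seen on the SuperSpeed pairs */
};

enum ltssm_state {
	LTSSM_ESS_DISABLED,
	LTSSM_RX_DETECT,
};

/*
 * Next state for an upstream port currently in eSS.Disabled, following
 * only the quoted rule: VBUS becoming valid or a USB 2.0 bus reset moves
 * the port to Rx.Detect.  LFPS is intentionally ignored here.
 */
static enum ltssm_state ess_disabled_next(const struct upstream_port_inputs *in)
{
	assert(in != NULL);

	if (in->vbus_became_valid || in->usb2_bus_reset_seen)
		return LTSSM_RX_DETECT;

	return LTSSM_ESS_DISABLED;
}
```

On a USB3-only link the first two inputs can never become true, so in
this model the port stays in eSS.Disabled regardless of what the host
sends on the SuperSpeed pairs.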

:) there is a reason that spec compliant USB3.x implementations must
also provide D+/D- connectivity; not only for backwards compatibility
but also for these sorts of fallback scenarios.

> - I see that entry to U3 requires an LFPS message, but in this case
> the host wouldn't have been able to send a message as it is powering
> off. Is the device also capable of entering U3 due to timeouts and is
> it expected to enter U3 in this situation?

In this case since it's obviously not a U3 entry due to LFPS, the only
other interpretation of the dwc3's U3 link state is that it is a
HS/FS/LS Suspend/L2 state.  This can occur simply by not having activity
on D+/D- lines.
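
As a rough illustration of that double meaning, the sketch below decodes
a link state value differently depending on whether the controller is
operating at SuperSpeed.  The numeric encodings are a subset of enum
dwc3_link_state as I recall it from the upstream core.h, so please
verify them against your tree; the downstream msm-dwc3 driver may print
something different.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of link state encodings, as recalled from upstream core.h. */
enum link_state {
	LINK_STATE_U0     = 0x00, /* USB2 meaning: On */
	LINK_STATE_U2     = 0x02, /* USB2 meaning: Sleep */
	LINK_STATE_U3     = 0x03, /* USB2 meaning: Suspend */
	LINK_STATE_SS_DIS = 0x04,
	LINK_STATE_RX_DET = 0x05,
	LINK_STATE_MASK   = 0x0f,
};

/* Human-readable meaning of a link state for the current operating speed. */
static const char *link_state_meaning(unsigned int state, bool superspeed)
{
	assert(state <= LINK_STATE_MASK); /* the field is 4 bits wide */

	switch (state) {
	case LINK_STATE_U0:
		return superspeed ? "U0" : "On";
	case LINK_STATE_U2:
		return superspeed ? "U2" : "Sleep (L1)";
	case LINK_STATE_U3:
		return superspeed ? "U3" : "Suspend (L2)";
	case LINK_STATE_SS_DIS:
		return "SS.Disabled";
	case LINK_STATE_RX_DET:
		return superspeed ? "Rx.Detect" : "Early Suspend";
	default:
		return "not covered by this sketch";
	}
}

/* True when a reported "U3" is really a USB2 suspend. */
static bool is_usb2_suspend(unsigned int state, bool superspeed)
{
	return !superspeed && state == LINK_STATE_U3;
}
```

With these assumptions, link_state_meaning(3, false) yields
"Suspend (L2)", which is how a value of 3 would read once the
controller is no longer operating at SuperSpeed.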

> - Similarly I've seen that exiting from U3 requires an LFPS message.
> My expectation is that the host would wake up all devices on the bus
> with LFPS messages shortly after bootup, and either this isn't
> happening, or the device is failing to receive or process the message.
> If the entry to U3 is expected, how is it then expected to exit U3?

I think what might have happened is that when the host rebooted, the
device must have abruptly exited U0 and gone into eSS.Disabled; at that
point the dwc3 controller "falls back" into USB2 mode, but since D+/D-
are not connected, this is then perceived as entering USB2 suspend.
Being in eSS.Disabled could explain why it doesn't respond to further
LFPS signaling from the host.

You'd somehow need to get the controller to go back into Rx.Detect.
Since you don't have a way to do USB2 reset on D+/D-, you may need to at
least simulate a VBUS toggle, or forcefully reset the dwc3 controller.

With the QCOM HW there is a register that can do this.  Please refer to
dwc3_qcom_vbus_override_enable() in dwc3-qcom.c for the upstream
implementation.
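
For orientation, here is a user-space sketch of what a simulated VBUS
toggle through that override could look like.  It runs against a fake
in-memory register block rather than real hardware.  The register
offsets and bit positions are as I recall them from the upstream
dwc3-qcom.c and should be treated as assumptions to check against your
tree; fake_qscratch, setbits(), clrbits() and simulate_vbus_toggle()
are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Recalled from upstream dwc3-qcom.c; verify before relying on them. */
#define QSCRATCH_HS_PHY_CTRL	0x10
#define UTMI_OTG_VBUS_VALID	BIT(20)
#define SW_SESSVLD_SEL		BIT(28)
#define QSCRATCH_SS_PHY_CTRL	0x30
#define LANE0_PWR_PRESENT	BIT(24)

/* Hypothetical stand-in for the memory-mapped QSCRATCH block. */
struct fake_qscratch {
	uint32_t regs[16]; /* covers byte offsets 0x00..0x3c */
};

static void setbits(struct fake_qscratch *q, uint32_t off, uint32_t val)
{
	q->regs[off / 4] |= val;
}

static void clrbits(struct fake_qscratch *q, uint32_t off, uint32_t val)
{
	q->regs[off / 4] &= ~val;
}

/* Mirrors the shape of the upstream helper: force VBUS valid on or off. */
static void vbus_override_enable(struct fake_qscratch *q, bool enable)
{
	assert(q != NULL);

	if (enable) {
		setbits(q, QSCRATCH_SS_PHY_CTRL, LANE0_PWR_PRESENT);
		setbits(q, QSCRATCH_HS_PHY_CTRL,
			UTMI_OTG_VBUS_VALID | SW_SESSVLD_SEL);
	} else {
		clrbits(q, QSCRATCH_SS_PHY_CTRL, LANE0_PWR_PRESENT);
		clrbits(q, QSCRATCH_HS_PHY_CTRL,
			UTMI_OTG_VBUS_VALID | SW_SESSVLD_SEL);
	}
}

/* Simulated toggle: report VBUS as absent, then as present again. */
static void simulate_vbus_toggle(struct fake_qscratch *q)
{
	vbus_override_enable(q, false);
	/* Real hardware would likely need a delay here; omitted in this model. */
	vbus_override_enable(q, true);
}
```

In a real driver the equivalent sequence would have to be triggered from
some event that tells you the host has come back, which this sketch does
not attempt to model.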

> I've also tried relevant looking quirks on the gadget side including
> ssp-u3-u0-quirk, u2exit_lfps_quirk, disable_scramble_quirk. I don't
> see a way to explicitly prevent the controller from entering U3 mode,
> is this possible with a register setting?
> 
> Would appreciate any thoughts. If I haven't misunderstood anything,
> the next step would probably be to find a beagle 5000 analyzer and
> track down the LFPS messages.

I think this is still a good idea, if only to see what's happening on
the signal lines at a lower level.  It would be great if it could show
the state of termination when the host is rebooting.

Hope that helps,
Jack


