On Thu, Jun 16, 2022 at 02:41:10PM +0530, Pavan Kondeti wrote:
> Hi Matthias/Krishna,
>
> On Tue, Jun 14, 2022 at 10:53:35AM -0700, Matthias Kaehlcke wrote:
> > On Mon, Jun 13, 2022 at 11:08:32AM -0700, Matthias Kaehlcke wrote:
> > > On Mon, Jun 06, 2022 at 01:45:51PM -0700, Matthias Kaehlcke wrote:
> > > > On Thu, Jun 02, 2022 at 12:35:42PM -0700, Matthias Kaehlcke wrote:
> > > > > Hi Krishna,
> > > > >
> > > > > with this version I see xHCI errors on my SC7180 based system, like
> > > > > these:
> > > > >
> > > > > [ 65.352605] xhci-hcd xhci-hcd.13.auto: xHC error in resume, USBSTS 0x401, Reinit
> > > > >
> > > > > [ 101.307155] xhci-hcd xhci-hcd.13.auto: WARN: xHC CMD_RUN timeout
> > > > >
> > > > > After resume a downstream hub isn't enumerated again.
> > > > >
> > > > > So far I didn't see those with v13, but I aso saw the first error with
> > > > > v16.
> > > >
> > > > It also happens with v13, but only when a wakeup capable vUSB <= 2
> > > > device is plugged in. Initially I used a wakeup capable USB3 to
> > > > Ethernet adapter to trigger the wakeup case, however older versions
> > > > of this series that use usb_wakeup_enabled_descendants() to check
> > > > for wakeup capable devices didn't actually check for vUSB > 2
> > > > devices.
> > > >
> > > > So the case were the controller/PHYs is powered down works, but
> > > > the controller is unhappy when the runtime PM path is used during
> > > > system suspend.
> > >
> > > The issue isn't seen on all systems using dwc3-qcom and the problem starts
> > > during probe(). The expected probe sequence is something like this:
> > >
> > > dwc3_qcom_probe
> > >   dwc3_qcom_of_register_core
> > >     dwc3_probe
> > >
> > >   if (device_can_wakeup(&qcom->dwc3->dev))
> > >     ...
> > >
> > > The important part is that device_can_wakeup() is called after dwc3_probe()
> > > has completed. That's what I see on a QC SC7280 system, where wakeup is
> > > generally working with these patches.
> > >
> > > However on a QC SC7180 system dwc3_probe() is deferred and only executed after
> > > dwc3_qcom_probe(). As a result the device_can_wakeup() call returns false.
> > > With that the controller/driver ends up in an unhappy state after system
> > > suspend.
> > >
> > > Probing is deferred on SC7180 because device_links_check_suppliers() finds
> > > that '88e3000.phy' isn't ready yet.
> >
> > It seems device links could be used to make sure the dwc3 core is present:
> >
> >   Another example for an inconsistent state would be a device link that
> >   represents a driver presence dependency, yet is added from the consumer’s
> >   ->probe callback while the supplier hasn’t probed yet: Had the driver core
> >   known about the device link earlier, it wouldn’t have probed the consumer
> >   in the first place. The onus is thus on the consumer to check presence of
> >   the supplier after adding the link, and defer probing on non-presence.
> >
> >   https://www.kernel.org/doc/html/v5.18/driver-api/device_link.html#usage
> >
> >
> > You could add something like this to dwc3_qcom_of_register_core():
> >
> >
> >   device_link_add(dev, &qcom->dwc3->dev,
> >                   DL_FLAG_AUTOREMOVE_CONSUMER | DL_FLAG_AUTOPROBE_CONSUMER);
> >
> >   if (qcom->dwc3->dev.links.status != DL_DEV_DRIVER_BOUND)
> >       ret = -EPROBE_DEFER;
> >
> >
> I am not very sure how the device_link_add() API works. we are the parent and
> creating a depdency on child probe. That does not sound correct to me.

The functional dependency is effectively there: the driver already assumes
that the dwc3 core was probed when of_platform_populate() returns. The
device link itself doesn't create the dependency on the probe(); the check
of the link status that follows it does.

Another option would be to add a link to the PHYs to the dwc3-qcom node in
the device tree, but I don't think that would be a better solution (and I
expect Rob would oppose this).

I'm open to other solutions; so far the device link is the cleanest one
that came to my mind.

I think the root issue is the driver architecture, with two interdependent
drivers for the same IP block, instead of a single framework driver with a
common part (dwc3 core) and vendor-specific hooks/data.

> Any ways, I have another question.
>
> When dwc3_qcom_of_register_core() returns error back to dwc3_qcom_probe(), we
> goto depopulate label which calls of_platform_depopulate() which destroy the
> child devices that are populated. how does that ensure that child probe is
> completed by the time, our probe is called again. The child device it self is
> gone. Is this working because when our probe is called next time, the child
> probe depenencies are resolved?

Good point! It doesn't really ensure that the child is probed (actually it
won't be probed and DL_FLAG_AUTOPROBE_CONSUMER doesn't make sense here).
It could happen that dwc3_qcom_probe() is deferred multiple times, but
eventually the PHYs should be ready and dwc3_probe() should be invoked
through of_platform_populate().
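
To make the retry behaviour described above a bit more concrete, here is a
small userspace toy model. It is not kernel code and it only approximates
the sequence discussed in this thread. All names (struct probe_model and the
model_* functions) are made up for illustration, the child probe is assumed
to be attempted synchronously when the child is created, and the wakeup
capability is assumed to become visible only once the modelled dwc3_probe()
has run.

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517 /* marker value only; same number the kernel uses */

/* Toy state, hypothetical; these are not real kernel structures. */
struct probe_model {
	int  phy_ready_at; /* first pass on which the PHY supplier is available */
	bool core_bound;   /* set once the modelled dwc3_probe() has succeeded */
	bool can_wakeup;   /* what the parent sampled via device_can_wakeup() */
};

/* Stand-in for dwc3_probe(): deferred while the PHY supplier is missing. */
int model_dwc3_probe(struct probe_model *m, int pass)
{
	if (pass < m->phy_ready_at)
		return -EPROBE_DEFER;
	m->core_bound = true;
	return 0;
}

/*
 * Stand-in for dwc3_qcom_probe(). The child is created afresh on every
 * pass (modelling populate/depopulate) and its probe is attempted at once.
 */
int model_dwc3_qcom_probe(struct probe_model *m, int pass, bool check_bound)
{
	m->core_bound = false;
	(void)model_dwc3_probe(m, pass); /* a deferral here is invisible to the parent */

	if (check_bound && !m->core_bound)
		return -EPROBE_DEFER; /* parent would depopulate and be retried */

	m->can_wakeup = m->core_bound; /* the point where wakeup is sampled */
	return 0;
}

/* Retry the parent probe until it succeeds; returns the deferral count or -1. */
int model_run(struct probe_model *m, bool check_bound, int max_passes)
{
	assert(max_passes > 0);
	for (int pass = 0; pass < max_passes; pass++) {
		if (model_dwc3_qcom_probe(m, pass, check_bound) == 0)
			return pass;
	}
	return -1;
}
```

With the bound check enabled, the parent in this model is deferred until the
pass on which the PHY becomes ready and then sees the wakeup capability.
Without the check it succeeds on the first pass and samples the capability
too early, which matches the SC7180 behaviour described earlier in the
thread. The model deliberately ignores that the real child would still be
probed later in the unchecked case.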