Re: [PATCH 18/62] usb: dwc3: core: simplify suspend/resume operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



+Kishon,

On 09/06/16 14:50, Felipe Balbi wrote:
> 
> Hi,
> 
> Roger Quadros <rogerq@xxxxxx> writes:
>>>>> we're only suspending with cable detached, if you need to maintain
>>>>
>>>> That's true only for device mode. What about host mode? Devices are still
>>>> connected to the host port and are most probably just suspended, not
>>>> disconnected.
>>>
>>> You already have a full reenumeration anyway. But to really answer your
>>> question, it depends on how xHCI was integrated in TI's SoC. Can you
>>> process remote wakeup? Does that work today without my PM rework?
>>
>> We've never tested remote wakeup but I have an action item to see if
>> we can make it work.
> 
> alright, let me know.
> 
>>>> For device mode we are ok with loosing connection to the host as
>>>> we don't support the hibernation feature.
>>>>
>>>> My concern is more about the host mode.
>>>
>>> I remember devices being reenumerated in resume with TI's tree. Can you
>>> confirm if that has changed since then?
>>
>> It depends on how deep into power down the SoC goes into. (standby vs deep-sleep)
>> See. 6.4.3.6 Supported Low Power USB Wakeup Scenarios in [1]
>>
>> [1] - AM437x TRM http://www.ti.com/lit/pdf/spruhl7
>>
>> As per that remote-wakeup and connect/disconnect detection (as host) is possible
>> when in standby mode but not when in deep sleep.
> 
> okay, so we actually want to power the hole thing down when in deep
> sleep. What does deep sleep map to in linux parlance? suspend-to-ram?
> suspend-to-disk?

SoC standby == echo standby > /sys/power/state
SoC deepsleep == echo mem > /sys/power/state

> 
> seems like we need better granularity for our suspend/resume calls. That
> is, however, an optimization which can be done for v4.9 as long as my
> previous diff works.
> 
>> This is the kernel log on TI kernel on AM437x
>>
>> root@rockdesk:~# lsusb -t
>> /:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
>> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
>>     |__ Port 1: Dev 5, If 0, Class=HID, Driver=usbhid, 12M
>>     |__ Port 1: Dev 5, If 1, Class=HID, Driver=usbhid, 12M
>>     |__ Port 1: Dev 5, If 2, Class=HID, Driver=usbhid, 12M
>> root@rockdesk:~# echo standby > /sys/power/state
>> [  471.406585] PM: Syncing filesystems ... done.
>> [  471.420959] Freezing user space processes ... (elapsed 0.004 seconds) done.
>> [  471.434387] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> [  471.444061] Suspending console(s) (use no_console_suspend to debug)
>> [  471.524291] PM: suspend of devices complete after 72.204 msecs
>> [  471.528045] PM: late suspend of devices complete after 3.723 msecs
>> [  471.532470] PM: noirq suspend of devices complete after 4.394 msecs
>> [  471.532470] Disabling non-boot CPUs ...
>> [  471.532470] PM: Could not transition all powerdomains to target state
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
>> [  471.551849] PM: noirq resume of devices complete after 19.134 msecs
>> [  471.555419] PM: early resume of devices complete after 2.899 msecs
>> [  471.556762] net eth0: initializing cpsw version 1.15 (0)
>> [  471.556793] net eth0: initialized cpsw ale version 1.4
>> [  471.634033] net eth0: phy found : id is : 0x221622
>> [  471.977966] usb 1-1: reset full-speed USB device number 5 using xhci-hcd
>> [  472.128082] PM: resume of devices complete after 572.662 msecs
>> [  472.201232] Restarting tasks ... done.
>> -bash: echo: write error: Operation not permitted
> 
> you didn't actually reach standby at all.

Yeah ti-4.4 is still not there yet. I tried it on ti-4.1 with USB_DEFAULT_PERSIST
disabled and I could see the devices disconnect and re-enumerate on am437x.
I do see VBUS being turned off and need to figure out how to prevent that if
we want to support the remote wakeup case.

On dra7-evm I do see VBUS being left ON, but I do see the devices disconnect
and re-enumerate on system resume.
However, if I comment out the phy_exit()/init(), phy_power_off/on() part then
I do see the devices resume correctly.

So looks like we need better granularity for PHY power management as well?

> 
>> root@rockdesk:~# echo mem > /sys/power/state
>> [  490.833740] PM: Syncing filesystems ... done.
>> [  490.847167] Freezing user space processes ... (elapsed 0.004 seconds) done.
>> [  490.859100] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> [  490.868774] Suspending console(s) (use no_console_suspend to debug)
>>
>> [  490.931030] PM: suspend of devices complete after 54.077 msecs
>> [  490.935028] PM: late suspend of devices complete after 3.936 msecs
>> [  490.939453] PM: noirq suspend of devices complete after 4.394 msecs
>> [  490.939483] Disabling non-boot CPUs ...
>> [  490.939483] PM: Successfully put all powerdomains to target state
>> [  490.974365] PM: noirq resume of devices complete after 34.729 msecs
>> [  490.977874] PM: early resume of devices complete after 2.899 msecs
>> [  490.979278] net eth0: initializing cpsw version 1.15 (0)
>> [  490.979309] net eth0: initialized cpsw ale version 1.4
>> [  491.053985] net eth0: phy found : id is : 0x221622
>> [  491.182189] usb usb1: root hub lost power or was reset
>> [  491.182220] usb usb2: root hub lost power or was reset
>> [  491.523132] usb 1-1: reset full-speed USB device number 5 using xhci-hcd
>> [  491.678070] PM: resume of devices complete after 700.164 msecs
>> [  491.760070] Restarting tasks ... done.
>> root@rockdesk:~# 
>> root@rockdesk:~# [  494.056823] cpsw 4a100000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>>
>>
>> How to be sure if device was re-enumerated or not. At least I don't
>> see a disconnect.
> 
> you wouldn't see a disconnect if that happens with IRQs off. The best
> way is to see if VBUS dropped at all. Also, I guess USB Persist plays a
> role here.
> 
> Do you have USB_DEFAULT_PERSIST enabled? Can you try disabling that and
> checking if it changes anything?
> 
>>>>>>> @@ -1149,36 +1135,14 @@ static int dwc3_suspend(struct device *dev)
>>>>>>>  static int dwc3_resume(struct device *dev)
>>>>>>>  {
>>>>>>>  	struct dwc3	*dwc = dev_get_drvdata(dev);
>>>>>>> -	unsigned long	flags;
>>>>>>>  	int		ret;
>>>>>>>  
>>>>>>>  	pinctrl_pm_select_default_state(dev);
>>>>>>>  
>>>>>>> -	usb_phy_set_suspend(dwc->usb2_phy, 0);
>>>>>>> -	usb_phy_set_suspend(dwc->usb3_phy, 0);
>>>>>>> -	ret = phy_power_on(dwc->usb2_generic_phy);
>>>>>>> -	if (ret < 0)
>>>>>>> +	ret = dwc3_core_init(dwc);
>>>>>>> +	if (ret)
>>>>>>>  		return ret;
>>>>>>
>>>>>> You've replaced just restoring gctl with a full core_init().
>>>>>
>>>>> yes, for a reason
>>>>>
>>>>>> We will loose XHCI host controller state and the devices connected to
>>>>>> it will have to be re-enumerated.
>>>>>
>>>>> if you're going to suspend and you drop all your power rails, you will,
>>>>> anyway, see full reenumeration. In fact, that's what I remember from
>>>>> TI's tree as well. When resuming from suspend, there was a full
>>>>> reenumeration of all devices when DWC3 was acting as host. So what are
>>>>> you really pointing at here?
>>>>
>>>> At least on dra7-evm, I don't see VBUS dropping during system suspend.
>>>>
>>>> And without this patch the devices connected in host mode resumed
>>>> correctly.
>>>
>>> But DWC3 is not controlling VBUS, so that shouldn't matter in any
>>> case. IIRC, your VBUS was controlled by a GPIO, right?
>>
>> Nope. VBUS is controlled internally, most likely via some XHCI integration
>> logic.
> 
> on the DRD port? Which Soc are you talking about?

dra7xx SoC has a dedicated usb_drvvbus pin that controls the VBUS regulator. 
The ID and VBUS event input does come over a GPIO though.

> 
> Well, AM437x has the DRVVBUS signal which, for the most part at least,
> works fine in HW mode. There are some concerns about VBUS contention
> (ping Dave K about that, btw. You wanna know about that).
> 
> AM57x, however, has ID and VBUS tied to GPIOs (or was it only VBUS?)

ID is over GPIO, VBUS input comes via PMIC.
It does have usb_drvvbus as well that controls the VBUS regulator.

> 
>>>>>> So this cant be done IMO.
>>>>>
>>>>> why not?
>>>>
>>>> If host controller state is maintained during a suspend/resume then
>>>> you are unnecessary doing a full re-init and loosing context.
>>>> And on dra7-evm that seems to be the case.
>>>
>>> oh, I see. Then that's simple enough to fix :-) Here you go:
>>>
>>> @@ -1094,6 +1094,7 @@ static int dwc3_suspend_common(struct dwc3 *dwc)
>>>  		spin_lock_irqsave(&dwc->lock, flags);
>>>  		dwc3_gadget_suspend(dwc);
>>>  		spin_unlock_irqrestore(&dwc->lock, flags);
>>> +		dwc3_core_exit(dwc);
>>>  		break;
>>>  	case USB_DR_MODE_HOST:
>>>  	default:
>>
>> core_exit twice for PERIPHERAL/OTG case?
>>
>>> @@ -1111,13 +1112,13 @@ static int dwc3_resume_common(struct dwc3 *dwc)
>>>  	unsigned long	flags;
>>>  	int		ret;
>>>  
>>> -	ret = dwc3_core_init(dwc);
>>> -	if (ret)
>>> -		return ret;
>>> -
>>>  	switch (dwc->dr_mode) {
>>>  	case USB_DR_MODE_PERIPHERAL:
>>>  	case USB_DR_MODE_OTG:
>>> +		ret = dwc3_core_init(dwc);
>>> +		if (ret)
>>> +			return ret;
>>> +
>>>  		spin_lock_irqsave(&dwc->lock, flags);
>>>  		dwc3_gadget_resume(dwc);
>>>  		spin_unlock_irqrestore(&dwc->lock, flags);
>>>
>>> Can you test that?
>>
>> Can you send a diff that applies on patch 18 please?
> 
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 80e9affd3d77..9da0b4dfb0ac 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1118,6 +1118,7 @@ static int dwc3_suspend(struct device *dev)
>  	case USB_DR_MODE_PERIPHERAL:
>  	case USB_DR_MODE_OTG:
>  		dwc3_gadget_suspend(dwc);
> +		dwc3_core_exit(dwc);
>  		break;
>  	case USB_DR_MODE_HOST:
>  	default:
> @@ -1125,8 +1126,6 @@ static int dwc3_suspend(struct device *dev)
>  		break;
>  	}
>  
> -	dwc3_core_exit(dwc);
> -
>  	pinctrl_pm_select_sleep_state(dev);
>  
>  	return 0;
> @@ -1139,15 +1138,15 @@ static int dwc3_resume(struct device *dev)
>  
>  	pinctrl_pm_select_default_state(dev);
>  
> -	ret = dwc3_core_init(dwc);
> -	if (ret)
> -		return ret;
> -
>  	switch (dwc->dr_mode) {
>  	case USB_DR_MODE_PERIPHERAL:
>  	case USB_DR_MODE_OTG:
> +		ret = dwc3_core_init(dwc);
> +		if (ret)
> +			return ret;
> +
>  		dwc3_gadget_resume(dwc);
> -		/* FALLTHROUGH */
> +		break;
>  	case USB_DR_MODE_HOST:
>  	default:
>  		/* do nothing */
> 
>> And what happens in case where XHCI context is really lost?
> 
> there's really know way of knowing about that today. Moreover, that's
> no different then what we have today.
> 
>> And again if in OTG mode and role was host we do a core_init?
> 
> well, then you should patch that switch() statement, right? We don't
> have OTG in mainline yet, so I can't cover that case, can I?
> 
>>>>>> Why not just restore GCTL instead of doing a full core re-init?
>>>>>
>>>>> restoring GCTL is not enough. Not even close ;-) In fact, unless we have
>>>>> hibernation, there's no way to save all necessary state. Also, if we're
>>>>> going to system suspend we're gonna loose all context anyway (this is
>>>>> true in TI's and Intel's platforms at least, with the slight difference
>>>>> that Intel can support hibernation feature) and, if we're talking about
>>>>> runtime PM, we're only doing that when cable is disconnected which means
>>>>> that saving any state is pointless anyway.
>>>>>
>>>>
>>>> Agreed if the context is lost. But it might not be so in all cases.
>>>> So at least we need to check if controller state was lost or not before
>>>> doing a reset and full re-init.
>>>
>>> we don't have means to know that context has been lost, that discussion
>>> will get us nowhere ;-)
>>>
>> oh ok. That's bad then.
> 
> not really bad, actually. It's just that certain PM details shouldn't be
> dealt with by drivers anyway. PM is very arch-specific and that's why
> it's hidden away in PCI/ACPI or genpd ;-)
> 
> Drivers should save context when possible and when it makes sense. For
> DWC3 it doesn't make sense to even try to save context unless you have
> Hibernation enabled. The short explanation for that is that DWC3's
> context isn't exactly visible to the driver; it's easy to conclude that
> when you realize that endpoints are configured via *commands*. If you
> save and just rewrite the contents of the EP-related registers, that'll
> actually break the core. DWC3 endpoints have a special state machine to
> get started and that's why we _must_ walk through each command or, in
> case hibernation is enabled, we issue a command to save endpoint state.
> 

OK.

So then even without the diff, host is working fine as re-enumeration is
expected.
However device is still broken with the below backtrace on dra7-evm.
I can't test on am437x-evm as suspend/resume doesn't work there.

root@rockdesk:~# ./suspend.sh 
[   51.596774] PM: Syncing filesystems ... done.
[   51.609571] Freezing user space processes ... (elapsed 0.003 seconds) done.
[   51.620806] Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
[   51.652549] PM: suspend of devices complete after 17.709 msecs
[   51.664166] PM: late suspend of devices complete after 5.395 msecs
[   51.682110] omap_hwmod: gpio7: _wait_target_disable failed
[   51.694788] omap_hwmod: gpio6: _wait_target_disable failed
[   51.701182] PM: noirq suspend of devices complete after 30.503 msecs
[   51.707869] Disabling non-boot CPUs ...
[   51.713454] CPU1: shutdown
[   51.740773] Powerdomain (l4per_pwrdm) didn't enter target state 1
[   51.740773] Powerdomain (l3init_pwrdm) didn't enter target state 1
[   51.740773] Could not enter target state in pm_suspend
[   51.740773] A possible cause could be an old bootloader - try u-boot >= v2012.07
[   51.740826] Enabling non-boot CPUs ...
[   51.790136] CPU1: smp_ops.cpu_die() returned, trying to resuscitate
[   51.790666] CPU1 is up
[   51.803574] PM: noirq resume of devices complete after 3.813 msecs

[   51.814593] PM: early resume of devices complete after 3.766 msecs
[   51.823500] net eth0: initializing cpsw version 1.15 (0)
[   51.910502] Generic PHY 48485000.mdio:02: attached PHY driver [Generic PHY] (mii_bus:phy_addr=48485000.mdio:02, irq=-1)
[   51.930423] usb usb1: root hub lost power or was reset
[   51.935812] usb usb2: root hub lost power or was reset
[   52.148494] ata1: SATA link down (SStatus 0 SControl 300)
[   52.166756] PM: resume of devices complete after 345.657 msecs
[   52.177507] Restarting tasks ... 
[   52.181794] usb 1-1: USB disconnect, device number 3
[   52.192765] done.
root@rockdesk:~# 
root@rockdesk:~# [   52.293578] ------------[ cut here ]------------
[   52.298452] WARNING: CPU: 1 PID: 2120 at drivers/usb/dwc3/gadget.c:2253 dwc3_stop_active_transfer.constprop.1+0x94/0xb0 [dwc3]
[   52.310400] Modules linked in: usb_f_ss_lb g_zero libcomposite usb_storage xhci_plat_hcd xhci_hcd usbcore evdev m25p80 spi_nor dwc3 udc_core usb_common snd_soc_davinci_mcasp snd_soc_edma snd_soc_omap snd_soce
[   52.354125] CPU: 1 PID: 2120 Comm: irq/451-dwc3 Tainted: G        W       4.7.0-rc1-00018-gd3a19ee #873
[   52.363977] Hardware name: Generic DRA74X (Flattened Device Tree)
[   52.370384] [<c011010c>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[   52.378513] [<c010c224>] (show_stack) from [<c0485e48>] (dump_stack+0xac/0xe0)
[   52.386093] [<c0485e48>] (dump_stack) from [<c0137e14>] (__warn+0xd8/0x104)
[   52.393408] [<c0137e14>] (__warn) from [<c0137eec>] (warn_slowpath_null+0x20/0x28)
[   52.401368] [<c0137eec>] (warn_slowpath_null) from [<bf18c894>] (dwc3_stop_active_transfer.constprop.1+0x94/0xb0 [dwc3])
[   52.412839] [<bf18c894>] (dwc3_stop_active_transfer.constprop.1 [dwc3]) from [<bf18d434>] (dwc3_remove_requests+0x20/0xac [dwc3])
[   52.425091] [<bf18d434>] (dwc3_remove_requests [dwc3]) from [<bf18e77c>] (__dwc3_gadget_ep_disable+0x30/0x118 [dwc3])
[   52.436254] [<bf18e77c>] (__dwc3_gadget_ep_disable [dwc3]) from [<bf18ec90>] (dwc3_gadget_ep_disable+0x3c/0xc4 [dwc3])
[   52.447510] [<bf18ec90>] (dwc3_gadget_ep_disable [dwc3]) from [<bf27390c>] (disable_ep+0x2c/0x68 [usb_f_ss_lb])
[   52.458100] [<bf27390c>] (disable_ep [usb_f_ss_lb]) from [<bf274a8c>] (disable_endpoints+0x18/0x50 [usb_f_ss_lb])
[   52.468879] [<bf274a8c>] (disable_endpoints [usb_f_ss_lb]) from [<bf274af0>] (disable_source_sink+0x2c/0x34 [usb_f_ss_lb])
[   52.480487] [<bf274af0>] (disable_source_sink [usb_f_ss_lb]) from [<bf259cc0>] (reset_config+0x48/0x7c [libcomposite])
[   52.491746] [<bf259cc0>] (reset_config [libcomposite]) from [<bf259d20>] (composite_disconnect+0x2c/0x54 [libcomposite])
[   52.503170] [<bf259d20>] (composite_disconnect [libcomposite]) from [<bf17e354>] (usb_gadget_udc_reset+0x10/0x34 [udc_core])
[   52.514972] [<bf17e354>] (usb_gadget_udc_reset [udc_core]) from [<bf18d518>] (dwc3_gadget_reset_interrupt+0x58/0x1ec [dwc3])
[   52.526775] [<bf18d518>] (dwc3_gadget_reset_interrupt [dwc3]) from [<bf18de70>] (dwc3_thread_interrupt+0x32c/0xaac [dwc3])
[   52.538383] [<bf18de70>] (dwc3_thread_interrupt [dwc3]) from [<c01a18b0>] (irq_thread_fn+0x1c/0x54)
[   52.547876] [<c01a18b0>] (irq_thread_fn) from [<c01a1b84>] (irq_thread+0x120/0x1f0)
[   52.555916] [<c01a1b84>] (irq_thread) from [<c015b440>] (kthread+0xd0/0xf0)
[   52.563218] [<c015b440>] (kthread) from [<c01078f0>] (ret_from_fork+0x14/0x24)
[   52.570787] ---[ end trace ba99cf1a8e21418c ]---
[   52.576648] usb 1-1: new high-speed USB device number 4 using xhci-hcd
[   52.717105] usb 1-1: New USB device found, idVendor=0781, idProduct=5539
[   52.724137] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[   52.731651] usb 1-1: Product: Cruzer Micro
[   52.735946] usb 1-1: Manufacturer: SanDisk
[   52.740259] usb 1-1: SerialNumber: 173733160C524A8E
[   52.747831] usb-storage 1-1:1.0: USB Mass Storage device detected
[   52.755166] scsi host1: usb-storage 1-1:1.0

root@rockdesk:~# [   53.757532] scsi 1:0:0:0: Direct-Access     SanDisk  Cruzer Micro     8.02 PQ: 0 ANSI: 0 CCS
[   53.769662] sd 1:0:0:0: [sda] 15695871 512-byte logical blocks: (8.04 GB/7.48 GiB)
[   53.778285] sd 1:0:0:0: [sda] Write Protect is off
[   53.783634] sd 1:0:0:0: [sda] No Caching mode page found
[   53.789240] sd 1:0:0:0: [sda] Assuming drive cache: write through
[   53.803439]  sda:
[   53.807870] sd 1:0:0:0: [sda] Attached SCSI removable disk
[   53.913550] cpsw 48484000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx


--
cheers,
-roger

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux