Re: [PATCH] PCI/PM: Mark devices disconnected if their upstream PCIe link is down on resume

[+cc Kamil, Chris]

On Mon, Sep 18, 2023 at 08:30:41AM +0300, Mika Westerberg wrote:
> Mark Blakeney reported that when suspending a system with a Thunderbolt
> dock connected and then unplugging the dock before resume (which is
> a pretty normal flow with laptops), resuming takes a long time.
> 
> What happens is that the PCIe link from the root port to the PCIe switch
> inside the Thunderbolt device does not train (as expected, the link is
> unplugged):
> 
> [   34.903158] pcieport 0000:00:07.2: restoring config space at offset 0x24 (was 0x3bf12001, writing 0x3bf12001)
> [   34.903231] pcieport 0000:00:07.0: waiting 100 ms for downstream link
> [   36.140616] pcieport 0000:01:00.0: not ready 1023ms after resume; giving up
> 
> However, at this point we still try to resume the devices below that
> unplugged link:
> 
> [   36.140741] pcieport 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
> ...
> [   36.142235] pcieport 0000:01:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
> ...
> [   36.144702] pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation
> 
> And this is the link from the PCIe switch downstream port to the xHCI
> on the dock:
> 
> [   38.380618] xhci_hcd 0000:03:00.0: not ready 1023ms after resume; waiting
> [   39.420587] xhci_hcd 0000:03:00.0: not ready 2047ms after resume; waiting
> [   41.527250] xhci_hcd 0000:03:00.0: not ready 4095ms after resume; waiting
> [   45.793957] xhci_hcd 0000:03:00.0: not ready 8191ms after resume; waiting
> [   54.113950] xhci_hcd 0000:03:00.0: not ready 16383ms after resume; waiting
> [   71.180576] xhci_hcd 0000:03:00.0: not ready 32767ms after resume; waiting
> ...
> [  105.313963] xhci_hcd 0000:03:00.0: not ready 65535ms after resume; giving up
> [  105.314037] xhci_hcd 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible
> [  105.315640] xhci_hcd 0000:03:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
> ...
> 
> This ends up slowing down the resume time considerably. For this reason
> mark these devices as disconnected if the link above them did not train
> properly.
> 
> Fixes: e8b908146d44 ("PCI/PM: Increase wait time after resume")
> Reported-by: Mark Blakeney <mark.blakeney@xxxxxxxxxxxxxxxxxx>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217915
> Signed-off-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>

Applied with Lukas' Reviewed-by to pm for v6.7.

e8b908146d44 appeared in v6.4.  Seems like maybe a candidate for
stable?  IIUC, resume actually does work, but takes 65+ seconds longer
than it should?
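
For reference, the 65+ second figure is consistent with the "not ready
...ms" values in the quoted log, which look like a wait that doubles on
every retry.  A minimal sketch of that arithmetic, assuming the delay
starts at 1 ms and doubles until it would exceed the timeout (the
function name and the timeout values are assumptions for illustration,
not taken from the kernel source):

```c
/*
 * Toy model of a doubling poll loop (an assumption, not the kernel
 * code): sleep 1 ms, 2 ms, 4 ms, ... and give up once the next delay
 * would exceed timeout_ms.  Returns the total time spent waiting, in
 * milliseconds.
 */
int total_wait_ms(int timeout_ms)
{
	int delay = 1;
	int waited = 0;

	while (delay <= timeout_ms) {
		waited += delay;	/* stands in for msleep(delay) */
		delay *= 2;
	}

	return waited;			/* always equals delay - 1 */
}
```

With an assumed 60000 ms budget this gives 65535 ms, the value in the
last "giving up" line for 0000:03:00.0; with a budget around 1000 ms it
gives 1023 ms, the value logged for 0000:01:00.0.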

Kamil also bisected a 60+ second resume delay to e8b908146d44
(https://lore.kernel.org/r/CA+cBOTeWrsTyANjLZQ=bGoBQ_yOkkV1juyRvJq-C8GOrbW6t9Q@xxxxxxxxxxxxxx),
but IIUC at
https://lore.kernel.org/linux-pci/20230824114300.GU3465@xxxxxxxxxxxxxxxxxx/T/#u
you concluded that Kamil's issue was related to firmware and actually
had nothing to do with e8b908146d44.

Do you still think Kamil's issue is unrelated to e8b908146d44 and this
patch?  If so, how do we handle Kamil's issue?  An answer like "users
of v6.4+ must upgrade their Thunderbolt firmware" seems like it would
be kind of a nightmare for users.

> ---
>  drivers/pci/pci-driver.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index a79c110c7e51..51ec9e7e784f 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -572,7 +572,19 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
>  
>  static void pci_pm_bridge_power_up_actions(struct pci_dev *pci_dev)
>  {
> -	pci_bridge_wait_for_secondary_bus(pci_dev, "resume");
> +	int ret;
> +
> +	ret = pci_bridge_wait_for_secondary_bus(pci_dev, "resume");
> +	if (ret) {
> +		/*
> +		 * The downstream link failed to come up, so mark the
> +		 * devices below as disconnected to make sure we don't
> +		 * attempt to resume them.
> +		 */
> +		pci_walk_bus(pci_dev->subordinate, pci_dev_set_disconnected,
> +			     NULL);
> +		return;
> +	}
>  
>  	/*
>  	 * When powering on a bridge from D3cold, the whole hierarchy may be
> -- 
> 2.40.1
> 
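
The idea in the hunk above can be sketched outside the kernel as a toy
model: when the link below a bridge fails to come up, everything beneath
it is flagged, and the resume pass skips flagged devices instead of
waiting on them.  All names below (struct toy_dev and the toy_*
functions) are hypothetical, and how the real resume path checks the
disconnected state is not modeled here:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one device in a PCI hierarchy. */
struct toy_dev {
	bool disconnected;
	bool resumed;
	struct toy_dev *child;		/* first device on the bus below */
	struct toy_dev *sibling;	/* next device on the same bus */
};

/* Flag every device on this bus and on all buses below it. */
void toy_mark_disconnected(struct toy_dev *bus)
{
	for (struct toy_dev *d = bus; d; d = d->sibling) {
		d->disconnected = true;
		toy_mark_disconnected(d->child);
	}
}

/* Resume devices on a bus, skipping flagged ones; returns the count. */
int toy_resume_bus(struct toy_dev *bus)
{
	int count = 0;

	for (struct toy_dev *d = bus; d; d = d->sibling) {
		if (d->disconnected)
			continue;	/* no wait, no config space restore */
		d->resumed = true;
		count += 1 + toy_resume_bus(d->child);
	}

	return count;
}

/*
 * Bridge power-up: if the downstream link did not train, flag the
 * subtree and resume nothing; otherwise resume the devices below.
 */
int toy_bridge_power_up(struct toy_dev *bridge, bool link_up)
{
	if (!link_up) {
		toy_mark_disconnected(bridge->child);
		return 0;
	}

	return toy_resume_bus(bridge->child);
}
```

In this model a root port with a switch and an xHCI below it resumes two
devices when link_up is true and none when it is false, so the
per-device waits in the log never start.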


