Re: [PATCH] PCI/PM: Mark devices disconnected if their upstream PCIe link is down on resume

On Fri, Sep 22, 2023 at 07:59:26AM -0500, Bjorn Helgaas wrote:
> [+cc Thorsten]
> 
> On Fri, Sep 22, 2023 at 07:42:37AM +0300, Mika Westerberg wrote:
> > On Thu, Sep 21, 2023 at 03:19:45PM -0500, Bjorn Helgaas wrote:
> > > On Mon, Sep 18, 2023 at 08:30:41AM +0300, Mika Westerberg wrote:
> > ...
> 
> > > Kamil also bisected a 60+ second resume delay to e8b908146d44
> > > (https://lore.kernel.org/r/CA+cBOTeWrsTyANjLZQ=bGoBQ_yOkkV1juyRvJq-C8GOrbW6t9Q@xxxxxxxxxxxxxx),
> > > but IIUC at
> > > https://lore.kernel.org/linux-pci/20230824114300.GU3465@xxxxxxxxxxxxxxxxxx/T/#u
> > > you concluded that Kamil's issue was related to firmware and actually
> > > had nothing to do with e8b908146d44.
> > > 
> > > Do you still think Kamil's issue is unrelated to e8b908146d44 and this
> > > patch?  If so, how do we handle Kamil's issue?  An answer like "users
> > > of v6.4+ must upgrade their Thunderbolt firmware" seems like it would
> > > be kind of a nightmare for users.
> > 
> > It's a different issue. What happens in his system is that the link went
> > down even though the dock was still connected and this should not happen
> > (the firmware should bring the link up during resume). The delay was
> > just a "symptom".
> 
> Do you have any leads for Kamil's issue?  If we had known that
> e8b908146d44 would cause that problem, we never would have applied it
> in the first place.

I explained it in the other email I just sent. I should mention here
that the two issues are different.

> No OS would accept that resume delay, so there must be some way to fix
> that in the OS without requiring a firmware update.

It is not a "resume" delay. It is the time we wait for the device to
become ready before we decide it is not functional/disconnected. That
delay is completely arbitrary.

> If Kamil's issue is that firmware doesn't bring up the link during
> resume, how *does* the link get brought up, and what does the delay
> have to do with it?

The PCIe tunnel (the "link" above) gets established after D3cold by the
Thunderbolt firmware running inside the host controller. The trigger is
typically the _PR0 ACPI method being called: it sends a special command
through the mailbox that makes the firmware re-connect all the tunnels
that were previously connected.

The delay we are talking about here is the one the PCIe spec requires
the OS to observe after a device has gone through a reset, before it can
send configuration requests to that device. Now, the PCIe spec does not
specify how long the OS should wait for a device on a link that does not
come up. We increased that delay to ~60s to fix another issue on an xHCI
controller but forgot that when the device is deliberately unplugged we
still wait for the ~60s, which is wasted effort and just ends up
annoying users.
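
To illustrate the idea (not the actual patch), here is a rough,
self-contained sketch of a two-phase wait: give the link a short time to
come up and treat the device as disconnected if it does not, and only
spend the long ~60s budget when the link is up but the device is not
ready yet. Everything in it is an assumption made up for the example:
struct sim_port, wait_for_device(), the result codes, the simulated
clock and the 1000 ms link timeout are hypothetical, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for this sketch; not kernel API. */
enum wait_result { DEV_READY, DEV_DISCONNECTED, DEV_TIMEOUT };

/* Hypothetical model of a port: when things happen on a simulated clock. */
struct sim_port {
	int link_up_at_ms;	/* link becomes active at this time, -1 = never */
	int dev_ready_at_ms;	/* device answers config requests, -1 = never */
};

static bool link_active(const struct sim_port *p, int now_ms)
{
	return p->link_up_at_ms >= 0 && now_ms >= p->link_up_at_ms;
}

static bool dev_ready(const struct sim_port *p, int now_ms)
{
	return link_active(p, now_ms) &&
	       p->dev_ready_at_ms >= 0 && now_ms >= p->dev_ready_at_ms;
}

/*
 * Poll in step_ms increments. Stores the simulated time spent in
 * *waited_ms and returns what the caller should conclude.
 */
enum wait_result wait_for_device(const struct sim_port *p, int link_wait_ms,
				 int total_wait_ms, int step_ms,
				 int *waited_ms)
{
	int now = 0;

	assert(step_ms > 0);

	/* Phase 1: short wait for the link; no link means unplugged. */
	while (!link_active(p, now)) {
		if (now >= link_wait_ms) {
			*waited_ms = now;
			return DEV_DISCONNECTED;
		}
		now += step_ms;
	}

	/* Phase 2: link is up, so a slow device gets the long budget. */
	while (!dev_ready(p, now)) {
		if (now >= total_wait_ms) {
			*waited_ms = now;
			return DEV_TIMEOUT;
		}
		now += step_ms;
	}

	*waited_ms = now;
	return DEV_READY;
}
```

With an assumed 1000 ms link timeout and a 60000 ms total budget, an
unplugged device (link never comes up) is given up on after 1000 ms
instead of 60000 ms, while a slow device behind a working link still
gets the full budget.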


