Re: [PATCH 1/3] PCI/ASPM: Use the path max in L1 ASPM latency check

Ian Kumlien <ian.kumlien@xxxxxxxxx> · Thu, 28 Jan 2021 13:41:57 +0100

Sorry about the late reply, been trying to figure out what goes wrong
since this email...

And yes, I think you're right - the fact that it fixed my system was
basically too good to be true =)

On Tue, Jan 12, 2021 at 9:42 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> My guess is the real problem is the Switch is advertising incorrect
> exit latencies.  If the Switch advertised "<64us" exit latency for its
> Upstream Port, we'd compute "64us exit latency + 1us Switch delay =
> 65us", which is more than either 03:00.0 or 04:00.0 can tolerate, so
> we would disable L1 on that upstream Link.
>
> Working around this would require some sort of quirk to override the
> values read from the Switch, which is a little messy.  Here's what I'm
> thinking (not asking you to do this; just trying to think of an
> approach):

The question is if it's the device or if it's the bridge...

I.e., whether the device can't quite handle the latency or the
bridge/switch/port advertises the wrong one.

I have a friend with a similar motherboard and he sees the same latency
values, but his kernel doesn't apply ASPM.

I also want to check L0s since it seems to be enabled...
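
For reference, the arithmetic in the quoted paragraph can be sketched in
plain userspace C. This is only my reading of it, not the kernel code:
the helper names are made up, the encodings are the L1 Exit Latency
field (Link Capabilities bits 17:15) and the Endpoint L1 Acceptable
Latency field (Device Capabilities bits 11:9), and the 1us per Switch is
the delay mentioned above.

```c
#include <stdint.h>

/* L1 Exit Latency encoding (LnkCap bits 17:15) -> upper bound in ns. */
static uint32_t l1_exit_ns(uint32_t enc)
{
	if (enc == 0x7)
		return 65 * 1000;	/* ">64us", treated as 65us here */
	return 1000u << enc;		/* 0 -> 1us ... 6 -> 64us */
}

/* Endpoint L1 Acceptable Latency encoding (DevCap bits 11:9) -> ns. */
static uint32_t l1_acceptable_ns(uint32_t enc)
{
	if (enc == 0x7)
		return UINT32_MAX;	/* no limit */
	return 1000u << enc;		/* 0 -> 1us ... 6 -> 64us */
}

/* Hypothetical helper: exit latency plus 1us for each Switch on the path. */
static uint32_t path_l1_latency_ns(uint32_t exit_enc, uint32_t nr_switches)
{
	return l1_exit_ns(exit_enc) + nr_switches * 1000u;
}

/* Hypothetical helper: nonzero if the endpoint can tolerate the path latency. */
static int l1_latency_ok(uint32_t exit_enc, uint32_t nr_switches,
			 uint32_t accept_enc)
{
	return path_l1_latency_ns(exit_enc, nr_switches) <=
	       l1_acceptable_ns(accept_enc);
}
```

With an exit latency encoding of 6 (upper bound 64us) and one Switch on
the path, path_l1_latency_ns() gives 65us, so an endpoint that accepts
at most 64us (encoding 6, an assumed value here) fails the check.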

>   - Configure common clock earlier, in pci_configure_device(), because
>     this changes the "read-only" L1 exit latencies in Link
>     Capabilities.
>
>   - Cache Link Capabilities in the pci_dev.
>
>   - Add a quirk to override the cached values for the Switch based on
>     Vendor/Device ID and probably DMI motherboard/BIOS info.

I can't say; I think it depends on the actual culprit, which we
haven't quite identified yet...
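
Just to make the "cache and override" idea concrete in the meantime,
here is a rough userspace model of it. Nothing in it is a kernel API:
struct fake_dev, the quirk table and the Vendor/Device IDs are made up
for illustration, and a real quirk would presumably also match on DMI
info as suggested. The only thing taken from the spec is that the L1
Exit Latency field sits in Link Capabilities bits 17:15.

```c
#include <stddef.h>
#include <stdint.h>

#define LNKCAP_L1EL_MASK	0x00038000u	/* LnkCap bits 17:15 */
#define LNKCAP_L1EL_SHIFT	15

/* Stand-in for a device with a cached copy of Link Capabilities. */
struct fake_dev {
	uint16_t vendor;
	uint16_t device;
	uint32_t lnkcap;
};

/* One override entry: which device, and the L1 exit encoding to force. */
struct lnkcap_quirk {
	uint16_t vendor;
	uint16_t device;
	uint32_t l1_exit_enc;
};

static const struct lnkcap_quirk quirks[] = {
	{ 0xabcd, 0x1234, 0x7 },	/* made-up IDs, force ">64us" */
};

/* Rewrite only the cached L1 exit latency bits; returns 1 if a quirk hit. */
static int apply_lnkcap_quirk(struct fake_dev *dev)
{
	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		if (quirks[i].vendor != dev->vendor ||
		    quirks[i].device != dev->device)
			continue;

		dev->lnkcap &= ~LNKCAP_L1EL_MASK;
		dev->lnkcap |= (quirks[i].l1_exit_enc << LNKCAP_L1EL_SHIFT) &
			       LNKCAP_L1EL_MASK;
		return 1;
	}
	return 0;
}
```

The point of working on the cached value is that the latency check
would then see the overridden number without any further special
casing, and devices that don't match are left untouched.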

> Bjorn