Re: [PATCH] PCI: increase D3 delay for AMD Ryzen5/7 XHCI controllers

On Sat, Oct 26, 2019 at 12:28 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > pci_pm_resume_noirq
> > - pci_pm_default_resume_early
> > -- pci_raw_set_power_state(D0)
> >
> > At this point, pci_dev_wait() reads PCI_COMMAND to be 0x100403 (32-bit
> > read) - so no wait.
>
> Just thinking out loud here: This is before writing PCI_PM_CTRL. The

It's not - it's after writing PCI_PM_CTRL, but before reading it back.

> device should be in D3hot and 0x100403 is PCI_COMMAND_IO |
> PCI_COMMAND_MEMORY | PCI_COMMAND_INTX_DISABLE (and
> PCI_STATUS_CAP_LIST), which mostly matches your lspci (it's missing
> PCI_COMMAND_MASTER, but maybe that got turned off during suspend).
> It's a little strange that PCI_COMMAND_IO is set because 03:00.3 has
> no I/O BARs, but maybe that was set by BIOS at boot-time.

I also checked PCI_COMMAND before writing PCI_PM_CTRL: it has the same
value, 0x403.
Immediately after writing PCI_PM_CTRL, it holds the same value.
10ms later (after pci_dev_d3_sleep()), it holds the same value.
Another 10ms later, it has value 0.
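
For reference, here is a minimal sketch of how the 32-bit value 0x100403
read at offset PCI_COMMAND splits into the 16-bit command and status
registers. The bit definitions are the ones in
include/uapi/linux/pci_regs.h; the two helper names are made up for
illustration and do not exist in the kernel.

```c
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_COMMAND_IO            0x0001
#define PCI_COMMAND_MEMORY        0x0002
#define PCI_COMMAND_MASTER        0x0004
#define PCI_COMMAND_INTX_DISABLE  0x0400
#define PCI_STATUS_CAP_LIST       0x0010

/* Hypothetical helpers: a dword read at PCI_COMMAND returns the command
 * register in the low 16 bits and the status register in the high 16. */
static inline uint16_t dword_to_command(uint32_t dword)
{
	return (uint16_t)(dword & 0xffff);
}

static inline uint16_t dword_to_status(uint32_t dword)
{
	return (uint16_t)(dword >> 16);
}
```

With these, 0x100403 gives a command value of 0x0403 (IO, MEMORY and
INTX_DISABLE set, MASTER clear) and a status value of 0x0010 (CAP_LIST),
which is why the 16-bit read reports 0x403.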

> > pci_raw_set_power_state writes to PM_CTRL and then reads it back
> > with value 0x3.
>
> When you write D0 to PCI_PM_CTRL the device does a soft reset, so
> pci_raw_set_power_state() delays before the next access.
>
> When you read PCI_PM_CTRL again, I think you *should* get either
> 0x0000 (indicating that the device is in D0) or 0xffff (if the read
> failed with a Config Request Retry Status (CRS) because the device
> wasn't ready yet).

PCI_PM_CTRL starts with value 0x103.
Then 0 is written and pci_dev_d3_sleep() delays 10ms.
At this point it has value 0x3.
After an additional 10ms delay, it has value 0.
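
To make those three values easier to read, here is a small sketch that
decodes them. The masks are the ones from include/uapi/linux/pci_regs.h;
the helper names are hypothetical.

```c
#include <stdint.h>

/* Field definitions as in include/uapi/linux/pci_regs.h */
#define PCI_PM_CTRL_STATE_MASK  0x0003	/* 0 = D0 ... 3 = D3hot */
#define PCI_PM_CTRL_PME_ENABLE  0x0100

/* Hypothetical helper: power state field of a PCI_PM_CTRL value */
static inline unsigned int pm_ctrl_state(uint16_t pmcsr)
{
	return pmcsr & PCI_PM_CTRL_STATE_MASK;
}

/* Hypothetical helper: non-zero if PME generation is enabled */
static inline int pm_ctrl_pme_enabled(uint16_t pmcsr)
{
	return (pmcsr & PCI_PM_CTRL_PME_ENABLE) != 0;
}
```

So 0x103 is D3hot with PME enabled, 0x3 is still D3hot with PME enable
cleared by the write, and 0 is D0.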

> I can't explain why you would read 0x0003 (not 0xffff) from
> PCI_PM_CTRL.
>
> What happens if you do a dword read from PCI_VENDOR_ID here (after the
> delay but before pci_dev_wait() or reading PCI_PM_CTRL)?

Vendor ID remains 0x1022 at all points.

> You might also try changing pci_enable_crs() to disable
> PCI_EXP_RTCTL_CRSSVE instead of enabling it to see if that makes any
> difference.  CRS SV has kind of a checkered history and I'm a little
> dubious about whether it buys us anything.

I stubbed out that register write, which would otherwise have applied
to 8 PCI devices (but not the XHCI controllers). It still fails in the
same way unless the delay is increased.

> > >   xhci_hcd 0000:03:00.4: Refused to change power state, currently in D3
> >
> > At the point of return from pci_pm_resume_noirq, an extra check I
> > added shows that PCI_COMMAND has value 0x403 (16-bit read).
>
> If PCI_COMMAND is non-zero at that point, I think something's wrong.
> It should be zero by the time pci_raw_set_power_state() reads
> PCI_PM_CTRL after the D3 delay.  By that time, we assume the reset has
> happened and the device is in D0uninitialized and fully accessible.

It looks like we can detect that the reset has failed (or more
precisely, not quite completed) by reading PCI_COMMAND (value not yet
0) or PCI_PM_CTRL (doesn't yet indicate D0 state, we already log a
warning for this case).

From that angle, another workaround possibility is to catch that case
and then retry the PCI_PM_CTRL write and delay once more.
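
A rough userspace sketch of that retry idea follows. It is not kernel
code and not a proposed patch: struct fake_dev and all the fake_*
functions are invented stand-ins for the device and the config
accessors, and the model assumes that a repeated D0 write does not
restart the device's internal transition, which has not been verified
on the real hardware.

```c
#include <stdint.h>

#define PCI_PM_CTRL_STATE_MASK  0x0003
#define PCI_D0                  0
#define PCI_D3hot               3
#define D3_DELAY_MS             10	/* default D3 delay discussed here */

/* Invented model of a device that needs ready_after_ms to reach D0 */
struct fake_dev {
	uint16_t pmcsr;		/* simulated PCI_PM_CTRL contents */
	int ready_after_ms;	/* time needed after the first D0 write */
	int elapsed_ms;		/* time since that write, -1 if none pending */
};

static void fake_write_pmcsr(struct fake_dev *d, uint16_t val)
{
	/* Assumption: only the first D0 write starts the transition */
	if ((val & PCI_PM_CTRL_STATE_MASK) == PCI_D0 &&
	    (d->pmcsr & PCI_PM_CTRL_STATE_MASK) != PCI_D0 &&
	    d->elapsed_ms < 0)
		d->elapsed_ms = 0;
}

static uint16_t fake_read_pmcsr(const struct fake_dev *d)
{
	return d->pmcsr;
}

static void fake_sleep(struct fake_dev *d, int ms)
{
	if (d->elapsed_ms < 0)
		return;
	d->elapsed_ms += ms;
	if (d->elapsed_ms >= d->ready_after_ms) {
		/* Transition complete: state field becomes D0 (0) */
		d->pmcsr = (uint16_t)(d->pmcsr & ~PCI_PM_CTRL_STATE_MASK);
		d->elapsed_ms = -1;
	}
}

/*
 * Write D0, wait the D3 delay, read back; if the device does not yet
 * report D0, repeat the write and delay. Returns the number of attempts
 * used, or -1 if the device never reported D0.
 */
static int set_d0_with_retry(struct fake_dev *d, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		fake_write_pmcsr(d, PCI_D0);
		fake_sleep(d, D3_DELAY_MS);
		if ((fake_read_pmcsr(d) & PCI_PM_CTRL_STATE_MASK) == PCI_D0)
			return attempt;
	}
	return -1;
}
```

With a simulated device that needs 20ms (matching the timings seen
above), one retry is enough: the first read-back still shows D3hot and
the second shows D0.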

Daniel


