On Fri, Oct 25, 2019 at 03:11:49PM +0800, Daniel Drake wrote: > On Fri, Oct 25, 2019 at 1:00 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > That's really strange. Your original message showed: > > > > xhci_hcd 0000:03:00.4: Refused to change power state, currently in D3 > > xhci_hcd 0000:03:00.4: enabling device (0000 -> 0002) > > > > The first line is from pci_raw_set_power_state() reading PCI_PM_CTRL, > > but we can't tell whether the read failed and we got ~0, or it > > succeeded and we got something with just the low two bits set. Can > > you print out the whole value so we can see what happened? > > > > The second line is from pci_enable_resources() reading PCI_COMMAND, > > and it got *0*, not 0x0403 as you got from the CRS experiment. > > Thanks for persisting here. In more detail: > > pci_pm_resume_noirq > - pci_pm_default_resume_early > -- pci_raw_set_power_state(D0) > > At this point, pci_dev_wait() reads PCI_COMMAND to be 0x100403 (32-bit > read) - so no wait. Just thinking out loud here: This is before writing PCI_PM_CTRL. The device should be in D3hot and 0x100403 is PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_INTX_DISABLE (and PCI_STATUS_CAP_LIST), which mostly matches your lspci (it's missing PCI_COMMAND_MASTER, but maybe that got turned off during suspend). It's a little strange that PCI_COMMAND_IO is set because 03:00.3 has no I/O BARs, but maybe that was set by BIOS at boot-time. > pci_raw_set_power_state writes to PM_CTRL and then reads it back > with value 0x3. When you write D0 to PCI_PM_CTRL the device does a soft reset, so pci_raw_set_power_state() delays before the next access. When you read PCI_PM_CTRL again, I think you *should* get either 0x0000 (indicating that the device is in D0) or 0xffff (if the read failed with a Config Request Retry Status (CRS) because the device wasn't ready yet). I can't explain why you would read 0x0003 (not 0xffff) from PCI_PM_CTRL. What happens if you do a dword read from PCI_VENDOR_ID here (after the delay but before pci_dev_wait() or reading PCI_PM_CTRL)? We have CRS "software visibility" enabled, and the expectation in the spec is that software will read PCI_VENDOR_ID to see whether the device is ready: 0x0001 means the read got a CRS completion (device isn't ready), valid Vendor ID means device is ready, and 0xffff indicates some other error. pci_dev_wait() reads PCI_COMMAND, not PCI_VENDOR_ID, so maybe there's some wrinkle in how that's handled. You might also try changing pci_enable_crs() to disable PCI_EXP_RTCTL_CRSSVE instead of enabling it to see if that makes any difference. CRS SV has kind of a checkered history and I'm a little dubious about whether it buys us anything. > > xhci_hcd 0000:03:00.4: Refused to change power state, currently in D3 > > At the point of return from pci_pm_resume_noirq, an extra check I > added shows that PCI_COMMAND has value 0x403 (16-bit read). If PCI_COMMAND is non-zero at that point, I think something's wrong. It should be zero by the time pci_raw_set_power_state() reads PCI_PM_CTRL after the D3 delay. By that time, we assume the reset has happened and the device is in D0uninitialized and fully accessible. > 35ms later, pci_pm_resume is entered, and I checked that at this > point, PCI_COMMAND has value 0. > It then goes on to reach pci_enable_resources(). > > xhci_hcd 0000:03:00.4: enabling device (0000 -> 0002) > > The change in PCI_COMMAND value is just down to timing. > At the end of pci_pm_resume_noirq(), if I log PCI_COMMAND, wait 10ms, > and log PCI_COMMAND again, I see it transition from 0x403 to 0. > > Daniel