On Wed, Aug 28, 2019 at 4:43 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > I'll report it to the vendor,
>
> Yes, please. At least try to get the information on what the exact
> platform expectations with respect to the OS are. Quite evidently,
> they aren't just "do the usual thing".

AMD's response was:
> It’s modern stanby system, we don’t have any resource to debug Linux issue on it.

I imagine there are people at AMD who do care, but we don't have the
right contacts here. If you happen to know anyone you could forward
this thread to, please do.

Anyway, I looked again and found some more interesting points and a
likely workaround, if you can help me nail down the details.

I repeated the experiment of runtime-suspending and resuming the USB
controller. As before, the problematic step is waking up. In
pci_raw_set_power_state() the pmcsr is first read as value 103, then
written as 0, then it msleeps for 10ms (in pci_dev_d3_sleep()), then
reads back value 3. What I hadn't spotted before is that even though
it failed to change the power state bits, bit 8 did get successfully
unset, indicating that the device is not completely dead.

I then increased the msleep delay to 20ms and now it resumes fine and
USB devices work.

Unfortunately it's not quite as simple as quirking d3_delay, because
the problem still happens upon resume from s2idle. In that case,
pci_dev_d3_sleep() is not called at all:

	if (state == PCI_D3hot || dev->current_state == PCI_D3hot)
		pci_dev_d3_sleep(dev);

In the runtime resume case, dev->current_state == PCI_D3hot here (even
though pci_set_power_state() had been called to put it into D3cold),
so the msleep happens. But in the system sleep (s2idle) case,
dev->current_state == PCI_D3cold here, so no sleep happens. That is
strangely inconsistent - is it a bug?

I also noticed that there is a 100ms d3cold_delay, but that seems to
happen before pmcsr is accessed at all, and doesn't have any effect
here.
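
To make the two observations above concrete, here is a minimal
userspace sketch (not kernel code). It assumes the pmcsr values quoted
above are hexadecimal (0x103 and 0x3), which is what the bit 8
observation implies. The helper names are hypothetical; the mask values
and the D-state numbering are meant to mirror the kernel's
PCI_PM_CTRL_* definitions and pci_power_t values.

```c
#include <stdint.h>

/* Same values as the kernel's PCI_PM_CTRL_* definitions. */
#define PCI_PM_CTRL_STATE_MASK	0x0003	/* bits 1:0, current power state */
#define PCI_PM_CTRL_PME_ENABLE	0x0100	/* bit 8, PME enable */

/* Same numbering as the kernel's pci_power_t values. */
enum pm_state { D0 = 0, D1 = 1, D2 = 2, D3HOT = 3, D3COLD = 4 };

/* Hypothetical helper: extract the power state field from pmcsr. */
static int pmcsr_state(uint16_t pmcsr)
{
	return pmcsr & PCI_PM_CTRL_STATE_MASK;
}

/* Hypothetical helper: is bit 8 (PME enable) set? */
static int pmcsr_pme_enabled(uint16_t pmcsr)
{
	return (pmcsr & PCI_PM_CTRL_PME_ENABLE) != 0;
}

/*
 * Hypothetical model of the condition quoted above: the D3hot delay
 * only runs if the target state or the cached current state is D3hot.
 */
static int d3_sleep_happens(enum pm_state target, enum pm_state current)
{
	return target == D3HOT || current == D3HOT;
}
```

With these helpers, pmcsr_state() reports D3hot for both 0x103 and
0x3, while pmcsr_pme_enabled() goes from set to clear, matching the
"bit 8 did get unset" observation. d3_sleep_happens(D0, D3HOT) is true
(the runtime resume case) and d3_sleep_happens(D0, D3COLD) is false
(the s2idle case).
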
However, I did also notice that there is no d3cold_delay done during
wakeup from s2idle; it only happens on wakeup from runtime suspend.
The code does seem to be written that way (runtime_d3cold flag) but I
wonder if that is correct. From the standpoint of the ACPI PM specs,
is there a difference between runtime suspend and s2idle suspend?
Since there is no firmware-based system suspend happening, I wonder if
the d3cold_delay should apply in both cases.

I compared the behaviour to another system with an AMD Ryzen 5 3500U.
It's not quite the same SoC but the XHCI controllers have the same PCI
IDs. On that platform, I was able to reproduce the failure to runtime
resume, and there too it succeeds with a d3_delay of 20ms. On system
suspend/resume, this other platform uses S3, and the XHCI controller
is already in D0 upon resume (looks like the firmware turns on the USB
controllers for us, so Linux avoids any difficulties there).

That supports quirking these XHCI controllers (based on PCI ID) to
have a d3_delay of 20ms, but first we need to nail down why that delay
is not applied at all during resume from s2idle.

Daniel