On Friday, July 5, 2019 9:02:01 AM CEST Kai-Heng Feng wrote:
> at 19:57, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> > On Mon, May 27, 2019 at 11:57:47AM -0500, Bjorn Helgaas wrote:
> >> On Thu, May 23, 2019 at 12:39:23PM +0800, Kai-Heng Feng wrote:
> >>> at 04:52, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>>> On Wed, May 22, 2019 at 02:39:56PM -0400, Alan Stern wrote:
> >>>>> On Wed, 22 May 2019, Bjorn Helgaas wrote:
> >>>>>> On Wed, May 22, 2019 at 11:46:25PM +0800, Kai Heng Feng wrote:
> >>>>>>>> On May 22, 2019, at 9:48 PM, Bjorn Helgaas <helgaas@xxxxxxxxxx>
> >>>>>>>> wrote:
> >>>>>>>> On Wed, May 22, 2019 at 11:42:14AM +0800, Kai Heng Feng wrote:
> >>>>>>>>> at 6:23 AM, Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>>>>>>>>> On Wed, May 22, 2019 at 12:31:04AM +0800, Kai-Heng Feng wrote:
> >>>>>>>>>>> There's an xHC device that doesn't wake when
> >>>>>>>>>>> a USB device gets plugged
> >>>>>>>>>>> to its USB port. The driver's own runtime
> >>>>>>>>>>> suspend callback was called,
> >>>>>>>>>>> PME signaling was enabled, but it stays at PCI D0.
> >>>>>>
> >>>>>>>> ...
> >>>>>>>> And I guess this patch basically means we wouldn't call
> >>>>>>>> the driver's suspend callback if we're merely going to
> >>>>>>>> stay at D0, so the driver would have no idea anything
> >>>>>>>> happened. That might match Documentation/power/pci.txt
> >>>>>>>> better, because it suggests that the suspend callback is
> >>>>>>>> related to putting a device in a low-power state, and D0
> >>>>>>>> is not a low-power state.
> >>>>>>>
> >>>>>>> Yes, the patch is to let the device stay at D0 and don’t run
> >>>>>>> driver’s own runtime suspend routine.
> >>>>>>>
> >>>>>>> I guess I’ll just proceed to send a V2 with updated commit message?
> >>>>>>
> >>>>>> Now that I understand what "runtime suspended to D0" means, help me
> >>>>>> understand what's actually wrong.
> >>>>>
> >>>>> Kai's point is that the xhci-hcd driver thinks the device is now
> >>>>> in runtime suspend, because the runtime_suspend method has been
> >>>>> executed. But in fact the device is still in D0, and as a
> >>>>> result, PME signalling may not work correctly.
> >>>>
> >>>> The device claims to be able to signal PME from D0 (this is from the
> >>>> lspci
> >>>> in https://bugzilla.kernel.org/show_bug.cgi?id=203673):
> >>>>
> >>>>   00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 20) (prog-if 30 [XHCI])
> >>>>     Capabilities: [50] Power Management version 3
> >>>>       Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >>>>
> >>>> From the xHCI spec r1.0, sec 4.15.2.3, it looks like a connect
> >>>> detected while in D0 should assert PME# if enabled (and WCE is
> >>>> set).
> >>>
> >>> I think section 4.15.2.3 is about S3 wake up, no S0 we are
> >>> discussing here.
> >>
> >> S0 and S3 are system-level ideas and have no meaning to an individual
> >> PCI device. The xHC is a PCI device and can't tell whether the system
> >> as a whole is in S0 or S3. If a PCI device claims to be able to
> >> generate PME while in D0, that applies regardless of the system state.
> >>
> >> xHCI r1.0, sec A.1 says "The host controller should be capable of
> >> asserting PME# when in any supported device state." In sec 4.19.2,
> >> Figure 42 says PME# should be asserted whenever PMCSR.PME_En=1 and
> >> WCE=1 and a connection is detected.
> >>
> >> Figure 42 also shows that CSC (Connect Status Change) and related bits
> >> feed into Port Status Change Event Generation. So I assume the xhci
> >> driver normally detects connect/disconnect via CSC, but the runtime
> >> suspend method makes it use PME# instead?
> >>
> >> And the way your patch works is by avoiding that xhci runtime suspend
> >> method, so it *always* uses CSC and never uses PME#?
> >> If that's the
> >> case, we're just papering over a problem without really understanding
> >> it.
> >>
> >> I'm wondering if this platform has a firmware defect. Here's my
> >> thinking. The xHC is a Root Complex Integrated Endpoint, so its PME
> >> signaling is a little unusual.
> >>
> >> The typical scenario is that a PCIe device is below a Root Port. In
> >> that case, it would send a PME Message upstream to the Root Port. Per
> >> PCIe r4.0, sec 6.1.6, when configured for native PME support (for ACPI
> >> systems, I assume this means "when firmware has granted PME control to
> >> the OS via _OSC"), the Root Port would generate a normal PCI INTx or
> >> MSI interrupt:
> >>
> >>   PCI Express-aware software can enable a mode where the Root Complex
> >>   signals PME via an interrupt. When configured for native PME
> >>   support, a Root Port receives the PME Message and sets the PME
> >>   Status bit in its Root Status register. If software has set the PME
> >>   Interrupt Enable bit in the Root Control register to 1b, the Root
> >>   Port then generates an interrupt.
> >>
> >> But on this platform the xHC is a Root Complex Integrated Endpoint, so
> >> there is no Root Port upstream from it, and that mechanism can't be
> >> used. Per PCIe r4.0, sec 1.3.2.3, RCiEPs signal PME via "the same
> >> mechanism as PCI systems" or via Root Complex Event Collectors:
> >>
> >>   An RCiEP must signal PME and error conditions through the same
> >>   mechanisms used on PCI systems. If a Root Complex Event Collector is
> >>   implemented, an RCiEP may optionally signal PME and error conditions
> >>   through a Root Complex Event Collector.
> >>
> >> This platform has no Root Complex Event Collectors, so the xHC should
> >> signal PME via the same mechanism as PCI systems, i.e., asserting a
> >> PME# signal. I think this means the OS cannot use native PCIe PME
> >> control because it doesn't know what interrupt PME# is connected to.
> >> The PCI Firmware Spec r3.2, sec 4.5.1 (also quoted in ACPI v6.2, sec
> >> 6.2.11.3), says:
> >>
> >>   PCI Express Native Power Management Events control
> >>
> >>   The firmware sets this bit to 1 to grant control over PCI Express
> >>   native power management event interrupts (PMEs). If firmware
> >>   allows the operating system control of this feature, then in the
> >>   context of the _OSC method, it must ensure that all PMEs are
> >>   routed to root port interrupts as described in the PCI Express
> >>   Base Specification.
> >>
> >> This platform cannot route all PMEs to Root Port interrupts because
> >> the xHC RCiEP cannot report PME via a Root Port, so I think its _OSC
> >> method should not grant control of PCIe Native Power Management Events
> >> to the OS, and I think that would mean we have to use the ACPI
> >> mechanism for PME on this platform.
> >>
> >> Can you confirm or deny any of this line of reasoning? I'm wondering
> >> if there's something wrong with the platform's _OSC, so Linux thinks
> >> it can use native PME, but that doesn't work for this device.
> >>
> >>> It’s a platform in development so the name can’t be disclosed.
> >>
> >> Please attach a complete dmesg log to the bugzilla. You can remove
> >> identifying details like the platform name, but I want to see the
> >> results of the _OSC negotiation.
> >
> > Thanks for the dmesg log
> > (https://bugzilla.kernel.org/attachment.cgi?id=283109). It shows:
> >
> >   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
> >   acpi PNP0A08:00: _OSC: platform does not support [SHPCHotplug LTR]
> >   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
> >
> > I think it is incorrect for the platform to give the OS native control
> > over PME because the OS has no way to know how the RCiEP PMEs are
> > routed.
> > But it would be interesting to know how BIOSes on other
> > platforms with RCiEPs handle this, and I did post a question to the
> > PCI-SIG to see if there's any guidance there.
>
> Is there any update from PCI-SIG?
>
> I really think we don’t need wakeup capability in D0 because D0 is a
> working state.

Well, in theory, devices may stay in D0 over suspend-to-idle and they may
need to signal wakeup then. Using PME for that would be kind of handy (if
it worked) as it would allow special handling of in-band IRQs to be avoided
in that case.
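
For reference, the PME(D0+,D1-,D2-,D3hot+,D3cold+) flags in the lspci output
quoted above are a decoding of the PME_Support field (bits 15:11) of the PCI
Power Management Capabilities (PMC) register. Below is a minimal sketch of
that decoding. The PMC value is rebuilt by hand from the quoted flags rather
than read from the hardware in question, and the helper name pme_support is
made up for illustration.

```python
# PME_Support occupies bits 15:11 of the PCI PM Capabilities (PMC) register;
# one bit per device power state from which PME may be signaled.
PME_SUPPORT_BITS = {
    "D0": 0x0800,
    "D1": 0x1000,
    "D2": 0x2000,
    "D3hot": 0x4000,
    "D3cold": 0x8000,
}


def pme_support(pmc):
    """Return the device states from which PMC claims PME can be signaled."""
    return [state for state, bit in PME_SUPPORT_BITS.items() if pmc & bit]


# Hypothetical PMC value rebuilt from the quoted lspci flags
# (PM version 3, PME from D0, D3hot and D3cold); not read from real hardware.
pmc = 0x0003 | 0x0800 | 0x4000 | 0x8000

print(pme_support(pmc))  # ['D0', 'D3hot', 'D3cold']
```

A set D0 bit only means the device advertises PME from D0. Whether the
platform actually routes that PME anywhere useful is the open question in
this thread.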