On Fri, Apr 10, 2020 at 05:30:58PM -0500, Bjorn Helgaas wrote:
> [breadcrumbs for the archives: this thread is about
> 26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports"),
> https://git.kernel.org/linus/26ad34d510a8]
>
> On Thu, Apr 09, 2020 at 09:54:11AM +0300, Mika Westerberg wrote:
> > On Wed, Apr 08, 2020 at 03:12:08PM -0500, Bjorn Helgaas wrote:
> > > On Wed, Apr 08, 2020 at 09:04:09AM +0300, Mika Westerberg wrote:
> > > > On Tue, Apr 07, 2020 at 06:54:23PM -0500, Bjorn Helgaas wrote:
> > > > > On Thu, Sep 13, 2018 at 05:33:22PM +0300, Mika Westerberg wrote:
> > > > > > In order to have better power management for Thunderbolt PCIe chains,
> > > > > > Windows enables power management for native PCIe hotplug ports if the
> > > > > > following ACPI _DSD is attached to the root port:
> > > > > >
> > > > > >   Name (_DSD, Package () {
> > > > > >       ToUUID ("6211e2c0-58a3-4af3-90e1-927a4e0c55a4"),
> > > > > >       Package () {
> > > > > >           Package () {"HotPlugSupportInD3", 1}
> > > > > >       }
> > > > > >   })
> > > > > >
> > > > > > This is also documented in:
> > > > > >
> > > > > >   https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
> > > > >
> > > > > This doc basically says that if the platform supplies this _DSD, the
> > > > > root port is "capable of handling hot plug events while in D3 state".
> > > > >
> > > > > What does that mean?  That statement is not really actionable.  I
> > > > > *assume* it's telling us about some specific hardware or firmware
> > > > > functionality, like maybe we'll get a notification for hotplug events
> > > > > when the device is in D3?  D3hot?  D3cold?  What is the notification?
> > > > > Is it immediate or when the device comes back to D0?  How do we
> > > > > control and field the notification?
> > > >
> > > > I think it simply gives the OS a hint that it can put PCIe hotplug
> > > > capable port into D3 and expect it to wake up when device is detected.
> > >
> > > I'd really like more specific details than this.  PCI power management
> > > is explicitly controlled by software, so I don't know what it means
> > > for a bridge to "wake up when device is detected."
> >
> > Note these are for PCIe which is not the same as the parallel PCI.
>
> Sure, but the PCI power management framework doesn't need to know
> about PCI/PCIe differences.

Well it needs to deal with certain differences anyway like the required
delays etc.

> > What I mean here is that there is some sort of wake depending on whether
> > the link is in L1 or L2/L3, resulting in the link going to L0 and a
> > PME message being sent over to the root complex.
>
> This is compatible with conventional PCI behavior.  PME_Support in the
> PM Capability tells us whether the device can generate a PME from D0,
> D1, D2, D3hot, or D3cold.  AFAIK, this applies to both conventional
> PCI and PCIe, so I think this _DSD must be telling us something more
> than just "this PCIe device can assert PME from D3hot and D3cold."
>
> PME_Support tells us whether the device can generate PMEs from D3cold,
> but of course it doesn't say anything about whether hotplug events
> cause those PMEs.
>
> PCIe r5.0, sec 6.7.3.4, says ports must support PMEs for hotplug
> events while the device is in D1, D2, or D3hot.  It clearly leaves out
> D3cold.
>
> So maybe this _DSD tells us that this device can generate *hotplug*
> event PMEs from D3cold?

Yes, that sounds reasonable.

> > > Normally Linux would get some kind of notification like a PME, then
> > > execute an ACPI method and/or write PCI_PM_CTRL to put the device back
> > > in D0.
> >
> > Right.
> >
> > > Are we talking about D3hot or D3cold?
> >
> > My interpretation is that both (D3 implies both D3hot and D3cold) but I
> > did not write that spec.
> >
> > Systems where this is used typically go down to D3cold with the PCIe
> > topology (links are in L2).
> >
> > > The PCI PM capability already has a way to advertise that it can
> > > generate PME from D3cold.  How is this different from that?
> >
> > Well you always need some platform (ACPI) help in order to even go to
> > D3cold. This applies to waking up as well. The PCIe device may advertise
> > that it supports this but I don't think it can be sure that the system
> > it is connected to has this plumbing.
>
> Yes, transitions to/from D3cold require platform support.  But those
> transitions are done by ACPI methods, so any plumbing is ACPI's
> problem, and I don't think this _DSD helps the OS with that.
>
> > For PCIe hotplug ports there have been issues that have prevented doing
> > power management for those. The _DSD is there to provide a hint to the
> > OS saying that yeah, this port actually is expected to work even if it
> > goes into D3 (cold/hot).
>
> That doesn't really help fix bugs in this area or help with future
> refactoring, because it's not specific about what the issues were or
> what the _DSD means.
>
>
> I think the combination of PCIe r5.0 7.5.2.1 (PME_Support from D0, D1,
> D2, D3hot, D3cold) and 6.7.3.4 (hotplug PME events from D1, D2,
> D3hot) covers everything except hotplug PME events from D3cold.
>
> So my guess is that this _DSD is a way to tell the OS that this device
> generates PMEs for hotplug events even when the device is in D3cold.
> Does that make any sense?

Yes it does.

> > > Is this _DSD something that *could* be advertised via PCI config
> > > space, i.e., is it completely determined by the Root Port?  Or is it
> > > something that requires ACPI support, so it cannot be done directly by
> > > the hardware device?
> >
> > You always need help from platform (ACPI) to get into D3cold.
> >
> > > These are all things we need to know in order to make power management
> > > reliable.
>
> > > > > > +static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
> > > > > > +{
> > > > > > +	return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> > > > >
> > > > > This patch added a .bridge_d3() implementation for ACPI but
> > > > > not for MID.  What prevents us from calling
> > > > > platform_pci_bridge_d3() on a MID platform and trying to call
> > > > > through a NULL pointer?
> > > > >
> > > > > Shouldn't we do something like the patch attached below?
> > > >
> > > > IIRC MID devices in general don't have PCIe ports (so we never
> > > > enable PM for them). Is this a real problem that crashes people's
> > > > kernels on MID systems? Then yes I think the patch makes sense
> > > > to have.
> > > >
> > > > I also remember testing some other stuff on one MID system
> > > > (Edison) quite recently and did not see any issues.
> > >
> > > I have not seen reports of crashes, but I do think this is a real
> > > problem.  The problem is that the code as-is relies on assumptions
> > > ("MID does not have PCI and never will") that are implicit and
> > > impossible to verify, which is a maintenance problem.
> >
> > Well, there won't be any new MID devices and the only one we support
> > partially is Edison and it does not have a single PCIe port that
> > could even be power managed.
>
> The problem I'm trying to solve is making this code understandable.
> It doesn't help to assert that "there are no devices that exercise
> this path" because one cannot figure that out by looking at the code.

Fair enough.

> > > > BTW, this patch was merged over a year ago so I'm not sure why you
> > > > comment on it now. Or do you want me to add incremental changes to it?
> > >
> > > I was reviewing [1], which updates pci_find_pcie_root_port(), which is
> > > used by acpi_pci_bridge_d3().  I noticed a potential NULL pointer
> > > dereference, which was a distraction.
> > >
> > > Unless you object, I'll merge something like the patch below to
> > > prevent that distraction for others.
> >
> > No objections :)
> >
> > However, in order to make this more maintainable, what if we check in
> > the platform_pci_bridge_d3() itself whether the pointer is NULL and
> > return false? That would work for any future platforms and avoid adding
> > the dummy implementation each time.
>
> That's a much better idea, thank you!  I'll cc you when I revise the
> patch.

OK, thanks!
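
For the archives, here is a minimal user-space sketch of the NULL check
suggested above. It is only an illustration of the idea, not the revised
patch: the structs are cut-down stand-ins for the kernel's struct pci_dev
and struct pci_platform_pm_ops, and always_true_bridge_d3(), acpi_like_ops,
mid_like_ops and bridge_d3_self_check() are hypothetical names made up for
this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct pci_dev (assumption). */
struct pci_dev {
	int devfn;
};

/* Cut-down stand-in for struct pci_platform_pm_ops: only the one hook. */
struct pci_platform_pm_ops {
	bool (*bridge_d3)(struct pci_dev *dev);
};

/* Ops registered by the platform, or NULL if there are none. */
static const struct pci_platform_pm_ops *pci_platform_pm;

/*
 * Return false when no ops are registered or when the platform did not
 * supply a .bridge_d3() hook, instead of calling through a NULL pointer.
 */
static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
{
	if (pci_platform_pm && pci_platform_pm->bridge_d3)
		return pci_platform_pm->bridge_d3(dev);
	return false;
}

/* Hypothetical hook for a platform that allows D3 on every bridge. */
static bool always_true_bridge_d3(struct pci_dev *dev)
{
	(void)dev;
	return true;
}

/* Hypothetical ops tables: one with the hook, one without it. */
static const struct pci_platform_pm_ops acpi_like_ops = {
	.bridge_d3 = always_true_bridge_d3,
};

static const struct pci_platform_pm_ops mid_like_ops = {
	.bridge_d3 = NULL,
};

/* Exercise the three cases: no ops, ops without hook, ops with hook. */
static void bridge_d3_self_check(void)
{
	struct pci_dev dev = { .devfn = 0 };

	pci_platform_pm = NULL;
	assert(!platform_pci_bridge_d3(&dev));

	pci_platform_pm = &mid_like_ops;
	assert(!platform_pci_bridge_d3(&dev));

	pci_platform_pm = &acpi_like_ops;
	assert(platform_pci_bridge_d3(&dev));
}
```

With the check in the helper itself, a platform that leaves .bridge_d3
unset simply gets "false" and does not need a dummy implementation.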