Re: [PATCH v2 10/10] PCI / ACPI: Whitelist D3 for more PCIe hotplug ports

On Fri, Apr 10, 2020 at 05:30:58PM -0500, Bjorn Helgaas wrote:
> [breadcrumbs for the archives: this thread is about
> 26ad34d510a8 ("PCI / ACPI: Whitelist D3 for more PCIe hotplug ports"),
> https://git.kernel.org/linus/26ad34d510a8]
> 
> On Thu, Apr 09, 2020 at 09:54:11AM +0300, Mika Westerberg wrote:
> > On Wed, Apr 08, 2020 at 03:12:08PM -0500, Bjorn Helgaas wrote:
> > > On Wed, Apr 08, 2020 at 09:04:09AM +0300, Mika Westerberg wrote:
> > > > On Tue, Apr 07, 2020 at 06:54:23PM -0500, Bjorn Helgaas wrote:
> > > > > On Thu, Sep 13, 2018 at 05:33:22PM +0300, Mika Westerberg wrote:
> > > > > > In order to have better power management for Thunderbolt PCIe chains,
> > > > > > > Windows enables power management for native PCIe hotplug ports if the
> > > > > > > following ACPI _DSD is attached to the root port:
> > > > > > 
> > > > > >   Name (_DSD, Package () {
> > > > > >       ToUUID ("6211e2c0-58a3-4af3-90e1-927a4e0c55a4"),
> > > > > >       Package () {
> > > > > >           Package () {"HotPlugSupportInD3", 1}
> > > > > >       }
> > > > > >   })
> > > > > > 
> > > > > > This is also documented in:
> > > > > > 
> > > > > >   https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
> > > > > 
> > > > > This doc basically says that if the platform supplies this _DSD, the
> > > > > root port is "capable of handling hot plug events while in D3 state".
> > > > > 
> > > > > What does that mean?  That statement is not really actionable.  I
> > > > > *assume* it's telling us about some specific hardware or firmware
> > > > > functionality, like maybe we'll get a notification for hotplug events
> > > > > when the device is in D3?  D3hot?  D3cold?  What is the notification?
> > > > > Is it immediate or when the device comes back to D0?  How do we
> > > > > control and field the notification?
> > > > 
> > > > > I think it simply gives the OS a hint that it can put a PCIe
> > > > > hotplug-capable port into D3 and expect it to wake up when device is detected.
> > > 
> > > I'd really like more specific details than this.  PCI power management
> > > is explicitly controlled by software, so I don't know what it means
> > > for a bridge to "wake up when device is detected."
> > 
> > Note these are for PCIe which is not the same as the parallel PCI.
> 
> Sure, but the PCI power management framework doesn't need to know
> about PCI/PCIe differences.

Well, it needs to deal with certain differences anyway, such as the
required delays.

> > What I mean here is that there is some sort of wake, depending on whether
> > the link is in L1 or L2/L3, which results in the link going to L0 and
> > a PME message being sent to the root complex.
> 
> This is compatible with conventional PCI behavior.  PME_Support in the
> PM Capability tells us whether the device can generate a PME from D0,
> D1, D2, D3hot, or D3cold.  AFAIK, this applies to both conventional
> PCI and PCIe, so I think this _DSD must be telling us something more
> than just "this PCIe device can assert PME from D3hot and D3cold."
> 
> PME_Support tells us whether the device can generate PMEs from D3cold,
> but of course it doesn't say anything about whether hotplug events
> cause those PMEs.
> 
> PCIe r5.0, sec 6.7.3.4, says ports must support PMEs for hotplug
> events while the device is in D1, D2, or D3hot.  It clearly leaves out
> D3cold.
> 
> So maybe this _DSD tells us that this device can generate *hotplug*
> event PMEs from D3cold?

Yes, that sounds reasonable.
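
For readers of the archive, the distinction being drawn above can be sketched in C. This is only an illustration under stated assumptions: the DEMO_* masks mirror the PME_Support field (bits 15:11) of the PM Capabilities register, the helper names are hypothetical, and the rule that D3cold hotplug wake needs both the PME_Support bit and the _DSD hint is the guess made in this thread, not something any spec states.

```c
#include <stdbool.h>
#include <stdint.h>

/* PME_Support occupies bits 15:11 of the PM Capabilities register. */
#define DEMO_PM_CAP_PME_D0	0x0800	/* PME can be generated from D0 */
#define DEMO_PM_CAP_PME_MASK	0xF800	/* all five PME_Support bits */

/* Order matches the bit order of PME_Support, starting at bit 11. */
enum demo_d_state { DEMO_D0, DEMO_D1, DEMO_D2, DEMO_D3HOT, DEMO_D3COLD };

/* True if PME_Support says a PME can be generated from @state. */
static bool demo_pme_supported_from(uint16_t pmc, enum demo_d_state state)
{
	return (pmc & DEMO_PM_CAP_PME_MASK & (DEMO_PM_CAP_PME_D0 << state)) != 0;
}

/*
 * Assumption from the thread: PME_Support only says the port can signal
 * PME from D3cold; the _DSD (HotPlugSupportInD3) is what says hotplug
 * events actually trigger that PME.
 */
static bool demo_hotplug_wake_from_d3cold(uint16_t pmc, bool hotplug_support_in_d3)
{
	return demo_pme_supported_from(pmc, DEMO_D3COLD) && hotplug_support_in_d3;
}
```

With a PMC value of 0xC800 (PME from D0, D3hot and D3cold), demo_pme_supported_from() is true for D3cold, but demo_hotplug_wake_from_d3cold() still depends on the platform hint.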

> > > Normally Linux would get some kind of notification like a PME, then
> > > execute an ACPI method and/or write PCI_PM_CTRL to put the device back
> > > in D0.
> > 
> > Right.
> > 
> > > Are we talking about D3hot or D3cold?
> > 
> > My interpretation is both (D3 implies both D3hot and D3cold), but I
> > did not write that spec.
> > 
> > Systems where this is used typically go down to D3cold with the PCIe
> > topology (links are in L2).
> > 
> > > The PCI PM capability already has a way to advertise that it can
> > > generate PME from D3cold.  How is this different from that?
> > 
> > Well, you always need some platform (ACPI) help in order to even go to
> > D3cold. This applies to waking up as well. The PCIe device may advertise
> > that it supports this but I don't think it can be sure that the system
> > it is connected to has this plumbing.
> 
> Yes, transitions to/from D3cold require platform support.  But those
> transitions are done by ACPI methods, so any plumbing is ACPI's
> problem, and I don't think this _DSD helps the OS with that.
>
> > For PCIe hotplug ports there have been issues that have prevented doing
> > power management for those. The _DSD is there to provide a hint to the
> > OS saying that yeah, this port actually is expected to work even if it
> > goes into D3 (cold/hot).
> 
> That doesn't really help fix bugs in this area or help with future
> refactoring, because it's not specific about what the issues were or
> what the _DSD means.
> 
> I think the combination of PCIe r5.0 7.5.2.1 (PME_Support from D0, D1,
> D2, D3hot, D3cold) and 6.7.3.4 (hotplug PME events from D1, D2,
> D3hot) covers everything except hotplug PME events from D3cold.
> 
> So my guess is that this _DSD is a way to tell the OS that this device
> generates PMEs for hotplug events even when the device is in D3cold.
> Does that make any sense?

Yes it does.

> > > Is this _DSD something that *could* be advertised via PCI config
> > > space, i.e., is it completely determined by the Root Port?  Or is it
> > > something that requires ACPI support, so it cannot be done directly by
> > > the hardware device?
> > 
> > You always need help from the platform (ACPI) to get into D3cold.
> > 
> > > These are all things we need to know in order to make power management
> > > reliable.
> 
> > > > > > +static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
> > > > > > +{
> > > > > > +	return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> > > > > 
> > > > > This patch added a .bridge_d3() implementation for ACPI but
> > > > > not for MID.  What prevents us from calling
> > > > > platform_pci_bridge_d3() on a MID platform and trying to call
> > > > > through a NULL pointer?
> > > > > 
> > > > > Shouldn't we do something like the patch attached below?
> > > > 
> > > > IIRC MID devices in general don't have PCIe ports (so we never
> > > > enable PM for them). Is this a real problem that crashes people's
> > > > kernels on MID systems? Then yes I think the patch makes sense
> > > > to have.
> > > > 
> > > > I also remember testing some other stuff on one MID system
> > > > (Edison) quite recently and did not see any issues.
> > > 
> > > I have not seen reports of crashes, but I do think this is a real
> > > problem.  The problem is that the code as-is relies on assumptions
> > > ("MID does not have PCI and never will") that are implicit and
> > > impossible to verify, which is a maintenance problem.
> > 
> > Well, there won't be any new MID devices and the only one we support
> > partially is Edison and it does not have a single PCIe port that
> > could even be power managed.
> 
> The problem I'm trying to solve is making this code understandable.
> It doesn't help to assert that "there are no devices that exercise
> this path" because one cannot figure that out by looking at the code.

Fair enough.

> > > > BTW, this patch was merged over a year ago, so I'm not sure why you are
> > > > commenting on it now. Or do you want me to add incremental changes to it?
> > > 
> > > I was reviewing [1], which updates pci_find_pcie_root_port(), which is
> > > used by acpi_pci_bridge_d3().  I noticed a potential NULL pointer
> > > dereference, which was a distraction.
> > > 
> > > Unless you object, I'll merge something like the patch below to
> > > prevent that distraction for others.
> > 
> > No objections :)
> > 
> > However, in order to make this more maintainable, what if we check in
> > platform_pci_bridge_d3() itself whether the pointer is NULL and
> > return false? That would work for any future platforms and avoid adding
> > the dummy implementation each time.
> 
> That's a much better idea, thank you!  I'll cc you when I revise the
> patch.

OK, thanks!
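
A minimal sketch of the check agreed on above, for anyone following the thread later. It is not the patch that was eventually posted: the struct definitions are simplified stand-ins for the kernel types, and the demo_* names and the hotplug_in_d3 field are hypothetical. The point is only that platform_pci_bridge_d3() tests both the ops pointer and the .bridge_d3 callback, so a platform that does not implement it (MID in this discussion) yields false instead of a NULL call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures have many more fields. */
struct pci_dev {
	int hotplug_in_d3;	/* hypothetical flag used only by this demo */
};

struct pci_platform_pm_ops {
	bool (*bridge_d3)(struct pci_dev *dev);
};

static const struct pci_platform_pm_ops *pci_platform_pm;

/* Return false when there are no platform ops or no .bridge_d3 callback. */
static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
{
	if (pci_platform_pm && pci_platform_pm->bridge_d3)
		return pci_platform_pm->bridge_d3(dev);
	return false;
}

/* Demo callback standing in for an ACPI-style implementation. */
static bool demo_acpi_bridge_d3(struct pci_dev *dev)
{
	return dev->hotplug_in_d3 != 0;
}

/* One platform that implements the callback, one that does not. */
static const struct pci_platform_pm_ops demo_acpi_ops = {
	.bridge_d3 = demo_acpi_bridge_d3,
};
static const struct pci_platform_pm_ops demo_mid_ops = {
	.bridge_d3 = NULL,
};
```

Doing the check in the wrapper means a new platform does not have to supply a dummy .bridge_d3() just to stay safe.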


