Re: [PATCH 27/32] PCI: pciehp: Support interrupts sent from D3hot

Bjorn Helgaas <helgaas@xxxxxxxxxx> · Thu, 12 Jul 2018 18:03:17 -0500

On Sat, Jun 16, 2018 at 09:25:00PM +0200, Lukas Wunner wrote:
> If a hotplug port is able to send an interrupt, one would naively assume
> that it is accessible at that moment.  After all, if its parent is in
> D3hot and the link to the hotplug port is thus down, how should an
> interrupt come through?
> 
> It turns out that assumption is wrong at least for Thunderbolt:  Even
> though its parents are in D3hot, a Thunderbolt hotplug port is able to
> signal interrupts.  Because the port's config space is inaccessible and
> resuming the parents may sleep, the hard IRQ handler has to defer
> runtime resuming the parents and reading the Slot Status register to the
> IRQ thread.
> 
> If the hotplug port uses a level-triggered INTx interrupt, it needs to
> be masked until the IRQ thread has cleared the signaled events.  For
> simplicity, this commit also masks edge-triggered MSI/MSI-X interrupts.
> Note that if the interrupt is shared (which can only happen for INTx),
> other devices are starved from receiving interrupts until the IRQ thread
> is scheduled, has runtime resumed the hotplug port's parents and has
> read and cleared the Slot Status register.
> 
> That delay is dominated by the 10 ms D3hot->D0 transition time of each
> parent port.  The worst case is a Thunderbolt downstream port at the
> end of a daisy chain:  There may be up to six Thunderbolt controllers
> in-between it and the root port, each comprising an upstream and
> downstream port, plus its own upstream port.  That's 13 x 10 = 130 ms.
> Possible mitigations are polling the interrupt while it's disabled or
> reducing the d3_delay of Thunderbolt ports if possible.
> 
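
To spell out the arithmetic above, here is a minimal userspace sketch.  It
assumes every port in the chain is resumed one after another and each
contributes the full 10 ms.  The helper names are hypothetical and not part
of the patch.

```c
#include <assert.h>

/* Assumption: every port needs the full D3hot->D0 delay, strictly in series. */
#define D3HOT_TO_D0_DELAY_MS 10

/*
 * Hypothetical helper, not a kernel function: number of ports that must be
 * resumed before a hotplug port at the end of a daisy chain is accessible.
 */
static unsigned int ports_to_resume(unsigned int controllers_in_between)
{
	/*
	 * One upstream and one downstream port per controller in between,
	 * plus the upstream port of the hotplug port's own controller.
	 */
	return 2 * controllers_in_between + 1;
}

/* Worst-case delay until Slot Status can be read, in milliseconds. */
static unsigned int worst_case_resume_delay_ms(unsigned int controllers_in_between)
{
	return ports_to_resume(controllers_in_between) * D3HOT_TO_D0_DELAY_MS;
}

/* Compile-time check of the port count quoted in the commit message. */
static_assert(2 * 6 + 1 == 13, "six controllers in between give 13 ports");
```

With six controllers in between, ports_to_resume(6) gives 13 and
worst_case_resume_delay_ms(6) gives 130, matching the figure above.
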
> Open code masking of the interrupt instead of requesting it with the
> IRQF_ONESHOT flag to minimize the period during which it is masked.
> (IRQF_ONESHOT unmasks the IRQ only after the IRQ thread has finished.)
> 
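
As I read it, the ordering described above can be modeled roughly as follows.
This is only a userspace sketch of the control flow, not the patch itself:
the struct and both functions are hypothetical stand-ins, and "resuming the
parents" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every name here is hypothetical, none is a kernel API. */
struct hp_port_model {
	bool parents_active;       /* false while the parents are in D3hot */
	bool irq_masked;           /* true while the interrupt is masked */
	unsigned int slot_status;  /* pending event bits */
};

/*
 * Models the hard IRQ handler: it cannot sleep, and it cannot read Slot
 * Status while the port is inaccessible, so it masks the interrupt and
 * defers the rest.  Returns true if the IRQ thread should run.
 */
static bool model_hard_irq(struct hp_port_model *p)
{
	assert(p);
	if (!p->parents_active)
		p->irq_masked = true;  /* keeps a level-triggered INTx from refiring */
	return true;                   /* this model always wakes the thread */
}

/*
 * Models the IRQ thread: it may sleep, so it resumes the parents first,
 * then reads and clears the events, then unmasks right away instead of
 * waiting until the thread has finished (which is what IRQF_ONESHOT does).
 */
static unsigned int model_irq_thread(struct hp_port_model *p)
{
	unsigned int events;

	assert(p);
	p->parents_active = true;      /* stands in for runtime resuming each parent */
	events = p->slot_status;       /* read Slot Status */
	p->slot_status = 0;            /* clear the signaled events */
	p->irq_masked = false;         /* unmask before the events are processed */

	return events;                 /* the caller would handle these afterwards */
}
```

The point of the model is the position of the unmask: it happens as soon as
the events are cleared, so the masked period covers only the resume and the
register access, not the event handling that follows.
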
> PCIe r4.0 sec 6.7.3.4 states that "If wake generation is required by the
> associated form factor specification, a hotplug capable Downstream Port
> must support generation of a wakeup event (using the PME mechanism) on
> hotplug events that occur when the system is in a sleep state or the
> Port is in device state D1, D2, or D3Hot."
> 
> This would seem to imply that PME needs to be enabled on the hotplug
> port when it is runtime suspended.  pci_enable_wake() currently doesn't
> enable PME on bridges; it may be necessary to add an exemption for
> hotplug bridges there.  On "Light Ridge" Thunderbolt controllers, the
> PME_Status bit is not set when an interrupt occurs while the hotplug
> port is in D3hot, even if PME is enabled.  (I've tested this on a Mac,
> where we hardcode the OSC_PCI_EXPRESS_PME_CONTROL bit to 0 in
> negotiate_os_control(); setting it to 1 didn't change the behavior.)
> 
> (Confusingly, section 6.7.3.4 also states that "PME and Hot-Plug Event
> interrupts (when both are implemented) always share the same MSI or
> MSI-X vector".  That would seem to apply only to Root Ports; however,
> the section never mentions Root Ports, only Downstream Ports.)

FWIW, based on the "Downstream" entry in the "Terms and Acronyms"
section, I think the "Downstream Port" term would include both Root
Ports and Switch Downstream Ports:

  ... The Ports on a Switch that are not the Upstream Port are
  Downstream Ports. All Ports on a Root Complex are Downstream Ports.
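
Expressed as a check on the Device/Port Type field of the PCI Express
Capabilities register, that reading would look something like the sketch
below.  The enum and function names are made up for illustration (this is
not an existing kernel helper), and only the port types relevant here are
listed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Device/Port Type encodings from the PCI Express Capabilities register.
 * Hypothetical names; only a subset of the defined encodings is shown.
 */
enum pcie_port_type {
	PCIE_TYPE_ENDPOINT   = 0x0,
	PCIE_TYPE_ROOT_PORT  = 0x4,  /* Root Port of a Root Complex */
	PCIE_TYPE_UPSTREAM   = 0x5,  /* Upstream Port of a Switch */
	PCIE_TYPE_DOWNSTREAM = 0x6,  /* Downstream Port of a Switch */
};

/*
 * Hypothetical helper: true for the port types that the "Downstream"
 * definition covers, i.e. Root Ports and Switch Downstream Ports.
 */
static bool is_downstream_port(enum pcie_port_type type)
{
	assert(type <= 0xf);  /* the Device/Port Type field is four bits wide */
	return type == PCIE_TYPE_ROOT_PORT || type == PCIE_TYPE_DOWNSTREAM;
}
```

Under that reading, the sentence about PME and Hot-Plug Event interrupts
sharing a vector covers Root Ports without having to name them.
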