On Tue, Feb 05, 2019 at 03:06:56PM -0600, Alexandru Gagniuc wrote: > According to PCIe 3.0, the presence detect state is a logical OR of > in-band and out-of-band presence. With this, we'd expect the presence > state to always be asserted when the link comes up. Do you have a PCIe 4.0 spec? I think 5.0 is about to come out, so it'd be nice to have at least a 4.0 citation (including section number), if you have one. > Not all hardware follows this, and it is possible for the presence to > come up after the link. In this case, the PCIe device would be > erroneously disabled and re-probed. It is possible to distinguish > between a delayed presence and a card swap by looking at the DLL state > changed bit -- The link has to come down if the card is removed. > > Thus, for a device that is probed, present and has its link active, a > lack of a link state change event guarantees we have the same device, > and shutdown of is not needed. s/of is/is/, I guess? I'm hoping Lukas will chime in here; thanks for cc'ing him already. > Signed-off-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx> > --- > > Following some discussion in > "PCI: hotplug: Erroneous removal of hotplug PCI devices" [1] > It became apparent that we can't fully rely presence detect changed > as an indicator to shutdown a device. And in PCIe 4.0 the "logical OR" > requirement is going away, so we can use a way to slightly decouple > presence detect and link active. > > I think the approach here is the simplest, and least likely to > interfere with other mis-behaving hardware (in the PCIe 3.0 sense). > > [1] https://www.spinics.net/lists/linux-pci/msg79564.html Would you mind updating this reference to the https://lore.kernel.org/linux-pci form that doesn't depend on external organizations? https://www.kernel.org/lore.html has some details, but unfortunately lacks a hint about how to construct the URL. What I do is look up the Message-ID and use, e.g., https://lore.kernel.org/linux-pci/<Message-ID> > drivers/pci/hotplug/pciehp_ctrl.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c > index 3f3df4c29f6e..ea0dd3e956be 100644 > --- a/drivers/pci/hotplug/pciehp_ctrl.c > +++ b/drivers/pci/hotplug/pciehp_ctrl.c > @@ -220,13 +220,23 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events) > /* > * If the slot is on and presence or link has changed, turn it off. > * Even if it's occupied again, we cannot assume the card is the same. > + * When the card is swapped, we also expect a change in link state, > + * without which, it's likely presence became high after link-active. > */ > mutex_lock(&ctrl->state_lock); > + present = pciehp_card_present(ctrl); > + link_active = pciehp_check_link_active(ctrl); > switch (ctrl->state) { > case BLINKINGOFF_STATE: > cancel_delayed_work(&ctrl->button_work); > /* fall through */ > case ON_STATE: > + if (present && link_active && > + !(events & PCI_EXP_SLTSTA_DLLSC)) { > + mutex_unlock(&ctrl->state_lock); > + ctrl_warn(ctrl, "Hardware bug: Presence state came up after link"); I'm not 100% sure this is worth KERN_WARN unless the user might be able to do something about it. > + return; > + } > ctrl->state = POWEROFF_STATE; > mutex_unlock(&ctrl->state_lock); > if (events & PCI_EXP_SLTSTA_DLLSC) > -- > 2.19.2 >