On Thu, Apr 29, 2021 at 10:16:03PM +0200, Lukas Wunner wrote:
> On Fri, Apr 30, 2021 at 04:36:48AM +0900, Keith Busch wrote:
> > On Sun, Mar 28, 2021 at 10:52:00AM +0200, Lukas Wunner wrote:
> > > Downstream Port Containment (PCIe Base Spec, sec. 6.2.10) disables the
> > > link upon an error and attempts to re-enable it when instructed by the
> > > DPC driver.
> > >
> > > A slot which is both DPC- and hotplug-capable is currently brought down
> > > by pciehp once DPC is triggered (due to the link change) and brought up
> > > on successful recovery. That's undesirable, the slot should remain up
> > > so that the hotplugged device remains bound to its driver. DPC notifies
> > > the driver of the error and of successful recovery in pcie_do_recovery()
> > > and the driver may then restore the device to working state.
> >
> > This is a bit strange. The PCIe spec says DPC capable ports suppress
> > Link Down events specifically because it will confuse hot-plug
> > surprise ports if you don't do that. I'm specifically looking at the
> > "Implementation Note" in PCIe Base Spec 5.0 section 6.10.2.4.
>
> I suppose you mean 6.2.10.4?

Oops, yes.

> "Similarly, it is recommended that a Port that supports DPC not
> Set the Hot-Plug Surprise bit in the Slot Capabilities register.
> Having this bit Set blocks the reporting of Surprise Down errors,
> preventing DPC from being triggered by this important error,
> greatly reducing the benefit of DPC."
>
> The way I understand this, DPC isn't triggered on Surprise Down if
> the port supports surprise removal.

Hm, that might be correct, but not sure. I thought the intention was
surprise down doesn't trigger on link down if it was because of DPC.

> However what this patch aims to fix is the Link Down seen by pciehp
> which is caused by DPC containing (other) errors.

AER will take links down through the Secondary Bus Reset too, but that's
not a problem there. The pciehp_reset_slot() suppresses the event. Can DPC
use that?

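For reference, here is a rough user-space model of how I understand the
suppression around a reset to work: mask the hotplug notifications, do the
reset, clear whatever status the reset latched, then unmask. This is only a
sketch. The toy_* struct and functions are made up for illustration and are
not the pciehp code; only the bit values are taken from pci_regs.h, and the
real function has more cases than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTCTL_PDCE   0x0008 /* Presence Detect Changed Enable */
#define PCI_EXP_SLTCTL_DLLSCE 0x1000 /* Data Link Layer State Changed Enable */
#define PCI_EXP_SLTSTA_PDC    0x0008 /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC  0x0100 /* Data Link Layer State Changed */

/* Hypothetical stand-in for a port's Slot Control and Slot Status registers */
struct toy_slot {
	uint16_t sltctl;
	uint16_t sltsta; /* RW1C in hardware, modeled as plain bits here */
};

/* Assumption: a reset bounces the link, which latches both status bits */
static void toy_link_bounce(struct toy_slot *s)
{
	s->sltsta |= PCI_EXP_SLTSTA_DLLSC | PCI_EXP_SLTSTA_PDC;
}

/* An event reaches the hotplug handler only if it is enabled and latched */
static bool toy_hotplug_event_pending(const struct toy_slot *s)
{
	bool link = (s->sltctl & PCI_EXP_SLTCTL_DLLSCE) &&
		    (s->sltsta & PCI_EXP_SLTSTA_DLLSC);
	bool presence = (s->sltctl & PCI_EXP_SLTCTL_PDCE) &&
			(s->sltsta & PCI_EXP_SLTSTA_PDC);

	return link || presence;
}

/* Mask, reset, clear latched status, unmask */
static void toy_reset_slot(struct toy_slot *s)
{
	uint16_t ctrl_mask = PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_DLLSCE;
	uint16_t stat_mask = PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC;

	s->sltctl &= (uint16_t)~ctrl_mask; /* mask notifications */
	toy_link_bounce(s);                /* the reset itself */
	s->sltsta &= (uint16_t)~stat_mask; /* drop what the reset latched */
	s->sltctl |= ctrl_mask;            /* re-enable notifications */
}
```

In this model a bare toy_link_bounce() leaves an event pending, while
toy_reset_slot() does not. DPC recovery is asynchronous to the trigger, so
whether the same bracket can be applied there is exactly the open question.
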
> It seems despite the above-quoted recommendation against it, vendors
> do ship ports which support both DPC and surprise removal.
>
> > Do these ports have out-of-band Presence Detect capabilities? If so, we
> > can ignore Link Down events on DPC capable ports as long as PCIe Slot
> > Status PDC isn't set.
>
> Hm, and what about ports with in-band Presence Detect?

In that case it can't be distinguished from an actual device swap.
Suppressing the removal could theoretically bring up a completely
different device as if it were the same one. The NVMe driver replays a
known device's security keys on initialization. Hot-swap attack?
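
To make the out-of-band idea concrete, the check I have in mind is roughly
the following, completely untested. The function name and both boolean
inputs are hypothetical, not existing pciehp or DPC interfaces; only the two
Slot Status bit values come from pci_regs.h. It assumes that with in-band
presence detect PDC follows the link, so it carries no extra information.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTSTA_PDC   0x0008 /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */

/*
 * Hypothetical helper: decide whether a Link Down seen by the hotplug
 * handler can be attributed to DPC and therefore ignored.
 *
 * sltsta:              snapshot of the Slot Status register
 * dpc_triggered:       assumed to be known from the DPC side
 * oob_presence_detect: assumed to be known for this port
 */
static bool toy_ignore_link_down(uint16_t sltsta, bool dpc_triggered,
				 bool oob_presence_detect)
{
	/* No link state change, nothing to ignore */
	if (!(sltsta & PCI_EXP_SLTSTA_DLLSC))
		return false;

	/* A link change without DPC is an ordinary hotplug event */
	if (!dpc_triggered)
		return false;

	/* In-band presence follows the link, so PDC can't rule out a swap */
	if (!oob_presence_detect)
		return false;

	/* Out-of-band presence never changed: same card is still seated */
	return !(sltsta & PCI_EXP_SLTSTA_PDC);
}
```

The in-band branch always falls back to treating the event as a removal,
which keeps the current behavior for the case that worries me.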