On 07/27/2018 12:05 PM, Lukas Wunner wrote: > On Fri, Jul 27, 2018 at 03:52:04PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote: >> On 07/27/2018 02:18 AM, Lukas Wunner wrote: >>> On Thu, Jul 26, 2018 at 05:38:50PM -0500, Alex G. wrote: >>>> I was under the impression that a DLLSC or PDSC would trigger the >>>> PCI_DEV_DISCONNECTED bit to be set, blocking any further config access. >>> >>> Only if the Presence Detect State bit in the Slot Status register is >>> not set. >>> >>> I think the idea was that if the card is still in the slot, its driver can >>> be unbound orderly because the device is still accessible. So there's no >>> need to set PCI_DEV_DISCONNECTED. >> >> This sounds counter to what I had intuited. I mentally associate >> DISCONNECTED with "the link is down". I don't understand the idea of a >> driver doing an orderly removal when the link is down. Device registers >> wouldn't be accessible in that case. > > That's basically what I meant, sorry for not being clear. > >>> However if there is a already a new card in the slot, PCI_DEV_DISCONNECTED >>> will erroneously not be set. Also, if card removal was triggered by the >>> link going down but the card is still in the slot, PCI_DEV_DISCONNECTED >>> will also erroneously not be set even though it's inaccessible. >>> >>> The only situations when PCI_DEV_DISCONNECTED should not be set is if the >>> card is being removed by sysfs or by the user pressing the Attention Button. >>> Anything else is a surprise removal. What we need to do is pass down a >>> flag to pciehp_unconfigure_device() to indicate whether one of those two >>> situations is at hand or not, and PCI_DEV_DISCONNECTED should be set >>> depending on that flag. >> >> That makes a little more sense. Is someone working on this? > > Me, but not for 4.19, we're too late in the cycle, I'm going to post > a small number of fixup patches for Bjorn's pci/hotplug branch shortly, > to be included in 4.19, and that's it. > > I posted a patch as part of the series that's now on Bjorn's pci/hotplug > branch which touched PCI_DEV_DISCONNECTED code in pciehp_pci.c, but had > to withdraw that particular patch: > https://patchwork.ozlabs.org/patch/930403/ > > The first problem with that patch was that I hadn't fully understood > yet when to set PCI_DEV_DISCONNECTED and when not to set it. I think PCI_DEV_DISCONNECTED is a documentation issue above all else. The history I was given is that drivers would take a very long time to tear down a device. Config space IO to an nonexistent device took a long while to time out. Performance was one motivation -- and was not documented. > The second problem was that it fixed a deadlock on unplug in a way > that wasn't generic enough. The same deadlock can occur in other > situations. The real fix is to unbind the driver lockless in > pci_stop_dev(). That's non-trivial to achieve, but doable. > I don't think there's a way around that to fix the problem: > https://www.spinics.net/lists/linux-pci/msg73652.html > > There's (still) a lot that can be improved in pciehp, I hope to > keep working on that as time permits. Thanks for all the info. The fix that I was settling on is (pasted) below. Though that seems to conflict a bit with what you are trying to do. Now I'm a little conflicted If I should try to submit the below or not. Alex diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c index 3f518dea856d..5cf02639fa0b 100644 --- a/drivers/pci/hotplug/pciehp_pci.c +++ b/drivers/pci/hotplug/pciehp_pci.c @@ -74,6 +74,7 @@ int pciehp_unconfigure_device(struct slot *p_slot) ctrl_dbg(ctrl, "%s: domain:bus:dev = %04x:%02x:00\n", __func__, pci_domain_nr(parent), parent->number); pciehp_get_adapter_status(p_slot, &presence); + presence = presence && pciehp_check_link_active(ctrl); pci_lock_rescan_remove(); > Thanks, > > Lukas >