Thanks Bjorn,

With the fixes below, managed add/remove via sysfs seems to work on my
SKX system. I'm not familiar with the runtime PM aspects; I only started
looking into them after this one. There seem to be some interactions
with ASPM and with how we handle devices that support ARI, for example.

For hotplug, we have a good set of tests to check coverage. That matrix
might need to be expanded to cover runtime PM interactions.

Cheers,
Ashok

On Thu, Feb 09, 2017 at 09:09:50AM -0600, Bjorn Helgaas wrote:
> [+cc Ashok, Keith]
>
> >
> > https://patchwork.kernel.org/patch/9557113/
> > https://patchwork.kernel.org/patch/9562007/
>
> I don't think we've gotten to the root cause of the problem yet,
> and I don't want to throw in fixes at the last minute without a better
> understanding of it.
>
> PCIe hotplug hardware is not very complicated, it hasn't changed in
> many years, and at least for the Intel hardware in question, is
> generally pretty well-tested with Windows. So I want to be careful
> about asserting that this new piece of hardware is broken.
>
> I think pciehp is unnecessarily complicated, and we do have known
> synchronization issues with it, e.g., [1] [2]. It seems possible that
> if we poked a little deeper, we would find that the hardware is
> actually working correctly and the real problem is in pciehp.
>
> That's why I've been trying to have a conversation about how we
> interpret the spec and how we could remove PM and pciehp from the
> picture and experiment directly with setpci.
>
> [1] https://lkml.kernel.org/r/1481317564-18045-1-git-send-email-ashok.raj@xxxxxxxxx
> [2] https://bugzilla.kernel.org/show_bug.cgi?id=117561