Re: Should a PCIe Link Down event set the PCI_DEV_DISCONNECTED bit?

<Alex_Gagniuc@xxxxxxxxxxxx> · Tue, 31 Jul 2018 16:35:51 +0000

On 07/31/2018 04:29 AM, Lukas Wunner wrote:
> On Mon, Jul 30, 2018 at 09:38:04PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote:
>> On 07/28/2018 01:31 PM, Lukas Wunner wrote:
>>> On Fri, Jul 27, 2018 at 05:51:04PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote:
>>>> I think PCI_DEV_DISCONNECTED is a documentation issue above all else.
>>>> The history I was given is that drivers would take a very long time to
>>>> tear down a device. Config space IO to a nonexistent device took a long
>>>> while to time out. Performance was one motivation -- and was not
>>>> documented.
>>>
>>> Often it is possible for the driver to detect surprise removal by
>>> checking if mmio reads return "all ones".  But in some cases that's
>>> a valid value to read from mmio and then this approach won't work.
>>> Also, checking every mmio read may negatively impact performance.
>>
>> A colleague and I beat that dead horse to the afterdeath. Consensus was
>> that the return value is less reliable than a coin toss (of a two-heads
>> coin).
> 
> Can you elaborate why?  Because the "official" stance is that checking
> every read where "all ones" is an invalid value is the proper way to
> detect unplugged devices.  (Official as in, voiced by Greg KH and Bjorn.)
> In that sense, PCI_DEV_DISCONNECTED is sort of an unloved child.

All ones is not necessarily invalid. The bug surface is every single 
config read. This approach doesn't even cover config writes -- config 
writes are non-posted requests too in PCIe.

"Build it, and they will come". That means that drivers would expect 
-ENODEV when a device is gone. If we have that infrastructure, more 
drivers will start using it over time, and it's something that can also 
be used by generic parts of the PCI code. That also means you need a 
generic mechanism to determine a device is bye-bye, and that's what 
PCI_DEV_DISCONNECTED gives you.
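
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (struct mock_pci_dev, mock_hw_read(), and
the two mock_read_*() helpers) is hypothetical and made up for
illustration. The 'disconnected' field only plays the role of
PCI_DEV_DISCONNECTED, and the mock "hardware" returns all ones instantly
instead of going through a completion timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct mock_pci_dev {
	bool disconnected;   /* plays the role of PCI_DEV_DISCONNECTED */
	bool present;        /* whether the mock hardware still answers */
	uint32_t reg;        /* one register's worth of config space */
	unsigned hw_reads;   /* counts accesses that reached "hardware" */
};

/* Raw access: a missing device reads back as all ones. */
static uint32_t mock_hw_read(struct mock_pci_dev *dev)
{
	dev->hw_reads++;
	return dev->present ? dev->reg : UINT32_MAX;
}

/* Heuristic: infer removal from the returned value alone. */
static int mock_read_all_ones_check(struct mock_pci_dev *dev, uint32_t *val)
{
	assert(dev && val);
	*val = mock_hw_read(dev);
	return *val == UINT32_MAX ? -ENODEV : 0;
}

/* Flag-based: fail fast without issuing the access at all. */
static int mock_read_flag_check(struct mock_pci_dev *dev, uint32_t *val)
{
	assert(dev && val);
	if (dev->disconnected) {
		*val = UINT32_MAX;
		return -ENODEV;
	}
	*val = mock_hw_read(dev);
	return 0;
}
```

With a present device whose register legitimately holds 0xffffffff,
mock_read_all_ones_check() reports -ENODEV (a false positive), while
mock_read_flag_check() returns 0. Once the flag is set, the flag-based
helper returns -ENODEV without incrementing hw_reads, which is the
"don't wait on dead hardware" part of the argument.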

> See this thread:
> https://www.spinics.net/lists/linux-acpi/msg81445.html

The discussion is based on the wrong assumption that reads are 
symmetrical wrt a device being present or not. However, completion 
timeouts and unsupported requests blow that assumption right out of the 
water. Best case scenario, you just waste a little more time waiting for 
hardware IO. More common is to end up crashing the machine.

Greg's ideas work in a perfect world where PCI and PCIe are equivalent 
at every level. And in that case, PCI_DEV_DISCONNECTED would have been 
pure, 100% genuine Redmond bloatware. We wouldn't have gotten complaints 
from Facebook and other industry players that it takes too damn long to 
remove a device. We probably also wouldn't be seeing machines crash on 
PCIe removal.

Fun fact: Before PCI_DEV_DISCONNECTED, you could physically swap a 
device before the teardown path was done with the previous device. 
Figuring out what problems that caused is left as an exercise to the reader.

>>> FWIW, the below is what I had in mind (on top of Bjorn's pci/hotplug
>>> branch).  Does this work for you?
>>
>> This, and another patch (you have been CC'd) solve my problem of
>> crashing during surprise removal. Thanks!
> 
> Ok thanks, I submitted the patch this morning with your Tested-by.

Sweet. Thanks!

> Unfortunately I forgot to cc all your Dell colleagues, sorry.

They'll live. They already noticed it and sent me emails about it.

Alex

> Lukas
>