On 8/29/2018 8:01 PM, Keith Busch wrote:
On Wed, Aug 22, 2018 at 09:06:57AM +1000, Benjamin Herrenschmidt wrote:
It can probably be done by a simple test & skip as you go down
restoring state, then handling the removals after the dance is
complete.
I tested on a variety of hardware, and there are mixed results. The spec
captures the crux of the problem with checking PDC (7.5.3.11):
Note that the in-band presence detect mechanism requires that power be
applied to an adapter for its presence to be detected. Consequently,
form factors that require a power controller for hot-plug must implement
a physical pin presence detect mechanism.
Many slots don't implement power controllers, so a secondary bus reset
always triggers a PDC. We can't really ignore PDC during fatal error
handling since hot plugs are the types of actions that often trigger
fatal errors.
Does it sound okay to trust PDC anyway? It's no worse than what would
happen currently, and it doesn't affect non-hotplug slots.
Why do hotplug operations cause a fatal error? The DPC driver only monitors
fatal errors today.
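
For reference, the "test & skip" idea quoted at the top of this mail could
look roughly like the sketch below. It is only a user-space illustration
under stated assumptions: struct sim_dev, restore_present(),
handle_removals() and recover_bus() are hypothetical names, not kernel
APIs, and the "present" flag stands in for whatever presence check (PDC or
otherwise) ends up being trusted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device below the reset port; not a kernel type. */
struct sim_dev {
	bool present;            /* assumed result of the presence test after reset */
	bool state_restored;     /* set once saved state has been written back */
	bool removed;            /* set once the removal has been handled */
	struct sim_dev *child;   /* first device on the subordinate bus, or NULL */
	struct sim_dev *sibling; /* next device on the same bus, or NULL */
};

/*
 * Pass 1: walk down the hierarchy restoring state. A device that fails the
 * presence test is recorded in gone[] and skipped, together with everything
 * below it. Returns the updated number of recorded devices; entries beyond
 * max are dropped in this sketch.
 */
static size_t restore_present(struct sim_dev *dev, struct sim_dev **gone,
			      size_t n, size_t max)
{
	for (; dev; dev = dev->sibling) {
		if (!dev->present) {
			if (n < max)
				gone[n++] = dev;
			continue; /* skip: do not restore, do not descend */
		}
		dev->state_restored = true;
		n = restore_present(dev->child, gone, n, max);
	}
	return n;
}

/* Pass 2: handle the removals only after the restore walk is complete. */
static void handle_removals(struct sim_dev **gone, size_t n)
{
	for (size_t i = 0; i < n; i++)
		gone[i]->removed = true;
}

/* Run both passes; returns how many missing devices were recorded. */
static size_t recover_bus(struct sim_dev *root, struct sim_dev **gone,
			  size_t max)
{
	size_t n;

	assert(gone != NULL);
	n = restore_present(root, gone, 0, max);
	handle_removals(gone, n);
	return n;
}
```

Deferring the removals to a second pass keeps the restore walk from
modifying the hierarchy while it is still being traversed.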