On 8/20/2018 5:05 PM, Benjamin Herrenschmidt wrote:
On Mon, 2018-08-20 at 09:53 -0600, Keith Busch wrote:
On Mon, Aug 20, 2018 at 09:22:27PM +1000, Benjamin Herrenschmidt wrote:
The main problem with unplug/replug (as I mentioned earlier) is that it
just does NOT work for storage controllers (or similar type of
devices). The links between the storage controller and the mounted
filesystems are lost permanently; you'll most likely have to reboot the
machine.
You probably shouldn't mount raw storage devices if they can be hot
added/removed. There are device mappers for that! :)
This is not about hot adding/removing, it's about error recovery.
And you can't just change DPC device removal. A DPC event triggers
the link down, and that will trigger pciehp to disconnect the subtree
anyway. Having DPC do it too just means you get the same behavior with
or without enabling SLTCTL.DLLSC.
This is wrong. EEH can trigger a link down too, and we don't remove the
subtree in that case. We allow the drivers to recover.
I have a patch to solve this issue.
https://lkml.org/lkml/2018/8/19/124
Hotplug driver removes the devices on link down events and re-enumerates
on insertion.
I am trying to separate fatal error handling from hotplug.
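
To illustrate the separation, here is a minimal sketch of the decision a
link down handler could make. It is not the code from the patch linked
above, and every name in it (link_down_event, fatal_error_pending,
classify_link_down, the ACTION_* values) is hypothetical rather than a
kernel API. It assumes some flag tells the handler that the link went
down because of a fatal error rather than a physical removal.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of these are kernel APIs. */
enum link_down_action {
    ACTION_REMOVE_DEVICES,  /* hotplug path: tear down the subtree */
    ACTION_RECOVER_DRIVERS  /* error path: keep devices, let drivers recover */
};

struct link_down_event {
    /* Assumed flag: the link went down because of a fatal error. */
    bool fatal_error_pending;
};

/* Decide which path should own a link down event. */
static enum link_down_action
classify_link_down(const struct link_down_event *ev)
{
    if (ev->fatal_error_pending)
        return ACTION_RECOVER_DRIVERS;
    return ACTION_REMOVE_DEVICES;
}
```

With this split, a link down caused by a fatal error goes to driver
recovery, and only a plain link down (no error pending) removes the
devices and waits for re-enumeration on insertion.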
Ben.