On 8/31/2021 2:20 PM, Bjorn Helgaas wrote:
> On Wed, May 26, 2021 at 08:12:04PM -0500, stuart hayes wrote:
>> ...
>> I made the patch because it was causing the config space for a downstream
>> port to not get restored when a DPC event occurred, and all the NVMe drives
>> under it disappeared. I found that myself, though--I'm not aware of anyone
>> else reporting the issue.
>
> This niggles at me. IIUC the problem you're reporting is that portdrv
> didn't claim a port because portdrv incorrectly assumed the port
> supported bandwidth notification interrupts. That's all fine, and I
> think this is a good fix.
>
> But why should it matter whether portdrv claims the port? What if
> CONFIG_PCIEPORTBUS isn't even enabled? I guess CONFIG_PCIE_DPC
> wouldn't be enabled then either.
>
> In your situation, you have CONFIG_PCIEPORTBUS=y and (I assume)
> CONFIG_PCIE_DPC=y. I guess you must have two levels of downstream
> ports, e.g.,
>
>   Root Port -> Switch Upstream Port -> Switch Downstream Port -> NVMe
>
> and portdrv claimed the Root Port and you enabled DPC there, but it
> didn't claim the Switch Downstream Port?
That's correct. On the system I was using, there was another layer of
upstream/downstream ports, but I don't think that matters... I had:
Root Port -> Switch Upstream Port (portdrv claimed) -> Switch Downstream
Port (portdrv did NOT claim) -> Switch Upstream Port (portdrv claimed)
-> Switch Downstream Port (portdrv claimed) -> NVMe
> The failure to restore config space because portdrv didn't claim the
> port seems wrong to me.
When a DPC event is triggered on the root port, the downstream devices
get reset, and portdrv is what restores the switch downstream port's
config space (in pcie_portdrv_slot_reset).
So if portdrv doesn't claim the downstream port, the config space
doesn't get restored at all, so it won't forward anything to subordinate
buses, and everything below the port disappears once the DPC event happens.
I'm not really sure how else it would recover from a DPC event, I guess.
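
To make the failure mode above concrete, here is a toy user-space model
(not kernel code). It assumes, as a simplification, that a reset clears
each port's forwarding configuration and that only ports bound to portdrv
get it restored, standing in for what pcie_portdrv_slot_reset does. The
names Port, dpc_reset_and_recover, nvme_visible and build_chain are
hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Port:
    name: str
    claimed: bool            # True if portdrv is bound to this port
    forwarding: bool = True  # stand-in for bridge config (bus numbers, windows)


def dpc_reset_and_recover(chain: List[Port]) -> None:
    """Model a DPC event on chain[0]: every port below it is reset, then
    only ports claimed by portdrv get their configuration restored."""
    for port in chain[1:]:
        port.forwarding = False      # the reset wipes the config
    for port in chain[1:]:
        if port.claimed:
            port.forwarding = True   # toy version of the slot_reset restore


def nvme_visible(chain: List[Port]) -> bool:
    """The NVMe drive is reachable only if every port above it forwards."""
    return all(port.forwarding for port in chain)


def build_chain(middle_dsp_claimed: bool) -> List[Port]:
    # Topology from this thread; the Root Port is assumed to be claimed.
    return [
        Port("Root Port", True),
        Port("Switch Upstream Port", True),
        Port("Switch Downstream Port", middle_dsp_claimed),
        Port("Switch Upstream Port", True),
        Port("Switch Downstream Port", True),
    ]


broken = build_chain(middle_dsp_claimed=False)
dpc_reset_and_recover(broken)
print(nvme_visible(broken))   # False: the unclaimed port never forwards again

fixed = build_chain(middle_dsp_claimed=True)
dpc_reset_and_recover(fixed)
print(nvme_visible(fixed))    # True
```

In this model a single unclaimed port in the middle of the chain is enough
to hide everything below it, which matches what I saw with the NVMe drives.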
> Bjorn