Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed

On Tue, Aug 21, 2018 at 04:04:56PM -0600, Keith Busch wrote:
> > For example, if the error happens in a parent bridge and that gets
> > reset, the entire hierarchy underneath does too.
> > 
> > We need to save/restore the state of all bridges and devices (BARs
> > etc...) below that.
> 
> That's not good enough. You'd at least need to check SLTSTS.PDC on all
> bridge DSP's beneath the parent *before* you try to restore the state
> of devices beneath them. Those races are primarily why DPC currently
> removes and reenumerates instead, because that's not something that can
> be readily coordinated today.

A picture of a real scenario to go with the above comments:

                    ----------------
  --------          |              |
  |      |       -------        -------       --------------
  |  RP  | <---> | USP | SWITCH | DSP | <---> | END DEVICE |
  |      |       -------        -------       --------------
  --------          |              |
                    ----------------

The downstream port (DSP) is hotplug capable, and the root port (RP)
has DPC enabled.

Let's say you hot swap END DEVICE while a posted transaction is in
flight, such that the RP triggers a CTO, starting a DPC event.
If you treat the RP's DPC event as just "error handling" and restore
the state after containment is released, we are in undefined behavior
because of the mistaken identity of the device below the SWITCH DSP.

Re-enumerating the topology is easy since it has no races like that. I
understand re-enumeration is problematic for some scenarios, so if we
really need to avoid re-enumeration except when absolutely necessary,
it should be possible if we can detangle the necessary coordination
between the port service drivers.
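
To make the ordering concrete, here is a rough user-space sketch of the
check described above: walk every port beneath the contained RP and look
at Presence Detect Changed in Slot Status before deciding whether saved
state may be restored. This is not kernel code. struct model_port,
choose_recovery() and the enum are hypothetical names made up for
illustration, and the slot status is a plain snapshot, so it does not
solve the races mentioned above. Only the PDC bit value (0x0008) comes
from the PCIe Slot Status register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Presence Detect Changed bit in the PCIe Slot Status register. */
#define SLTSTA_PDC 0x0008

/* Hypothetical, simplified model of a port below the contained RP. */
struct model_port {
	bool hotplug_capable;         /* slot implements hot-plug */
	uint16_t slot_status;         /* snapshot taken after containment release */
	struct model_port *children;  /* ports further down the hierarchy */
	size_t nr_children;
};

enum recovery_action { RESTORE_STATE, REENUMERATE };

/*
 * Return REENUMERATE if any hotplug-capable port in the subtree reports
 * a presence change; the device below it may no longer be the one whose
 * state was saved. Otherwise return RESTORE_STATE.
 */
static enum recovery_action choose_recovery(const struct model_port *port)
{
	assert(port != NULL);

	if (port->hotplug_capable && (port->slot_status & SLTSTA_PDC))
		return REENUMERATE;

	for (size_t i = 0; i < port->nr_children; i++) {
		if (choose_recovery(&port->children[i]) == REENUMERATE)
			return REENUMERATE;
	}

	return RESTORE_STATE;
}
```

In the picture above, the SWITCH DSP would have PDC set after the hot
swap, so the walk starting at the USP returns REENUMERATE and the stale
END DEVICE state is never written back.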


