Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed

On Tue, 2018-08-21 at 16:04 -0600, Keith Busch wrote:
> > I think there needs to be some coordination between pciehp and DPC on
> > link state change, yes.
> > 
> > We could still remove the device if recovery fails. For example on EEH
> > we have a threshold and if a device fails more than N times within the
> > last M minutes (I don't remember the exact numbers and I'm not in front
> > of the code right now) we give up.
> > 
> > Also under some circumstances, the link will change as a result of the
> > error handling doing a hot reset.
> > 
> > For example, if the error happens in a parent bridge and that gets
> > reset, the entire hierarchy underneath does too.
> > 
> > We need to save/restore the state of all bridges and devices (BARs
> > etc...) below that.
> 
> That's not good enough. You'd at least need to check SLTSTS.PDC on all
> bridge DSP's beneath the parent *before* you try to restore the state
> of devices beneath them. Those races are primarily why DPC currently
> removes and reenumerates instead, because that's not something that can
> be readily coordinated today.

It can probably be done with a simple test & skip as you go down
restoring state, then handling the removals after the dance is
complete.
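
To make the idea concrete, here is a rough userspace model of that walk.
It is only a sketch and not kernel code: struct model_dev,
restore_subtree() and handle_removals() are made-up names, and
MODEL_SLTSTA_PDC simply stands in for the Presence Detect Changed bit
of the Slot Status register. "Restoring" a device is reduced to setting
a flag, and "removal" to dropping the children of the port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the Presence Detect Changed bit of Slot Status. */
#define MODEL_SLTSTA_PDC 0x0008u
#define MODEL_MAX_CHILDREN 4

/* Hypothetical model of a bridge or endpoint, not a kernel structure. */
struct model_dev {
	const char *name;
	bool is_downstream_port;	/* true for a bridge DSP with a slot */
	uint16_t slot_status;		/* modelled Slot Status register */
	bool restored;			/* set once saved state is written back */
	struct model_dev *children[MODEL_MAX_CHILDREN];
	size_t nr_children;
	struct model_dev *next_removal;	/* link in the deferred removal list */
};

/*
 * First pass: walk down from the reset point restoring state. At each
 * downstream port, test PDC before touching anything beneath it. If it
 * is set, skip that subtree and queue the port for later handling.
 */
static void restore_subtree(struct model_dev *dev,
			    struct model_dev **removal_list)
{
	assert(dev != NULL && removal_list != NULL);

	dev->restored = true;	/* stands in for restoring BARs etc. */

	if (dev->is_downstream_port &&
	    (dev->slot_status & MODEL_SLTSTA_PDC)) {
		dev->next_removal = *removal_list;
		*removal_list = dev;
		return;		/* do not descend below a changed slot */
	}

	for (size_t i = 0; i < dev->nr_children; i++)
		restore_subtree(dev->children[i], removal_list);
}

/*
 * Second pass, run only after the restore walk has finished: deal with
 * the skipped ports. Returns the number of ports handled.
 */
static size_t handle_removals(struct model_dev *removal_list)
{
	size_t handled = 0;

	for (struct model_dev *p = removal_list; p; p = p->next_removal) {
		p->nr_children = 0;	/* stands in for removing the devices below */
		handled++;
	}
	return handled;
}
```

The point of splitting it into two passes is that nothing below a port
whose presence changed gets its old state written back, and the removal
work does not interleave with the restore walk.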

Cheers,
Ben.
