Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed

On 2018-08-21 20:07, Keith Busch wrote:
On Tue, Aug 21, 2018 at 04:06:30PM +1000, Benjamin Herrenschmidt wrote:
On Tue, 2018-08-21 at 10:44 +0530, poza@xxxxxxxxxxxxxx wrote:
>
> OK, let me summarize what has been discussed so far.
>
> It would be nice if we all (Bjorn, Keith, Ben, Sinan) could reach consensus
> on this.
>
> 1) Right now AER and DPC both call pcie_do_fatal_recovery(). I mainly
> see DPC as an error handling and recovery agent rather than as something
> used for hotplug.
>     So in my opinion, both AER and DPC should have the same error handling
> and recovery mechanism.

Yes.

>     So if there is a way to figure out that, in the absence of pciehp, DPC
> is being used to support hotplug, then we fall back to the original DPC
> mechanism (which is to remove devices).

Not exactly. Rather, only if the presence detect change indicates it was
a hotplug event.

The actions associated with error recovery will trigger link state changes
for a lot of existing hardware. PCIEHP currently does the same removal
sequence for both link state change (DLLSC) and presence detect change
(PDC) events.

It sounds like you want pciehp to do nothing on the DLLSC events that it currently handles, and instead do the board removal only on PDC. If that is the case, is the desire to not remove devices downstream of a permanently disabled link, or does that responsibility fall to some other component?

Keith

Are you in agreement with the following?

"
Right now AER and DPC both call pcie_do_fatal_recovery(). I mainly see DPC as an error handling and recovery agent rather than as something used for hotplug. So in my opinion, both AER and DPC should have the same error handling and recovery mechanism.

So if there is a way to figure out that, in the absence of pciehp, DPC is being used to support hotplug, then we fall back to the original DPC mechanism (which is to remove devices).
   Otherwise, we fall back to the drivers' callbacks.

   Spec 6.7.5 Async Removal
   "
The Surprise Down error resulting from async removal may trigger Downstream Port Containment (See Section 6.2.10). Downstream Port Containment following an async removal may be utilized to hold the Link of a Downstream Port in the Disabled LTSSM state while host software recovers from the side effects of an async removal.
   "

I think the above is implementation specific, but there has to be some way to know whether we are using DPC for hotplug or not!
   Otherwise it is assumed to be used for error handling and recovery.

pcie_do_fatal_recovery() should take care of the above, so that we support both error handling and async removal from the DPC point of view.
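
To make the proposed decision concrete, here is a rough user-space sketch of it. This is not kernel code and not a patch: struct port_event, enum recovery_action and choose_recovery() are hypothetical names made up for illustration, and treating a presence detect change as the sign that DPC is being used for hotplug is an assumption taken from the PDC suggestion earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a Downstream Port when fatal recovery starts. */
struct port_event {
    bool presence_detect_changed; /* PDC seen: assumed to mean hotplug */
    bool pciehp_bound;            /* a hotplug driver (pciehp) owns the port */
};

enum recovery_action {
    RECOVERY_REMOVE_DEVICES,   /* original DPC behaviour */
    RECOVERY_DRIVER_CALLBACKS, /* same path as AER error handling */
};

/* Pick the recovery path for one event, following the proposal above. */
enum recovery_action choose_recovery(const struct port_event *ev)
{
    assert(ev != NULL);

    /* DPC is standing in for hotplug: no pciehp, and a PDC was seen. */
    if (!ev->pciehp_bound && ev->presence_detect_changed)
        return RECOVERY_REMOVE_DEVICES;

    /* Everything else is treated as error handling and recovery. */
    return RECOVERY_DRIVER_CALLBACKS;
}
```

How the real code would learn either of the two inputs is exactly the open question here, so the sketch takes them as given.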
"







