On 3/11/2020 4:27 PM, Kuppuswamy Sathyanarayanan wrote: > > [EXTERNAL EMAIL] > > Hi, > > On 3/11/20 1:33 PM, Bjorn Helgaas wrote: >> On Wed, Mar 11, 2020 at 05:27:35PM +0000, Austin.Bolen@xxxxxxxx wrote: >>> On 3/11/2020 12:12 PM, Bjorn Helgaas wrote: >>>> [EXTERNAL EMAIL] >>>> >>> <SNIP> >>>> I'm probably missing your intent, but that sounds like "the OS can >>>> read/write AER bits whenever it wants, regardless of ownership." >>>> >>>> That doesn't sound practical to me, and I don't think it's really >>>> similar to DPC, where it's pretty clear that the OS can touch DPC bits >>>> it doesn't own but only *during the EDR processing window*. >>> Yes, by treating AER bits like DPC bits I meant I'd define the specific >>> time windows when OS can touch the AER status bits similar to how it's >>> done for DPC in the current ECN. >> Makes sense, thanks. >> >>>>>>> For the normative text describing when OS clears the AER bits >>>>>>> following the informative flow chart, it could say that OS clears >>>>>>> AER as soon as possible after OST returns and before OS processes >>>>>>> _HPX and loading drivers. Open to other suggestions as well. >>>>>> I'm not sure what to do with "as soon as possible" either. That >>>>>> doesn't seem like something firmware and the OS can agree on. >>>>> I can just state that it's done after OST returns but before _HPX or >>>>> driver is loaded. Any time in that range is fine. I can't get super >>>>> specific here because different OSes do different things. Even for >>>>> a given OS they change over time. And I need something generic >>>>> enough to support a wide variety of OS implementations. >>>> Yeah. I don't know how to solve this. >>>> >>>> Linux doesn't actually unload and reload drivers for the child devices >>>> (Sathy, correct me if I'm wrong here) even though DPC containment >>>> takes the link down and effectively unplugs and replugs the device. I >>>> would *like* to handle it like hotplug, but some higher-level software >>>> doesn't deal well with things like storage devices disappearing and >>>> reappearing. >>>> >>>> Since Linux doesn't actually re-enumerate the child devices, it >>>> wouldn't evaluate _HPX again. It would probably be cleaner if it did, >>>> but it's all tied up with the whole unplug/replug problem. >>> DPC resets everything below it and so to get it back up and running it >>> would mean that all buses and resources need to be assigned, _HPX >>> evaluated, and drivers reloaded. If those things don't happen then the >>> whole hierarchy below the port that triggered DPC will be inaccessible. >> Hmm, I think I might be confusing this with another situation. Sathy, >> can you help me understand this? I don't have a way to actually >> exercise this EDR path. Is there some way the pciehp hotplug driver >> gets involved here? If the port has hot-plug enabled then DPC trigger will cause the link to go down (disabled state) and will generate a DLLSC hot-plug interrupt. When DPC is released, the link will become active and generate another DLLSC hot-plug interrupt. >> >> Here's how this seems to work as far as I can tell: >> >> - Linux does not have DPC or AER control >> >> - Linux installs EDR notify handler >> >> - Linux evaluates DPC Enable _DSM >> >> - DPC containment event occurs >> >> - Firmware fields DPC interrupt >> >> - DPC event is not a surprise remove >> >> - Firmware sends EDR notification >> >> - Linux EDR notify handler evaluates Locate _DSM >> >> - Linux reads and logs DPC and AER error information for port in >> containment mode. [If it was an RP PIO error, Linux clears RP PIO >> error status, which is an asymmetry with the non-RP PIO path.] >> >> - Linux clears AER error status (pci_aer_raw_clear_status()) >> >> - Linux calls driver .error_detected() methods for all child devices >> of the port in containment mode (pcie_do_recovery()). These >> devices are inaccessible because the link is down. >> >> - Linux clears DPC Trigger Status (dpc_reset_link() from >> pcie_do_recovery()). >> >> - Linux calls driver .mmio_enabled() methods for all child devices. >> >> This is where I get lost. These child devices are now accessible, but >> they've been reset, so I don't know how their config space got >> restored. Did pciehp enumerate them? Did we do something like >> pci_restore_state()? I don't see where either of these happens. > AFAIK, AER error status registers are sticky (RW1CS) and hence > will be preserved during reset. In our testing, the device directly connected to the port that was contained does get reprogrammed and the driver is reloaded. These are hot-plug slots and so might be due to DLLSC hot-plug interrupt when containment is released and link goes back to active state. However, if a switch is connected to the port where DPC was triggered then we do not see the whole switch hierarchy being re-enumerated. Also, DPC could be enabled on non-hot-plug slots so can't always rely on hot-plug to re-init devices in the recovery path. >> >> So they want to basically do native AER handling even though firmware >> owns AER? My head hurts. > No, Its meant only for clearing AER registers. In EDR path, since > OS owns clearing DPC registers, they want to let OS own clearing AER > registers as well. Also, it would give OS a chance to decide whether > we want to keep the device on based on error status and history of the > device attached. Right. The way it was pitched to me was that the OSVs wanted to read/clear the error status bits so they could re-use the code that does that when OS natively owns AER/DPC. >> >> Bjorn >