On Tue, Jan 23, 2024 at 09:59:21AM -0600, Bjorn Helgaas wrote: > On Mon, Jan 22, 2024 at 06:37:48PM -0800, Kuppuswamy Sathyanarayanan wrote: > > On 1/22/24 11:32 AM, Bjorn Helgaas wrote: > > > On Mon, Jan 08, 2024 at 05:15:08PM -0700, Matthew W Carlis wrote: > > >> A small part is probably historical; we've been using DPC on PCIe > > >> switches since before there was any EDR support in the kernel. It > > >> looks like there was a PCIe DPC ECN as early as Feb 2012, but this > > >> EDR/DPC fw ECN didn't come in till Jan 2019 & kernel support for ECN > > >> was even later. Its not immediately clear I would want to use EDR in > > >> my newer architecures & then there are also the older architecures > > >> still requiring support. When I submitted this patch I came at it > > >> with the approach of trying to keep the old behavior & still support > > >> the newer EDR behavior. Bjorns patch from Dec 28 2023 would seem to > > >> change the behavior for both root ports & switch ports, requiring > > >> them to set _OSC Control Field bit 7 (DPC) and _OSC Support Field > > >> bit 7 (EDR) or a kernel command line value. I think no matter what, > > >> we want to ensure that PCIe Root Ports and PCIe switches arrive at > > >> the same policy here. > > > Is there an approved DPC ECN to the PCI Firmware spec that adds DPC > > > control negotiation, but does *not* add the EDR requirement? > > > > > > I'm looking at > > > https://members.pcisig.com/wg/PCI-SIG/document/previewpdf/12888, which > > > seems to be the final "Downstream Port Containment Related > > > Enhancements" ECN, which is dated 1/28/2019 and applies to the PCI > > > Firmware spec r3.2. > > > > > > It adds bit 7, "PCI Express Downstream Port Containment Configuration > > > control", to the passed-in _OSC Control field, which indicates that > > > the OS supports both "native OS control and firmware ownership models > > > (i.e. Error Disconnect Recover notification) of Downstream Port > > > Containment." > > > > > > It also adds the dependency that "If the OS sets bit 7 of the Control > > > field, it must set bit 7 of the Support field, indicating support for > > > the Error Disconnect Recover event." > > > > > > So I'm trying to figure out if the "support DPC but not EDR" situation > > > was ever a valid place to be. Maybe it's a mistake to have separate > > > CONFIG_PCIE_DPC and CONFIG_PCIE_EDR options. > > > > My understanding is also similar. I have raised the same point in > > https://lore.kernel.org/all/3c02a6d6-917e-486c-ad41-bdf176639ff2@xxxxxxxxxxxxxxx/ > > Ah, sorry, I missed that. > > > IMO, we don't need a separate config for EDR. I don't think user can > > gain anything with disabling EDR and enabling DPC. As long as > > firmware does not user EDR support, just compiling the code should > > be harmless. > > > > So we can either remove it, or select it by default if user selects > > DPC config. > > > > > CONFIG_PCIE_EDR depends on CONFIG_ACPI, so the situation is a little > > > bit murky on non-ACPI systems that support DPC. > > > > If we are going to remove the EDR config, it might need #ifdef > > CONFIG_ACPI changes in edr.c to not compile ACPI specific code. > > Alternative choice is to compile edr.c with CONFIG_ACPI. > > Right. I think we should probably remove CONFIG_PCIE_EDR completely > and make everything controlled by CONFIG_PCIE_DPC. In the PCI Firmware spec, r3.3, sec 4.5.1, table 4-4, the description of "Error Disconnect Recover Supported" hints at the possibility for an OS to support EDR but not DPC: In the context of PCIe, support for Error Disconnect Recover implies that the operating system will invalidate the software state associated with child devices of the port without attempting to access the child device hardware. *If* the operating system supports Downstream Port Containment (DPC), ..., the operating system shall attempt to recover the child devices if the port implements the Downstream Port Containment Extended Capability. On the other hand, sec 4.6.12 and the implementation note there with the EDR flow seem to assume the OS *does* support DPC and can clear error status and bring ports out of DPC even if the OS hasn't been granted control of DPC. EDR is an ACPI feature, and I guess it might be theoretically possible to use EDR on an ACPI system with some non-DPC error containment feature like powerpc EEH. But obviously powerpc doesn't use ACPI, and a hypothetical ACPI system with non-DPC error containment would have to add support for that containment in edr_handle_event(). So while I'm not 100% sure that making CONFIG_PCIE_DPC control both DPC and EDR is completely correct, I guess for now I still think using CONFIG_PCIE_DPC for both DPC and EDR seems like a reasonable plan because we have no support for EDR *without* DPC. Bjorn