On Sat, Nov 28, 2020 at 08:32:32PM -0800, Kuppuswamy, Sathyanarayanan wrote: > On 11/28/20 3:25 PM, Bjorn Helgaas wrote: > > On Sat, Nov 28, 2020 at 01:56:23PM -0800, Kuppuswamy, Sathyanarayanan wrote: > > > On 11/28/20 1:53 PM, Bjorn Helgaas wrote: > > > > On Sat, Nov 28, 2020 at 01:49:46PM -0800, Kuppuswamy, Sathyanarayanan wrote: > > > > > On 11/28/20 12:24 PM, Bjorn Helgaas wrote: > > > > > > On Wed, Nov 25, 2020 at 06:01:57PM -0800, Kuppuswamy, Sathyanarayanan wrote: > > > > > > > On 11/25/20 5:18 PM, Bjorn Helgaas wrote: > > > > > > > > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > > > > > > > > > > > > > > > Downstream Ports may support DPC regardless of whether they support AER > > > > > > > > (see PCIe r5.0, sec 6.2.10.2). Previously, if the user booted with > > > > > > > > "pcie_ports=dpc-native", it was possible for dpc_probe() to succeed even if > > > > > > > > the device had no AER Capability, but dpc_get_aer_uncorrect_severity() > > > > > > > > depends on the AER Capability. > > > > > > > > > > > > > > > > dpc_probe() previously failed if: > > > > > > > > > > > > > > > > !pcie_aer_is_native(pdev) && !pcie_ports_dpc_native > > > > > > > > !(pcie_aer_is_native() || pcie_ports_dpc_native) # by De Morgan's law > > > > > > > > > > > > > > > > so it succeeded if: > > > > > > > > > > > > > > > > pcie_aer_is_native() || pcie_ports_dpc_native > > > > > > > > > > > > > > > > Fail dpc_probe() if the device has no AER Capability. > > > > > > > > > > > > > > > > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > > > > > > > Cc: Olof Johansson <olof@xxxxxxxxx> > > > > > > > > --- > > > > > > > > drivers/pci/pcie/dpc.c | 3 +++ > > > > > > > > 1 file changed, 3 insertions(+) > > > > > > > > > > > > > > > > diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c > > > > > > > > index e05aba86a317..ed0dbc43d018 100644 > > > > > > > > --- a/drivers/pci/pcie/dpc.c > > > > > > > > +++ b/drivers/pci/pcie/dpc.c > > > > > > > > @@ -287,6 +287,9 @@ static int dpc_probe(struct pcie_device *dev) > > > > > > > > int status; > > > > > > > > u16 ctl, cap; > > > > > > > > + if (!pdev->aer_cap) > > > > > > > > + return -ENOTSUPP; > > > > > > > Don't we check aer_cap support in drivers/pci/pcie/portdrv_core.c ? > > > > > > > > > > > > > > We don't enable DPC service, if AER service is not enabled. And AER > > > > > > > service is only enabled if AER capability is supported. > > > > > > > > > > > > > > So dpc_probe() should not happen if AER capability is not supported? > > > > > > > > > > > > I don't think that's always true. If I'm reading this right, we have > > > > > > this: > > > > > > > > > > > > get_port_device_capability(...) > > > > > > { > > > > > > #ifdef CONFIG_PCIEAER > > > > > > if (dev->aer_cap && ...) > > > > > > services |= PCIE_PORT_SERVICE_AER; > > > > > > #endif > > > > > > > > > > > > if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_DPC) && > > > > > > pci_aer_available() && > > > > > > (pcie_ports_dpc_native || (services & PCIE_PORT_SERVICE_AER))) > > > > > > services |= PCIE_PORT_SERVICE_DPC; > > > > > > } > > > > > > > > > > > > and in the case where: > > > > > > > > > > > > - CONFIG_PCIEAER=y > > > > > > - booted with "pcie_ports=dpc-native" (pcie_ports_dpc_native is true) > > > > > > - "dev" has no AER capability > > > > > > - "dev" has DPC capability > > > > > > > > > > > > I think we do enable PCIE_PORT_SERVICE_DPC. > > > > > Got it. But further looking into it, I am wondering whether > > > > > we should keep this dependency? Currently we just use it to > > > > > dump the error information. Do we need to create dependency > > > > > between DPC and AER (which is functionality not dependent) just > > > > > to see more details about the error? > > > > > > > > That's a good question, but I don't really want to get into the actual > > > > operation of the AER and DPC drivers in this series, so maybe > > > > something we should explore later. > > > > > In that case, can you move this check to > > > drivers/pci/pcie/portdrv_core.c? I don't see the point of > > > distributed checks in both get_port_device_capability() and > > > dpc_probe(). > > > > I totally agree that these distributed checks are terrible, but my > > long-term hope is to get rid of portdrv and handle these "services" > > more like we handle other capabilities. For example, maybe we can > > squash dpc_probe() into pci_dpc_init(), so I'd actually like to move > > things from get_port_device_capability() into dpc_probe(). > Removing the service driver model will be a major overhaul. It would > affect even the error recovery drivers. You can find motivation > for service drivers in Documentation/PCI/pciebus-howto.rst. Part of that motivation for portdrv is to allow multiple service drivers to run simultaneously on the same PCIe port. That is also part of the *problem* with portdrv. There are several PCIe ports that implement device-specific functionality via their BARs, and there is currently no way for drivers to cleanly claim those ports because they're already claimed by portdrv. > But till we fix this part, I recommend grouping all dependency checks > to one place (either dpc_probe() or portdrv service driver). I think they should go in the driver, e.g., dpc_probe(). The typical model is that bus drivers only match a device with the appropriate driver, and the device driver does more specialized checks. For example, the PCI bus driver only looks at the Vendor ID, Device ID, and Class code. But we don't have to solve all of this right now. Bjorn