Re: [PATCH v3 2/2] PCI/AER: Enable AER on all PCIe devices supporting it

On 1/20/22 16:46, Bjorn Helgaas wrote:
On Thu, Jan 20, 2022 at 08:31:31AM +0100, Stefan Roese wrote:
On 1/19/22 11:37, Pali Rohár wrote:

And since you raised this hotplug issue, another item for future
follow-up changes is calling pcie_set_ecrc_checking() to align the
ECRC state of a newly hotplugged device with the "pci=ecrc=..."
cmdline option. Currently this is done only in
set_device_error_reporting().

Agreed, this is another area to look into. I'm not sure whether it's
okay to address this once this patch set has been accepted (if it is).

ECRC might be something that could be peeled off first to reduce the
complexity of AER itself.

The ECRC capability and enable bits are in the AER Capability, so I
think it should be moved to pci_aer_init() so it happens for every
device as we enumerate it.

Just so there is no misunderstanding: you are thinking of something
like this:

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9fa1f97e5b27..5585fefc4d0e 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -387,6 +387,9 @@ void pci_aer_init(struct pci_dev *dev)
        pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_ERR, sizeof(u32) * n);

        pci_aer_clear_status(dev);
+
+       /* Enable ECRC checking if enabled and configured */
+       pcie_set_ecrc_checking(dev);
 }

 void pci_aer_exit(struct pci_dev *dev)
@@ -1223,9 +1226,6 @@ static int set_device_error_reporting(struct pci_dev *dev, void *data)
                        pci_disable_pcie_error_reporting(dev);
        }

-       if (enable)
-               pcie_set_ecrc_checking(dev);
-
        return 0;
 }

Perhaps as patch 1/3 in this patch series? Or as some completely
separate patch?

Thanks,
Stefan

As far as I can tell, there is no requirement that every device in the
path support ECRC, so it can be enabled independently for each device.
I think devices that don't support ECRC checking must handle TLPs with
ECRC without error.
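
To illustrate the per-device point, here is a minimal userspace sketch
of the bit logic, not the kernel implementation. It assumes the ECRC
bit definitions from include/uapi/linux/pci_regs.h; the helper name
ecrc_enable_bits() is hypothetical and only models the decision made
for a single device's AER capabilities/control register value.

```c
#include <stdint.h>

/*
 * ECRC bits of the AER "Advanced Error Capabilities and Control"
 * register, named as in include/uapi/linux/pci_regs.h.
 */
#define PCI_ERR_CAP_ECRC_GENC 0x00000020 /* ECRC Generation Capable */
#define PCI_ERR_CAP_ECRC_GENE 0x00000040 /* ECRC Generation Enable */
#define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
#define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */

/*
 * Hypothetical helper (not a kernel function): take the current
 * register value of one device and return it with ECRC generation
 * and ECRC checking enabled, each only if that device advertises
 * the matching capability bit. No other device is consulted.
 */
static uint32_t ecrc_enable_bits(uint32_t err_cap)
{
	if (err_cap & PCI_ERR_CAP_ECRC_GENC)
		err_cap |= PCI_ERR_CAP_ECRC_GENE;
	if (err_cap & PCI_ERR_CAP_ECRC_CHKC)
		err_cap |= PCI_ERR_CAP_ECRC_CHKE;
	return err_cap;
}
```

A device that advertises neither capability keeps its register value
unchanged, so calling this for every enumerated device would be
harmless under these assumptions.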

Per Table 6-5, ECRC check failures result in a device logging the
prefix/header of the TLP and sending ERR_NONFATAL or ERR_COR.  I think
this is useful regardless of whether AER interrupts are enabled
because error information is logged where the ECRC failure was
detected.

Bjorn


Best regards,
Stefan Roese

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@xxxxxxx


