Hi Bjorn, > Am 08.03.2023 um 19:49 schrieb Bjorn Helgaas <helgaas@xxxxxxxxxx>: > > On Tue, Feb 28, 2023 at 09:43:54AM +0100, H. Nikolaus Schaller wrote: >> commit bb38919ec56e ("PCI: imx6: Add support for i.MX6 PCIe controller") >> added a fault hook to this driver in the probe function. So it was only >> installed if needed. >> >> commit bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO") >> moved it from probe to driver init which installs the hook unconditionally >> as soon as the driver is compiled into a kernel. >> >> When this driver is compiled as a module, the hook is not registered >> until after the driver has been matched with a .compatible and >> loaded. >> >> commit 415b6185c541 ("PCI: imx6: Fix config read timeout handling") >> extended the fault handling code. >> >> commit 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") >> added some protection for non-ARM architectures, but this does not >> protect non-i.MX ARM architectures. > > Are *all* these commits relevant? Yes, it was correct when introduced by commit bb38919ec56e for a goo reason. And it was broken by bde4a5a00e76 an all attempts later made it worse. > Question also applies to Fixes: > below. It fixes all between bde4a5a00e76 and HEAD. Well, one can argue that commit bde4a5a00e76 could be sufficient for Fixes: I don't know if it is a problem because I have no overview over side-effects. > >> Since fault handlers can be triggered on any architecture for different >> reasons, there is no guarantee that they will be triggered only for the >> assumed situation, leading to improper error handling (i.MX6-specific >> imx6q_pcie_abort_handler) on foreign systems. >> >> I had seen strange L3 imprecise external abort messages several times on >> OMAP4 and OMAP5 devices and couldn't make sense of them until I realized >> they were related to this unused imx6q driver because I had >> CONFIG_PCI_IMX6=y. > > Apparently imx6q_pcie_abort_handler() assumes it is always called > because of a PCI abort? If so, that sounds problematic. > > If non-PCI imprecise aborts happen on OMAP4 and OMAP5 where imx6q is > unused and imx6q_pcie_abort_handler() is not appropriate, I assume > similar non-PCI aborts can also happen on systems where imx6q *is* > used. As far as I know the reasons why imprecise aborts occur may be SoC specific. So I have no experience with i.MX6 to judge this. My goal is to shield other architectures from this fault handler may it be correct or wrong. > So imx6q_pcie_abort_handler() may be trying to fixup non-PCI aborts > when it shouldn't? Yes, at least if it is triggered on OMAP4/OMAP5 by accessing non-existing registers in some subsystems (e.g. through devmem2). > >> Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed >> to run on different ARM SoC and be differentiated only by device tree >> binaries. So turning off CONFIG_PCI_IMX6 is not a solution. >> >> Therefore we check the compatible in the init function before registering >> the fault handler. >> >> Fixes: bde4a5a00e76 ("PCI: imx6: Allow probe deferral by reset GPIO") >> Fixes: 415b6185c541 ("PCI: imx6: Fix config read timeout handling") >> Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") >> >> Signed-off-by: H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> >> --- >> drivers/pci/controller/dwc/pci-imx6.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c >> index 1dde5c579edc8..89774aa187ae8 100644 >> --- a/drivers/pci/controller/dwc/pci-imx6.c >> +++ b/drivers/pci/controller/dwc/pci-imx6.c >> @@ -1402,6 +1402,15 @@ DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_SYNOPSYS, 0xabcd, >> static int __init imx6_pcie_init(void) >> { >> #ifdef CONFIG_ARM >> + const struct of_device_id *reboot_id; >> + struct device_node *np; >> + >> + np = of_find_matching_node_and_match(NULL, imx6_pcie_of_match, >> + &reboot_id); > > Since you don't need reboot_id, I think you should use > of_find_matching_node() instead. Well, I used it for debugging, but for production code it has indeed no benefit. of_find_matching_node it is just a static inline wrapper for of_find_matching_node_and_match with NULL parameter, but we can save one stack position. I'll send a v2 soon. > >> + if (!np) >> + return -ENODEV; >> + of_node_put(np); >> + >> /* >> * Since probe() can be deferred we need to make sure that >> * hook_fault_code is not called after __init memory is freed >> -- >> 2.38.1 >> BR and thanks, Nikolaus