On Tue, Dec 08, 2020 at 12:46:27PM -0600, Bjorn Helgaas wrote:
> On Tue, Dec 08, 2020 at 07:05:09PM +0100, Marek Vasut wrote:
> > On 12/8/20 5:40 PM, Bjorn Helgaas wrote:
>
> > > > +static const struct of_device_id rcar_pcie_abort_handler_of_match[] __initconst = {
> > > > +	{ .compatible = "renesas,pcie-r8a7779" },
> > > > +	{ .compatible = "renesas,pcie-r8a7790" },
> > > > +	{ .compatible = "renesas,pcie-r8a7791" },
> > > > +	{ .compatible = "renesas,pcie-rcar-gen2" },
> > > > +	{},
> > > > +};
> > >
> > > Why do we need another copy of these, as opposed to doing something
> > > with of_device_get_match_data(), e.g., like brcm_pcie_probe() does?
> >
> > This is not a copy, but as subset of SoCs which are affected by this
> > problem.
>
> I know it's not a complete copy.  Many systems include flags like
> "broken_l1" in their match_data.  Something like this:
>
>   struct rcar_pcie_drvdata {
>     int (*phy_init_fn)(struct rcar_pcie_host *host);
>     unsigned int broken_l1:1;
>   };
>
>   static const struct rcar_pcie_drvdata rcar_init_h1_drvdata = {
>     .phy_init_fn = rcar_pcie_phy_init_h1,
>     .broken_l1 = 1,
>   };
>
>   static const struct rcar_pcie_drvdata rcar_init_gen2_drvdata = {
>     .phy_init_fn = rcar_pcie_phy_init_gen2,
>     .broken_l1 = 1,
>   };
>
>   static const struct rcar_pcie_drvdata rcar_init_gen3_drvdata = {
>     .phy_init_fn = rcar_pcie_phy_init_gen3,
>   };
>
>   static const struct of_device_id rcar_pcie_of_match[] = {
>     { .compatible = "renesas,pcie-r8a7779", .data = rcar_init_h1_drvdata },
>     { .compatible = "renesas,pcie-r8a7790", .data = rcar_init_gen2_drvdata },
>     { .compatible = "renesas,pcie-r8a7791", .data = rcar_init_gen2_drvdata },
>     ...

+1

> > > > +static int __init rcar_pcie_init(void)
> > > > +{
> > > > +	if (of_find_matching_node(NULL, rcar_pcie_abort_handler_of_match)) {
> > > > +#ifdef CONFIG_ARM_LPAE
> > > > +		hook_fault_code(17, rcar_pcie_aarch32_abort_handler, SIGBUS, 0,
> > > > +				"asynchronous external abort");
> > > > +#else
> > > > +		hook_fault_code(22, rcar_pcie_aarch32_abort_handler, SIGBUS, 0,
> > > > +				"imprecise external abort");
> > > > +#endif
> > > > +	}
> > > > +
> > > > +	return platform_driver_register(&rcar_pcie_driver);
> > > > +}
> > > > +device_initcall(rcar_pcie_init);
> > > > +#else
> > > >  builtin_platform_driver(rcar_pcie_driver);
> > > > +#endif
> > >
> > > Is the device_initcall() vs builtin_platform_driver() something
> > > related to the hook_fault_code()?  What would break if this were
> > > always builtin_platform_driver()?
> >
> > rcar_pcie_init() would not be called before probe.
>
> Sorry to be slow, but why does it need to be called before probe?
> Obviously software isn't putting the controller in D3 or enabling ASPM
> before probe.

I don't understand it either, so it would be good to clarify.

Also, some of these platforms are SMP systems, and I don't understand
what prevents multiple cores from faulting at once, given that the
faults can happen for config, I/O and memory accesses alike.

I understand that the immediate fix is for S2R (suspend-to-RAM), which
is single-threaded, but I would like to understand how comprehensive
this fix is.

Thanks,
Lorenzo
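
P.S. To make the match_data flag idea quoted above a bit more concrete,
here is a minimal userspace sketch, not kernel code. Everything in it
is a stand-in made up for illustration: struct drvdata, struct
match_entry, lookup_drvdata(), needs_abort_hook() and the
"example,gen3-soc" string are hypothetical, and the real driver would
use struct of_device_id with of_device_get_match_data() instead. Note
that the .data members take the address of the drvdata objects.
Whether a per-device flag can replace the second table still depends
on why the hook has to be installed before probe.

```c
#include <assert.h>	/* only needed if you add checks around this */
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-SoC driver data. */
struct drvdata {
	const char *phy_name;		/* placeholder for phy_init_fn */
	unsigned int broken_l1:1;	/* SoC needs the abort workaround */
};

/* Hypothetical userspace stand-in for struct of_device_id. */
struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct drvdata h1_drvdata   = { .phy_name = "h1",   .broken_l1 = 1 };
static const struct drvdata gen2_drvdata = { .phy_name = "gen2", .broken_l1 = 1 };
static const struct drvdata gen3_drvdata = { .phy_name = "gen3" };

/* One table carries both the match strings and the quirk flag. */
static const struct match_entry match_table[] = {
	{ "renesas,pcie-r8a7779",   &h1_drvdata },
	{ "renesas,pcie-r8a7790",   &gen2_drvdata },
	{ "renesas,pcie-r8a7791",   &gen2_drvdata },
	{ "renesas,pcie-rcar-gen2", &gen2_drvdata },
	{ "example,gen3-soc",       &gen3_drvdata },	/* made-up string */
	{ NULL, NULL },					/* sentinel */
};

/* Return the driver data for a compatible string, or NULL if unknown. */
static const struct drvdata *lookup_drvdata(const char *compatible)
{
	const struct match_entry *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}

/* Return 1 if the matched SoC carries the broken_l1 flag, else 0. */
static int needs_abort_hook(const char *compatible)
{
	const struct drvdata *d = lookup_drvdata(compatible);

	return d && d->broken_l1;
}
```

With a single table, the list of affected SoCs lives next to the rest
of the per-SoC data, so it cannot drift out of sync with the main
match table.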