On Thu, 09 May 2024 13:08:20 +0100, Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote: > > Some of the PCI host controllers (such as generic PCI host controller) > use "interrupt-map" DT property to describe the mapping between PCI > endpoints and PCI interrupt pins. This is the only case where the > interrupts are not described in DT. > > Currently, there is no fw_devlink created based on "interrupt-map" > DT property so interrupt controller is not guaranteed to be probed > before the PCI host controller. This affects every platform where > both PCI host controller and interrupt controllers are probed as > regular platform devices. > > This creates fw_devlink between consumers (PCI host controller) and > supplier (interrupt controller) based on "interrupt-map" DT property. > > Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx> > Reviewed-by: Saravana Kannan <saravanak@xxxxxxxxxx> > --- > Changes since v3: > - Added a comment about of_irq_parse_raw() > - Removed redundant NULL assignments to sup_args.np > Changes since v2: > - No need for a loop to find #interrupt-cells property value > - Fix node de-reference leak when index is greater than number > of entries in interrupt-map property > Changes since v1: > - Updated commit description based on Rob's suggestion > - Use of_irq_parse_raw() for parsing interrupt-map DT property This patch breaks badly on my M1 Mini, with a continuous stream of boot time warnings lasting about 100 seconds: [ 97.832335] ------------[ cut here ]------------ [ 97.836955] /soc/pcie@690000000/pci@2,0 interrupt-map failed, using interrupt-controller [ 97.845072] WARNING: CPU: 0 PID: 1 at drivers/of/irq.c:277 of_irq_parse_raw+0x620/0x730 [ 97.853087] Modules linked in: [ 97.856139] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.10.0-rc1 #2915 [ 97.864163] Hardware name: Apple Mac mini (M1, 2020) (DT) [ 97.869570] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 97.876546] pc : of_irq_parse_raw+0x620/0x730 [ 97.880907] lr : of_irq_parse_raw+0x620/0x730 [ 97.885267] sp : ffffc000800538a0 [ 97.888581] x29: ffffc000800538a0 x28: 0000000000000000 x27: ffffffffff5ffc68 [ 97.895732] x26: ffffc000800539a4 x25: ffffffffff5ffbbc x24: ffffc00080053a48 [ 97.902883] x23: ffffe3ff73284e68 x22: ffffb7889aede6d0 x21: ffffe3ff735f9320 [ 97.910034] x20: ffffc00080053964 x19: ffffb7889aede6d0 x18: ffffffffffffffff [ 97.917185] x17: 7075727265746e69 x16: 20676e697375202c x15: 64656c6961662070 [ 97.924336] x14: 616d2d7470757272 x13: 72656c6c6f72746e x12: 6f632d7470757272 [ 97.931487] x11: 65746e6920676e69 x10: ffffe3ff740bcb68 x9 : ffffe3ff72335b38 [ 97.938639] x8 : 00000001000028a4 x7 : ffffe3ff740afc08 x6 : 00000000000038a4 [ 97.945789] x5 : 00000000000067b0 x4 : c0000001000028a4 x3 : 0000000000000000 [ 97.952940] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffb784c16ade00 [ 97.960092] Call trace: [ 97.962534] of_irq_parse_raw+0x620/0x730 [ 97.966545] parse_interrupt_map+0xfc/0x188 [ 97.970731] of_fwnode_add_links+0x170/0x1e0 [ 97.975004] fw_devlink_parse_fwtree+0x44/0x98 [ 97.979452] fw_devlink_parse_fwtree+0x6c/0x98 [ 97.983899] fw_devlink_parse_fwtree+0x6c/0x98 [ 97.988347] device_add+0x610/0x6a8 [ 97.991836] of_device_add+0x4c/0x70 [ 97.995411] of_platform_device_create_pdata+0xa0/0x160 [ 98.000644] of_platform_bus_create+0x184/0x370 [ 98.005178] of_platform_populate+0x68/0x160 [ 98.009451] of_platform_default_populate_init+0xf4/0x118 [ 98.014859] do_one_initcall+0x4c/0x320 [ 98.018695] do_initcalls+0xf4/0x1d8 [ 98.022271] kernel_init_freeable+0x12c/0x280 [ 98.026632] kernel_init+0x2c/0x1f8 [ 98.030120] ret_from_fork+0x10/0x20 [ 98.033696] ---[ end trace 0000000000000000 ]--- which comes from 10a20b34d735f ("of/irq: Don't ignore interrupt-controller when interrupt-map failed"). Each of the 3 PCIe ports are described as such: port02: pci@2,0 { device_type = "pci"; reg = <0x1000 0x0 0x0 0x0 0x0>; reset-gpios = <&pinctrl_ap 33 GPIO_ACTIVE_LOW>; #address-cells = <3>; #size-cells = <2>; ranges; interrupt-controller; #interrupt-cells = <1>; interrupt-map-mask = <0 0 0 7>; interrupt-map = <0 0 0 1 &port02 0 0 0 0>, <0 0 0 2 &port02 0 0 0 1>, <0 0 0 3 &port02 0 0 0 2>, <0 0 0 4 &port02 0 0 0 3>; status = "disabled"; }; and get probed *972 times*, which seem... excessive, given that there are only 4 entries per port. > --- > drivers/of/property.c | 52 +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 52 insertions(+) > > diff --git a/drivers/of/property.c b/drivers/of/property.c > index a6358ee99b74..2d749a18b037 100644 > --- a/drivers/of/property.c > +++ b/drivers/of/property.c > @@ -1311,6 +1311,57 @@ static struct device_node *parse_interrupts(struct device_node *np, > return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np; > } > > +static struct device_node *parse_interrupt_map(struct device_node *np, > + const char *prop_name, int index) > +{ > + const __be32 *imap, *imap_end, *addr; > + struct of_phandle_args sup_args; > + u32 addrcells, intcells; > + int i, imaplen; > + > + if (!IS_ENABLED(CONFIG_OF_IRQ)) > + return NULL; > + > + if (strcmp(prop_name, "interrupt-map")) > + return NULL; > + > + if (of_property_read_u32(np, "#interrupt-cells", &intcells)) > + return NULL; > + addrcells = of_bus_n_addr_cells(np); > + > + imap = of_get_property(np, "interrupt-map", &imaplen); > + if (!imap || imaplen <= (addrcells + intcells)) This is "interesting". You compare a number of *bytes* with a number of cells. Only off by a factor of 4... Also, you need a minimum of one extra cell to hold the phandle, and a yet unknown number of cells for whatever follows the phandle. > + return NULL; > + imap_end = imap + imaplen; Same problem, with pointer arithmetic this time. > + > + while (imap < imap_end) { > + addr = imap; > + imap += addrcells; > + > + sup_args.np = np; > + sup_args.args_count = intcells; > + for (i = 0; i < intcells; i++) > + sup_args.args[i] = be32_to_cpu(imap[i]); > + imap += intcells; > + > + /* > + * Upon success, the function of_irq_parse_raw() returns > + * interrupt controller DT node pointer in sup_args.np. > + */ > + if (of_irq_parse_raw(addr, &sup_args)) > + return NULL; > + > + if (!index) > + return sup_args.np; > + > + of_node_put(sup_args.np); > + imap += sup_args.args_count + 1; This really doesn't map (pun intended) to the way the interrupt-map entries are built. You need to account for the parent address size as well. I'll post a patch fixing both issues. M. -- Without deviation from the norm, progress is not possible.