On Wed, May 29, 2024 at 8:51 AM Rob Herring <robh@xxxxxxxxxx> wrote: > > On Wed, May 29, 2024 at 6:28 AM Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote: > > > > On Wed, May 29, 2024 at 4:15 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > > > > > On Wed, 29 May 2024 11:16:30 +0100, > > > Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote: > > > > > > > > On Wed, May 29, 2024 at 12:03 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > > > > > > > > > On Wed, 29 May 2024 06:15:52 +0100, > > > > > Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On Tue, May 28, 2024 at 10:11 PM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > > > > > > > > > > > > > Commit d976c6f4b32c ("of: property: Add fw_devlink support for > > > > > > > interrupt-map property") tried to do what it says on the tin, > > > > > > > but failed on a couple of points: > > > > > > > > > > > > > > - it confuses bytes and cells. Not a huge deal, except when it > > > > > > > comes to pointer arithmetic > > > > > > > > > > > > > > - it doesn't really handle anything but interrupt-maps that have > > > > > > > their parent #address-cells set to 0 > > > > > > > > > > > > > > The combinations of the two leads to some serious fun on my M1 > > > > > > > box, with plenty of WARN-ON() firing all over the shop, and > > > > > > > amusing values being generated for interrupt specifiers. > > > > > > > > > > > > > > Address both issues so that I can boot my machines again. > > > > > > > > > > > > > > Fixes: d976c6f4b32c ("of: property: Add fw_devlink support for interrupt-map property") > > > > > > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > > > > > > > Cc: Anup Patel <apatel@xxxxxxxxxxxxxxxx> > > > > > > > Cc: Saravana Kannan <saravanak@xxxxxxxxxx> > > > > > > > Cc: Rob Herring (Arm) <robh@xxxxxxxxxx> > > > > > > > > > > > > Thanks for the fix patch but unfortunately it breaks for RISC-V. > > > > > > > > > > > > > --- > > > > > > > drivers/of/property.c | 16 ++++++++++++++-- > > > > > > > 1 file changed, 14 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/drivers/of/property.c b/drivers/of/property.c > > > > > > > index 1c83e68f805b..9adebc63bea9 100644 > > > > > > > --- a/drivers/of/property.c > > > > > > > +++ b/drivers/of/property.c > > > > > > > @@ -1322,7 +1322,13 @@ static struct device_node *parse_interrupt_map(struct device_node *np, > > > > > > > addrcells = of_bus_n_addr_cells(np); > > > > > > > > > > > > > > imap = of_get_property(np, "interrupt-map", &imaplen); > > > > > > > - if (!imap || imaplen <= (addrcells + intcells)) > > > > > > > + imaplen /= sizeof(*imap); > > > > > > > + > > > > > > > + /* > > > > > > > + * Check that we have enough runway for the child unit interrupt > > > > > > > + * specifier and a phandle. That's the bare minimum we can expect. > > > > > > > + */ > > > > > > > + if (!imap || imaplen <= (addrcells + intcells + 1)) > > > > > > > return NULL; > > > > > > > imap_end = imap + imaplen; > > > > > > > > > > > > > > @@ -1346,8 +1352,14 @@ static struct device_node *parse_interrupt_map(struct device_node *np, > > > > > > > if (!index) > > > > > > > return sup_args.np; > > > > > > > > > > > > > > - of_node_put(sup_args.np); > > > > > > > + /* > > > > > > > + * Account for the full parent unit interrupt specifier > > > > > > > + * (address cells, interrupt cells, and phandle). > > > > > > > + */ > > > > > > > + imap += of_bus_n_addr_cells(sup_args.np); > > > > > > > > > > > > This breaks for RISC-V because we don't have "#address-cells" > > > > > > property in interrupt controller DT node and of_bus_n_addr_cells() > > > > > > retrieves "#address-cells" from the parent of interrupt controller. > > > > > > > > > > That's a feature, not a bug. #address-cells, AFAICT, applies to all > > > > > child nodes until you set it otherwise. > > > > > > > > > > > > > > > > > The of_irq_parse_raw() looks for "#address-cells" property > > > > > > in the interrupt controller DT node only so we should do a > > > > > > similar thing here as well. > > > > > > > > > > This looks more like a of_irq_parse_raw() bug than anything else. > > > > > > > > If we change of_irq_parse_raw() to use of_bus_n_addr_cells() > > > > then it would still break for RISC-V. > > > > > > I'm not trying to "fix" riscv. I'm merely outlining that you are > > > relying on both broken DTs and a buggy OS. > > > > > > > > > > > Using of_bus_n_addr_cells() over here forces interrupt controller > > > > DT nodes to have a "#address-cells" DT property. There are many > > > > interrupt controller (e.g. RISC-V PLIC or RISC-V APLIC) where the > > > > DT bindings don't require "#address-cells" DT property and existing > > > > DTS files with such interrupt controllers don't have it either. > > > > > > It forces you to set #address-cells *if you rely on a different > > > value in a child node*. It's not like the semantics are new. > > > > We don't have child nodes under the interrupt controller DT node > > (for both RISC-V PLIC and APLIC) so we certainly don't require the > > "#address-cells" property in the interrupt controller DT node. > > interrupt controller nodes which are referred to by interrupt-map > require #address-cells. Period. Child nodes or not. Now that I've re-read the code, the kernel will treat missing #address-cells in the interrupt parent as 0. Looking in the parent nodes for #address-cells only happens for the input address (i.e. the address before the phandle). IOW, the first use of_bus_n_addr_cells() was correct, but the 2nd use is not correct. That's not great, but that's how this code passed down on stone tablets works... As I commented on v1 of the original patch. I don't like parsing interrupt-map in 2 places. It leads to problems exactly as this thread has shown. The duplication was reduced some, but is still more than I'd like. So I'm looking at how to refactor of_irq_parse_raw() to make it work for what's needed here. Maybe I'll have a fix quickly. If not, I'm going to revert the original commit. Rob