On Fri, Feb 10, 2023 at 03:06:37PM +0000, Marc Zyngier wrote:
> On Fri, 10 Feb 2023 12:57:40 +0000,
> Johan Hovold <johan@xxxxxxxxxx> wrote:
> > 
> > On Fri, Feb 10, 2023 at 11:38:58AM +0000, Marc Zyngier wrote:
> > > On Fri, 10 Feb 2023 09:56:03 +0000,
> > > Johan Hovold <johan@xxxxxxxxxx> wrote:
> > > > > > @@ -1132,6 +1147,7 @@ struct irq_domain *irq_domain_create_hierarchy(struct irq_domain *parent,
> > > > > >  	else
> > > > > >  		domain = irq_domain_create_tree(fwnode, ops, host_data);
> > > > > >  	if (domain) {
> > > > > > +		domain->root = parent->root;
> > > > > >  		domain->parent = parent;
> > > > > >  		domain->flags |= flags;
> > > > > 
> > > > > So we still have a bug here, as we have published a domain that we
> > > > > keep updating. A parallel probing could find it in the interval and do
> > > > > something completely wrong.
> > > > 
> > > > Indeed we do, even if device links should make this harder to hit these
> > > > days.
> > > > 
> > > > > Splitting the work would help, as per the following patch.
> > > > 
> > > > Looks good to me. Do you want to submit that as a patch that I'll rebase
> > > > on or should I submit it as part of a v6?
> > > 
> > > Just take it directly.
> > 
> > Ok, thanks.

I've added a commit message and turned it into a patch to include in v6
now:

commit 3af395aa894c7df94ef2337e572e5e1710b4bbda (HEAD -> work)
Author: Marc Zyngier <maz@xxxxxxxxxx>
Date:   Thu Feb 9 16:00:55 2023 +0000

    irqdomain: Fix domain registration race

    Hierarchical domains created using irq_domain_create_hierarchy() are
    currently added to the domain list before having been fully
    initialised.

    This specifically means that a racing allocation request might fail to
    allocate irq data for the inner domains of a hierarchy in case the
    parent domain pointer has not yet been set up.

    Note that this is not really any issue for irqchip drivers that are
    registered early via IRQCHIP_DECLARE() or IRQCHIP_ACPI_DECLARE(), but
    could potentially cause trouble with drivers that are registered later
    (e.g. when using IRQCHIP_PLATFORM_DRIVER_BEGIN(), gpiochip drivers, etc.).

    Fixes: afb7da83b9f4 ("irqdomain: Introduce helper function irq_domain_add_hierarchy()")
    Cc: stable@xxxxxxxxxxxxxxx      # 3.19
    ...
    [ johan: add a commit message ]
    Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>

Could you just give your SoB for the diff here so I can credit you as
author?

> > I guess this turns the "Use irq_domain_create_hierarchy()" patches into
> > fixes that should be backported as well.
> 
> Maybe. Backports are not my immediate concern.

Turns out all of those drivers are registered early via
IRQCHIP_DECLARE() or IRQCHIP_ACPI_DECLARE() so there shouldn't really be
any risk of hitting this race for those.

> > But note that your proposed diff may not be sufficient to prevent
> > lookups from racing with domain registration generally. Many drivers
> > still update the bus token after the domain has been added (and
> > apparently some still set flags also after creating hierarchies I just
> > noticed, e.g. amd_iommu_create_irq_domain).
> 
> The bus token should only rarely be a problem, as it is often set on
> an intermediate level which isn't directly looked-up by anything else.
> And if it did happen, it would probably result in the domain not
> being found.
> 
> Flags, on the other hand, are more problematic. But I consider this a
> driver bug which should be fixed independently.

I agree.

> > It seems we'd need to expose a separate allocation and registration
> > interface, or at least pass in the bus token to a new combined
> > interface.
> 
> Potentially, yes. But this could come later down the line. I'm more
> concerned in getting this series into -next, as the merge window is
> fast approaching.

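For reference, the separate allocation and registration steps mentioned
above could be modelled roughly as in the userspace toy below. This is
only a sketch of the ordering, not a proposed interface: every toy_*
name is made up for illustration, a pthread mutex stands in for the
lock protecting the domain list, and having a parentless domain act as
its own root is an assumption of the model.

```c
/*
 * Userspace sketch only. All toy_* names are hypothetical and are not
 * part of the kernel irqdomain API.
 */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_domain {
	struct toy_domain *next;	/* link in the published list */
	struct toy_domain *parent;
	struct toy_domain *root;
	unsigned int flags;
	int bus_token;
};

static pthread_mutex_t toy_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_domain *toy_list;	/* published domains only */

/* Step 1: allocate and fully initialise; the domain is still private. */
static struct toy_domain *toy_domain_alloc(struct toy_domain *parent,
					   unsigned int flags, int bus_token)
{
	struct toy_domain *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->parent = parent;
	/* Assumption of this model: a parentless domain is its own root. */
	d->root = parent ? parent->root : d;
	d->flags = flags;
	d->bus_token = bus_token;
	return d;
}

/* Step 2: publish; from here on lookups may find the domain. */
static void toy_domain_publish(struct toy_domain *d)
{
	assert(d != NULL);
	pthread_mutex_lock(&toy_list_lock);
	d->next = toy_list;
	toy_list = d;
	pthread_mutex_unlock(&toy_list_lock);
}

/* Lookup by bus token, taking the same lock as the publisher. */
static struct toy_domain *toy_domain_find(int bus_token)
{
	struct toy_domain *d;

	pthread_mutex_lock(&toy_list_lock);
	for (d = toy_list; d; d = d->next)
		if (d->bus_token == bus_token)
			break;
	pthread_mutex_unlock(&toy_list_lock);
	return d;
}
```

With this split a caller does toy_domain_alloc() followed by
toy_domain_publish(), so a racing toy_domain_find() either misses the
domain or sees it with parent, flags and bus token already set.
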
I'll post a v6 first thing Monday if you can give me that SoB before
then.

Johan