On Fri, Feb 10, 2023 at 11:38:58AM +0000, Marc Zyngier wrote: > On Fri, 10 Feb 2023 09:56:03 +0000, > Johan Hovold <johan@xxxxxxxxxx> wrote: > > > > On Thu, Feb 09, 2023 at 04:00:55PM +0000, Marc Zyngier wrote: > > > On Thu, 09 Feb 2023 13:23:23 +0000, > > > Johan Hovold <johan+linaro@xxxxxxxxxx> wrote: > > I went back and forth over that a bit, but decided to only use > > domain->root->mutex in paths that can be called for hierarchical > > domains (i.e. the "shared code paths" mentioned above). > > > > Using it in paths that are clearly only called for non-hierarchical > > domains where domain->root == domain felt a bit lazy. > > My concern here is that as this code gets further refactored, it may > become much harder to reason about what is the correct level of > locking. Yeah, that's conceivable. > > The counter argument is of course that using domain->root->lock allows > > people to think less about the code they are changing, but that's not > > necessarily always a good thing. > > Eventually, non-hierarchical domains should simply die and be replaced > with a single level hierarchy. Having a unified locking in place will > definitely make the required work clearer. > > > Also note that the lockdep asserts in the revmap helpers would catch > > anyone using domain->mutex where they should not (i.e. using > > domain->mutex for an hierarchical domain). > > Lockdep is great, but lockdep is a runtime thing. It doesn't help > reasoning about what gets locked when changing this code. Contributers are expected to test their changes with lockdep enabled, right? But sure, using root->domain->mutex throughout may prevent prevent people from getting this wrong. I'll update this for v6. > > > > @@ -1132,6 +1147,7 @@ struct irq_domain *irq_domain_create_hierarchy(struct irq_domain *parent, > > > > else > > > > domain = irq_domain_create_tree(fwnode, ops, host_data); > > > > if (domain) { > > > > + domain->root = parent->root; > > > > domain->parent = parent; > > > > domain->flags |= flags; > > > > > > So we still have a bug here, as we have published a domain that we > > > keep updating. A parallel probing could find it in the interval and do > > > something completely wrong. > > > > Indeed we do, even if device links should make this harder to hit these > > days. > > > > > Splitting the work would help, as per the following patch. > > > > Looks good to me. Do you want to submit that as a patch that I'll rebase > > on or should I submit it as part of a v6? > > Just take it directly. Ok, thanks. I guess this turns the "Use irq_domain_create_hierarchy()" patches into fixes that should be backported as well. But note that your proposed diff may not be sufficient to prevent lookups from racing with domain registration generally. Many drivers still update the bus token after the domain has been added (and apparently some still set flags also after creating hierarchies I just noticed, e.g. amd_iommu_create_irq_domain). It seems we'd need to expose a separate allocation and registration interface, or at least pass in the bus token to a new combined interface. Johan