Re: Potential issue with (or misunderstanding of) of_irq_get()

Marc Zyngier <maz@xxxxxxxxxx> · Sun, 21 May 2023 13:38:11 +0100

On Fri, 19 May 2023 12:02:47 +0100,
Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> 
> Hey!
> 
> I've run into an issue with of_irq_get() while writing an irqchip driver
> and I was hoping that by posting about it I might get some guidance as
> to whether I am just doing something fundamentally wrong in my code, or
> if the specific case was just an oversight.
> 
> I've been trying to solve the issue that I pointed out here:
> https://lore.kernel.org/linux-gpio/23a69be6-96d3-1c28-f1aa-555e38ff991e@xxxxxxxxxxxxx/
> 
> To spare reading that, the TL;DR is that the SoC has 3 GPIO controllers,
> with 14, 24 and 32 GPIOs each. All 68 can be used for interrupts.
> The PLIC only has 41 interrupts for GPIOs, so there's a bit of extra RTL
> sitting between the GPIO controllers and the PLIC, that is runtime
> configurable, deciding whether a GPIO gets a PLIC interrupt of its
> own or shares an interrupt with other GPIOs from the same GPIO controller.
> 
> Since the interrupt router/mux is not part of the GPIO controller blocks,
> I have written a driver for it & changed the representation in the DT
> to the below. For each of the 41 interrupts "consumed" by the driver
> bound to the irqmux node, I have created a domain.

In general, this feels a wee bit wrong.

From what I understand of the HW, it is expected that most of the GPIO
interrupts will be directly associated with a PLIC interrupt in a 1:1
fashion (only 68 - 41 + 1 = 28 interrupts will be muxed). So 40 GPIOs
could have a chance of being directly assigned to a PLIC input without
any muxing.
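
To make that arithmetic concrete, here is a throwaway userspace sketch
(not kernel code). The struct and function names are made up for
illustration, and the 'shared' parameter is my assumption about how
many PLIC lines you would set aside as mux outputs:

```c
#include <assert.h>

/* Hypothetical helper: split 'inputs' GPIO interrupts between direct and muxed. */
struct mux_split {
	unsigned int direct;	/* inputs that get a parent line of their own */
	unsigned int muxed;	/* inputs funnelled onto the shared line(s) */
};

static struct mux_split mux_split_count(unsigned int inputs,
					unsigned int parent_lines,
					unsigned int shared)
{
	struct mux_split s;

	assert(shared <= parent_lines);

	/* Enough parent lines for everyone: nothing needs muxing. */
	if (inputs <= parent_lines) {
		s.direct = inputs;
		s.muxed = 0;
		return s;
	}

	/* Otherwise at least one parent line must act as a shared output. */
	assert(shared >= 1);
	s.direct = parent_lines - shared;
	s.muxed = inputs - s.direct;
	return s;
}
```

With 68 inputs, 41 PLIC lines and a single shared line, this gives 40
direct and 28 muxed interrupts, matching the numbers above.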

If you start allocating a domain per interrupt, you end up actively
preventing the use of hierarchical domains, and you don't really
benefit from what the mux HW can do for you.

[...]

> This approach in DT allows the GPIO controller driver to not care about
> the router/mux configuration, which makes sense to me as it is not part
> of those IP blocks.
> 
> My irqchip driver was adding domains like so:
> 
> 	for (; i < MPFS_MUX_NUM_IRQS; i++) {
> 		priv->irqchip_data[i].output_hwirq = i;
> 
> 		priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
> 
> 		domain = irq_domain_add_linear(node, MPFS_MAX_IRQS_PER_GPIO,
> 					       &mpfs_irq_mux_nondirect_domain_ops,
> 					       &priv->irqchip_data[i]);
> 
> 		irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
> 						 mpfs_irq_mux_nondirect_handler,
> 						 &priv->irqchip_data[i]);
> 	}
> 
> In my irqchip's select callback I check the struct irq_fwspec's param[0]
> to determine which domain is actually responsible for it.

Huh. In general, if you want to resort to 'select', you're doing
something that is a bit iffy.

> 
> That's all working nicely & I was doing some cleanup before submitting,
> when I noticed that debugfs complained about the fact that I had several
> domains hanging off the same of device_node:
> debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!
> debugfs: File ':soc:interrupt-controller@20002054' in directory 'domains' already present!

Of course. You get 41 domains with all the same node...

You really should only have one hierarchical domain that represents
all inputs. How you deal with the difference in handling probably
shouldn't be directly reflected at that level of the hierarchy, but
below the mux.

> To get around that, I tried to switch to creating fwnodes instead,
> one for each domain:
> 
> 	for (; i < MPFS_MUX_NUM_IRQS; i++) {
> 		priv->irqchip_data[i].output_hwirq = i;
> 
> 		priv->irqchip_data[i].irq = irq_of_parse_and_map(node, i);
> 
> 		fwnode = irq_domain_alloc_named_id_fwnode("mpfs-irq-mux", i);
> 
> 		domain = irq_domain_create_linear(fwnode, MPFS_MAX_IRQS_PER_GPIO,
> 						  &mpfs_irq_mux_nondirect_domain_ops,
> 						  &priv->irqchip_data[i]);
> 
> 		irq_set_chained_handler_and_data(priv->irqchip_data[i].irq,
> 						 mpfs_irq_mux_nondirect_handler,
> 						 &priv->irqchip_data[i]);
> 	}
> 
> That's grand for debugfs, but I then ran into a problem that made me feel
> I had designed myself into an incorrect corner.

Yup. Now that you have disassociated yourself from the firmware-based
naming, you cannot use it to drive the mapping and sh*t happens. The
thing is, named fwnodes are only there as a band-aid to be able to
designate objects that have no fwnode representation.

And it goes downhill from there. My gut feeling for this is that you
should try and build something that looks like this:

- the mux exposes a single hierarchical domain that is directly
  connected to the PLIC.

- the first 40 interrupt allocations are serviced by simply allocating
  a corresponding PLIC interrupt and configuring the mux to do its
  job.

- all the 28 other interrupts must be muxed onto a single PLIC input. For
  these interrupts, you must make sure that the domain hierarchy gets
  truncated at the MUX level (see irq_domain_disconnect_hierarchy()
  for the gory details). They all get to be placed behind a chained
  interrupt handler, with their own irqchip ops.

That way, no repainting of fwnodes, no select/match complexity, and
most of the interrupts get to benefit from the hierarchical setup
(such as being able to set their affinity).
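
The allocation policy I have in mind can be modelled in a few lines of
userspace C. This is only a sketch of the bookkeeping, not an irqchip
driver: every name in it is hypothetical, and using the last PLIC line
as the shared one is an arbitrary assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_INPUTS	68			/* GPIO interrupts, from the thread */
#define NR_PARENT	41			/* PLIC lines for GPIOs, from the thread */
#define SHARED_PARENT	(NR_PARENT - 1)		/* assumption: last line is shared */

struct mux_model {
	int parent_of[NR_INPUTS];	/* parent line per input, -1 if unallocated */
	bool chained[NR_INPUTS];	/* true if the input sits behind the shared line */
	unsigned int next_direct;	/* next free dedicated parent line */
};

static void mux_model_init(struct mux_model *m)
{
	unsigned int i;

	for (i = 0; i < NR_INPUTS; i++) {
		m->parent_of[i] = -1;
		m->chained[i] = false;
	}
	m->next_direct = 0;
}

/* Return the parent line for 'input', allocating one on first use. */
static int mux_model_alloc(struct mux_model *m, unsigned int input)
{
	assert(input < NR_INPUTS);

	if (m->parent_of[input] >= 0)
		return m->parent_of[input];

	if (m->next_direct < SHARED_PARENT) {
		/* Hierarchical case: one input, one dedicated parent line. */
		m->parent_of[input] = (int)m->next_direct++;
	} else {
		/*
		 * Out of dedicated lines: park the input behind the shared
		 * line. In a real driver this is where the hierarchy would be
		 * truncated (see irq_domain_disconnect_hierarchy()) and a
		 * chained handler would demultiplex.
		 */
		m->parent_of[input] = SHARED_PARENT;
		m->chained[input] = true;
	}
	return m->parent_of[input];
}
```

Note that the split is first come, first served: whichever 40 inputs
are requested first get their own line. Whether that is the right
policy, or whether DT should have a say in it, is a separate question.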

Of course, all of this is assuming that the HW is able to deal with a
large number of interrupts muxed to a single one. If not, you may have
to use more than one of these, but the idea is the same.

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.