On Wed, Aug 21 2024 at 21:14, Huacai Chen wrote:
> On Wed, Aug 21, 2024 at 12:29 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> This patch is doing too many things at once and is absolutely not
>> reviewable.
>>
>> Please split it up into the obvious bits and pieces:
> Splitting may cause another problem: some patches will go upstream
> via the arch tree and others via the irq tree. These dependencies may
> cause build errors in a certain tree. But anyway, we will try our best
> to do this.

That's not a problem at all. The trivial way to solve this is to apply
the architecture changes to the loongarch tree in a separate branch,
which is based off some -rcX tag and contains only those dependencies.
That branch is then merged into the main loongarch branch, and I can
pull it into my tree for adding the irqchip changes.

No conflicts, no merge dependencies, nothing.

>> #ifdef CONFIG_IRQ_LOONGARCH_AVEC
>> # define SMP_CLEAR_VECTOR	BIT(ACTION_CLEAR_VECTOR)
>> #else
>> # define SMP_CLEAR_VECTOR	(0)
>> #endif
>>
>> That way the compiler will optimize out stuff like the
>> SMP_CLEAR_VECTOR handling and you only need the prototype of
>> complete_irq_moving(), but no implementation.
> These macros are not in a hot path, and we have already tried our best
> to avoid using #ifdefs for cpu_has_xxx, so I suggest not introducing a
> new Kconfig option. Moreover, the new option would always be selected
> due to the deep coupling among loongson's irqchips, which makes the
> #ifdefs useless.

They are removed again in step 8. The point is to have a sanely split
up and structured patch series instead of one big lump.

>> > +static void clear_free_vector(struct irq_data *irqd)
>> > +{
>> > +	struct avecintc_data *adata = irq_data_get_irq_chip_data(irqd);
>> > +	bool managed = irqd_affinity_is_managed(irqd);
>>
>> Don't even try. Your managed support is broken at the allocation side
>> and at several other places.
> I'm a bit confused here: irq_create_affinity_masks() marks some
> interrupts "managed". So if we completely ignore "managed" here, can
> irq_create_affinity_masks() still work? Or do the two have nothing to
> do with each other?

Managed interrupts have the property that the core and irqchip code
guarantees that the interrupts can be affined to the CPU masks which
are handed in to the allocator for them.

So the requirement for architectures which have a limited number of
vectors per CPU (x86, loongarch) is that the managed vectors have to
be reserved right at allocation time:

  x86_vector_alloc_irqs()
    assign_irq_vector_policy()
      if (managed)
        reserve_managed_vector()
          irq_matrix_reserve_managed(mask)

irq_matrix_reserve_managed() then reserves a vector on all CPUs which
are in the affinity mask.

On activation:

  x86_vector_activate()
    if (managed)
      activate_managed()
        assign_managed_vector(mask)
          irq_matrix_alloc_managed(mask)

irq_matrix_alloc_managed() then picks an unassigned vector out of the
managed vector space. A similar mechanism is used when the affinity is
set.

Why is this important for x86 (and loongarch)? Because both have a
limited vector space of 256 vectors per CPU, of which some vectors may
be reserved for exceptions, OS purposes or the legacy space.

The managed mechanism guarantees at allocation time that the interrupt
will have a reserved vector on all CPUs in the mask. That ensures that
on CPU hotplug the interrupt can be migrated over to a still online
CPU in the mask. If the last CPU in the mask goes offline, the
interrupt is shut down.
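
For illustration, a minimal sketch of that two-step scheme written
against the generic irq_matrix API from kernel/irq/matrix.c. This is
not the actual x86 or loongarch code: the example_*() functions and
the way the result is recorded are made up, only the irq_matrix_*()
calls are the real API.

#include <linux/cpumask.h>
#include <linux/irq.h>

/* Per-CPU vector bookkeeping, set up once at boot via irq_alloc_matrix() */
static struct irq_matrix *vector_matrix;

/* Allocation time: only reserve, do not consume a concrete vector yet */
static int example_reserve_managed(struct irq_data *irqd)
{
	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);

	/*
	 * Set a vector aside on every CPU in the mask. If that cannot
	 * be satisfied this fails with -ENOSPC, so the allocation
	 * fails early instead of surprising the driver at activation
	 * time.
	 */
	return irq_matrix_reserve_managed(vector_matrix, affmsk);
}

/* Activation time: pick a concrete vector from the reserved space */
static int example_activate_managed(struct irq_data *irqd)
{
	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
	unsigned int cpu;
	int vector;

	vector = irq_matrix_alloc_managed(vector_matrix, affmsk, &cpu);
	if (vector < 0)
		return vector;

	/* Record cpu/vector in the per-interrupt chip data here ... */
	return 0;
}

The reserved slots on the other CPUs in the mask are exactly what
makes the hotplug migration guaranteed to succeed.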
You might not yet have run into vector exhaustion, but once the number
of CPUs (and with it the number of queues and interrupts which drivers
allocate) gets big enough, that is guaranteed to happen.

That's why x86 also uses the concept of reserved (not guaranteed)
regular vectors for non-managed interrupts. x86 uses the spurious
vector as the placeholder for those.

That's important because there are enough device drivers out there
which allocate a gazillion interrupts at probe time, but only request
a few of them. If you allocated real vectors for all of them right
upfront, then you would exhaust the vector space quickly for no
reason. Only when an interrupt is actually requested is a usable
vector allocated - or the allocation fails and request_irq() fails.
That's better than exhausting the vector space for nothing.

The complexity of the x86 allocation/activate/set_affinity mechanisms
is there for a reason and not just because we did not have anything
better to do. :)
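
As a rough sketch (again not the actual x86 code; everything prefixed
with example_ or EXAMPLE_ is invented for illustration, only the
irq_matrix_*() calls are the real API), that reservation scheme boils
down to:

#include <linux/cpumask.h>
#include <linux/irq.h>

static struct irq_matrix *vector_matrix;

/* Invented helpers standing in for arch-specific chip programming */
#define EXAMPLE_SPURIOUS_VECTOR	0xff
static void example_program_chip(struct irq_data *irqd, unsigned int vector);
static void example_program_chip_cpu(struct irq_data *irqd, unsigned int cpu,
				     unsigned int vector);

/*
 * Driver probe / allocation time: do not consume a real vector. Just
 * bump the reservation count and point the hardware at the spurious
 * vector, so a stray interrupt is harmless.
 */
static int example_alloc(struct irq_data *irqd)
{
	irq_matrix_reserve(vector_matrix);
	example_program_chip(irqd, EXAMPLE_SPURIOUS_VECTOR);
	return 0;
}

/*
 * request_irq() / activation time: now a usable vector is needed.
 * reserved=true tells the matrix allocator that this consumes one of
 * the previously counted reservations. If the vector space is
 * exhausted, this fails and so does request_irq() - instead of the
 * probe-time allocations eating vectors for interrupts which are
 * never requested.
 */
static int example_activate(struct irq_data *irqd)
{
	unsigned int cpu;
	int vector;

	vector = irq_matrix_alloc(vector_matrix, cpu_online_mask, true, &cpu);
	if (vector < 0)
		return vector;

	example_program_chip_cpu(irqd, cpu, vector);
	return 0;
}

Thanks,

        tglx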