On Wed, Aug 21 2024 at 21:14, Huacai Chen wrote:
> On Wed, Aug 21, 2024 at 12:29 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> This patch is doing too many things at once and is absolutely not
>> reviewable.
>>
>> Please split it up into the obvious bits and pieces:
> Splitting may cause another problem: some patches will go upstream
> via the arch tree and others via the irq tree. These dependencies may
> cause build errors in a certain tree. But anyway, we will try our best
> to do this.

That's not a problem at all. The trivial way to solve this is to apply
the architecture changes to the loongarch tree in a separate branch,
which is based off some -rcX tag and contains only those dependencies.
That branch is then merged into the main loongarch branch, and I can
pull it into my tree for adding the irqchip changes.

No conflicts, no merge dependencies, nothing.

>> #ifdef CONFIG_IRQ_LOONGARCH_AVEC
>> # define SMP_CLEAR_VECTOR	BIT(ACTION_CLEAR_VECTOR)
>> #else
>> # define SMP_CLEAR_VECTOR	(0)
>> #endif
>>
>> That way the compiler will optimize out stuff like the
>> SMP_CLEAR_VECTOR handling and you only need the prototype of
>> complete_irq_moving(), but no implementation.
> These macros are not in a hot path, and we have already tried our best
> to avoid using #ifdefs for cpu_has_xxx, so I suggest not introducing a
> new Kconfig option. Moreover, the new option would always be selected
> due to the deep coupling among loongson's irqchips, which makes the
> #ifdefs useless.

They are removed again in step 8. The point is to have a sanely split
up and structured patch series instead of one big lump.

>> > +static void clear_free_vector(struct irq_data *irqd)
>> > +{
>> > +	struct avecintc_data *adata = irq_data_get_irq_chip_data(irqd);
>> > +	bool managed = irqd_affinity_is_managed(irqd);
>>
>> Don't even try. Your managed support is broken at the allocation side
>> and at several other places.
> I'm a bit confused here: irq_create_affinity_masks() marks some
> interrupts "managed". So if we completely ignore "managed" here, can
> irq_create_affinity_masks() still work? Or do the two have nothing to
> do with each other?

Managed interrupts have the property that the core and irqchip code
guarantees that the interrupts can be affined to the CPU masks which
are handed in to the allocator for them.

So the requirement for architectures which have a limited number of
vectors per CPU (x86, loongarch) is that the managed vectors have to
be reserved right at allocation time:

  x86_vector_alloc_irqs()
    assign_irq_vector_policy()
      if (managed)
        reserve_managed_vector()
          irq_matrix_reserve_managed(mask)

irq_matrix_reserve_managed() then reserves a vector on all CPUs which
are in the affinity mask.

On activation:

  x86_vector_activate()
    if (managed)
      activate_managed()
        assign_managed_vector(mask)
          irq_matrix_alloc_managed(mask)

irq_matrix_alloc_managed() then picks an unassigned vector out of the
managed vector space. A similar mechanism is used when the affinity is
set.

Why is this important for x86 (and loongarch)? Because both have a
limited vector space of 256 vectors per CPU, of which some vectors may
be reserved for exceptions, OS purposes or the legacy space.

The managed mechanism guarantees at allocation time that the interrupt
will have a reserved vector on all CPUs in the mask. That ensures that
on CPU hotplug the interrupt can be migrated over to a still online
CPU in the mask. If the last CPU in the mask goes offline, the
interrupt is shut down.
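
For illustration, a minimal sketch of that two-step scheme written
against the generic irq_matrix API from kernel/irq/matrix.c. This is
not the actual x86 or loongarch code: the example_*() functions and
the way the result is recorded are made up, only the irq_matrix_*()
calls are the real API.

#include <linux/cpumask.h>
#include <linux/irq.h>

/* Per-CPU vector bookkeeping, set up once at boot via irq_alloc_matrix() */
static struct irq_matrix *vector_matrix;

/* Allocation time: only reserve, do not consume a concrete vector yet */
static int example_reserve_managed(struct irq_data *irqd)
{
	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);

	/*
	 * Set a vector aside on every CPU in the mask. If that cannot
	 * be satisfied this fails with -ENOSPC, so the allocation
	 * fails early instead of surprising the driver at activation
	 * time.
	 */
	return irq_matrix_reserve_managed(vector_matrix, affmsk);
}

/* Activation time: pick a concrete vector from the reserved space */
static int example_activate_managed(struct irq_data *irqd)
{
	const struct cpumask *affmsk = irq_data_get_affinity_mask(irqd);
	unsigned int cpu;
	int vector;

	vector = irq_matrix_alloc_managed(vector_matrix, affmsk, &cpu);
	if (vector < 0)
		return vector;

	/* Record cpu/vector in the per-interrupt chip data here ... */
	return 0;
}

The reserved slots on the other CPUs in the mask are exactly what
makes the hotplug migration guaranteed to succeed.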
You might not yet have run into vector exhaustion, but once the number
of CPUs (and with it the number of queues and interrupts which drivers
allocate) gets big enough, that is guaranteed to happen.

That's why x86 also uses the concept of reserved (not guaranteed)
regular vectors for non-managed interrupts. x86 uses the spurious
vector as the placeholder for those.

That's important because there are enough device drivers out there
which allocate a gazillion interrupts at probe time, but only request
a few of them. If you allocated real vectors for all of them right
upfront, then you would exhaust the vector space quickly for no
reason. Only when an interrupt is actually requested is a usable
vector allocated - or the allocation fails and request_irq() fails.
That's better than exhausting the vector space for nothing.

The complexity of the x86 allocation/activate/set_affinity mechanisms
is there for a reason and not just because we did not have anything
better to do. :)
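
As a rough sketch (again not the actual x86 code; everything prefixed
with example_ or EXAMPLE_ is invented for illustration, only the
irq_matrix_*() calls are the real API), that reservation scheme boils
down to:

#include <linux/cpumask.h>
#include <linux/irq.h>

static struct irq_matrix *vector_matrix;

/* Invented helpers standing in for arch-specific chip programming */
#define EXAMPLE_SPURIOUS_VECTOR	0xff
static void example_program_chip(struct irq_data *irqd, unsigned int vector);
static void example_program_chip_cpu(struct irq_data *irqd, unsigned int cpu,
				     unsigned int vector);

/*
 * Driver probe / allocation time: do not consume a real vector. Just
 * bump the reservation count and point the hardware at the spurious
 * vector, so a stray interrupt is harmless.
 */
static int example_alloc(struct irq_data *irqd)
{
	irq_matrix_reserve(vector_matrix);
	example_program_chip(irqd, EXAMPLE_SPURIOUS_VECTOR);
	return 0;
}

/*
 * request_irq() / activation time: now a usable vector is needed.
 * reserved=true tells the matrix allocator that this consumes one of
 * the previously counted reservations. If the vector space is
 * exhausted, this fails and so does request_irq() - instead of the
 * probe-time allocations eating vectors for interrupts which are
 * never requested.
 */
static int example_activate(struct irq_data *irqd)
{
	unsigned int cpu;
	int vector;

	vector = irq_matrix_alloc(vector_matrix, cpu_online_mask, true, &cpu);
	if (vector < 0)
		return vector;

	example_program_chip_cpu(irqd, cpu, vector);
	return 0;
}

Thanks,

        tglx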