On Tue, Jun 25, 2024 at 10:57:42AM +0200, Daniel Wagner wrote:
> On Tue, Jun 25, 2024 at 09:07:30AM GMT, Thomas Gleixner wrote:
> > On Tue, Jun 25 2024 at 08:37, Hannes Reinecke wrote:
> > > On 6/24/24 11:00, Daniel Wagner wrote:
> > >> On Mon, Jun 24, 2024 at 10:47:05AM GMT, Christoph Hellwig wrote:
> > >>>> Do you think we should introduce a new type or just use the
> > >>>> existing managed_irq for this?
> > >>>
> > >>> No idea really. What was the reason for adding a new one?
> > >>
> > >> I've added the new type so that the current behavior of spreading
> > >> the queues over to the isolated CPUs is still possible. I don't
> > >> know if this is a valid use case or not. I just didn't want to
> > >> kill this feature without having discussed it first.
> > >>
> > >> But if we agree this doesn't really make sense with isolcpus, then
> > >> I think we should use the managed_irq one, as nvme-pci is using
> > >> the managed IRQ API.
> > >>
> > > I'm in favour of expanding/modifying the managed irq case.
> > > For managed irqs the driver will be running on the housekeeping
> > > CPUs only, and has no way of even installing irq handlers for the
> > > isolcpus.
> >
> > Yes, that's preferred, but please double check with the people who
> > introduced that in the first place.
>
> The relevant code was added by Ming:
>
> 11ea68f553e2 ("genirq, sched/isolation: Isolate from handling managed
> interrupts")
>
> [...] it can happen that a managed interrupt whose affinity mask
> contains both isolated and housekeeping CPUs is routed to an isolated
> CPU. As a consequence IO submitted on a housekeeping CPU causes
> interrupts on the isolated CPU.
>
> Add a new sub-parameter 'managed_irq' for 'isolcpus' and the
> corresponding logic in the interrupt affinity selection code.
>
> The subparameter indicates to the interrupt affinity selection logic
> that it should try to avoid the above scenario.
> [...]
>
> From the commit message I read that the original intent is that
> managed_irq should avoid spreading queues onto isolated CPUs.
>
> Ming, do you agree to use the managed_irq mask to limit the queue
> spreading on isolated CPUs? It would make the io_queue option obsolete.

Yes, managed_irq was introduced to avoid spreading queues onto isolated
CPUs, and it is supposed to work well.

The only problem with managed_irq is that isolated CPUs are still
included when the queues are spread; they are only excluded from the
irq effective affinity masks.

Thanks,
Ming
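
For illustration, a minimal sketch of the driver-side idea under
discussion, assuming the existing housekeeping API (housekeeping_enabled()
and housekeeping_cpumask() with HK_TYPE_MANAGED_IRQ); the helper name is
hypothetical and this is not actual nvme-pci code:

/*
 * Hypothetical sketch, not actual nvme-pci code: cap the number of I/O
 * queues at the number of housekeeping CPUs when isolcpus=managed_irq
 * is set, so queues are not spread onto isolated CPUs.
 */
#include <linux/cpumask.h>
#include <linux/minmax.h>
#include <linux/sched/isolation.h>

static unsigned int example_limit_io_queues(unsigned int nr_requested)
{
	const struct cpumask *hk_mask;

	/* Without isolcpus=managed_irq every CPU is a housekeeping CPU. */
	if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
		return nr_requested;

	hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);

	/* Spread queues over housekeeping CPUs only. */
	return min(nr_requested, cpumask_weight(hk_mask));
}

With, say, isolcpus=managed_irq,2-7 on an 8-CPU machine, such a helper
would cap the I/O queue count at two (CPUs 0-1), so no queue would have
to be mapped exclusively to isolated CPUs.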