Re: [patch v6 7/7] genirq/affinity: Add support for non-managed affinity sets

Ming Lei <ming.lei@xxxxxxxxxx> · Mon, 18 Feb 2019 10:49:23 +0800

Hi Thomas,

On Sun, Feb 17, 2019 at 08:17:05PM +0100, Thomas Gleixner wrote:
> On Sun, 17 Feb 2019, Ming Lei wrote:
> > On Sat, Feb 16, 2019 at 06:13:13PM +0100, Thomas Gleixner wrote:
> > > Some drivers need an extra set of interrupts which should not be marked
> > > managed, but should get initial interrupt spreading.
> > 
> > Could you share the drivers and their use case?
> 
> You were Cc'ed on that old discussion:
> 
>  https://lkml.kernel.org/r/300d6fef733ca76ced581f8c6304bac6@xxxxxxxxxxxxxx

Thanks for providing the link.

> 
> > > For both interrupt sets the interrupts are properly spread out, but the
> > > second set is not marked managed.
> > 
> > Given drivers only care the managed vs non-managed interrupt numbers,
> > just wondering why this case can't be covered by .pre_vectors &
> > .post_vectors?
> 
> Well, yes, but post/pre are not subject to spreading and I really don't
> want to go there.
> 
> > Also this kind of usage may break blk-mq easily, in which the following
> > rule needs to be respected:
> > 
> > 1) all CPUs are required to spread among each interrupt set
> > 
> > 2) no any CPU is shared between two IRQs in same set.
> 
> I don't see how that would break blk-mq. The unmanaged set is not used by
> the blk-mq stuff, that's some driver internal voodoo. So blk-mq still gets
> a perfectly spread and managed interrupt set for the queues.

From the discussion above, the use case is megaraid_sas, and one of the
two interrupt sets (managed and non-managed) will be chosen at runtime
according to the workload.

Each interrupt set actually defines one blk-mq queue mapping, and that
queue mapping needs to respect the two rules I mentioned above. However,
non-managed affinity can be changed arbitrarily by user space at any
time, so those rules can no longer be guaranteed for the non-managed set.
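
As an illustration only, the two rules can be written down as a small
userspace check. This is a sketch, not kernel code: spread_is_valid() is
a hypothetical name, and representing each vector's affinity as a 64-bit
mask (so at most 64 CPUs) is an assumption made for brevity. Rule 1 is
read here as "every CPU is covered by some vector of the set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API. masks[i] is the affinity of
 * vector i in one interrupt set; bit n set means CPU n is in the mask.
 */
static bool spread_is_valid(const uint64_t *masks, size_t nvecs,
			    unsigned int ncpus)
{
	uint64_t all, seen = 0;
	size_t i;

	/* Toy model limit: the CPU ids must fit into one 64-bit word. */
	assert(ncpus >= 1 && ncpus <= 64);
	all = (ncpus == 64) ? UINT64_MAX : ((UINT64_C(1) << ncpus) - 1);

	for (i = 0; i < nvecs; i++) {
		/* Rule 2: no CPU is shared between two IRQs of the set. */
		if (masks[i] & seen)
			return false;
		seen |= masks[i];
	}

	/* Rule 1: all CPUs are covered (and no bit is out of range). */
	return seen == all;
}
```

With 4 CPUs, the masks { 0x3, 0xC } pass the check. If user space then
rewrites the second mask to 0x6, CPU 1 is shared and CPU 3 is no longer
covered, so the check fails.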

Recently HPSA tried to add a module parameter for using non-managed
IRQs [1].

Also, NVMe RDMA uses non-managed interrupts, and at least one CPU hotplug
issue there has still not been fixed [2].

[1] https://marc.info/?t=154387665200001&r=1&w=2
[2] https://www.spinics.net/lists/linux-block/msg24140.html

thanks,
Ming