> On Feb 12, 2020, at 6:19 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Qian,
>
> Qian Cai <cai@xxxxxx> writes:
>
>> The linux-next commit 6f1a4891a592 ("x86/apic/msi: Plug non-maskable
>> MSI affinity race") introduced a bug which is always triggered during
>> CPU hotplug on this server. See the trace and line numbers below.
>
> Thanks for the report.
>
>> WARNING: CPU: 0 PID: 2794 at arch/x86/kernel/apic/msi.c:103 msi_set_affinity+0x278/0x330
>> CPU: 0 PID: 2794 Comm: irqbalance Tainted: G L 5.6.0-rc1-next-20200211 #1
>> irq_do_set_affinity at kernel/irq/manage.c:259
>> irq_setup_affinity at kernel/irq/manage.c:474
>> irq_select_affinity_usr at kernel/irq/manage.c:496
>> write_irq_affinity.isra.0+0x137/0x1e0
>> irq_affinity_proc_write+0x19/0x20
> ...
>
> I'm glad I added this WARN_ON(). This unearthed another long-standing
> bug. If user space writes a bogus affinity mask, i.e. one with no online
> CPUs, then it calls irq_select_affinity_usr().
>
> This was introduced for ALPHA in
>
>   eee45269b0f5 ("[PATCH] Alpha: convert to generic irq framework (generic part)")
>
> and subsequently made available for all architectures in
>
>   18404756765c ("genirq: Expose default irq affinity mask (take 3)")
>
> which already introduced the circumvention of the affinity setting
> restrictions for interrupts which cannot be moved in process context.
>
> The whole exercise is bogus in various aspects:
>
>  1) If the interrupt is already started up, then there is absolutely
>     no point in honouring a bogus interrupt affinity setting from user
>     space. The interrupt is already assigned to an online CPU and it
>     does not make any sense to reassign it to some other randomly
>     chosen online CPU.
>
>  2) If the interrupt is not yet started up, then there is no point
>     either. A subsequent startup of the interrupt will invoke
>     irq_setup_affinity() anyway, which will choose a valid target CPU.
>
> So the right thing to do is to just return -EINVAL in case user space
> wrote an affinity mask which does not contain any online CPUs, except for
> ALPHA which has its own magic sauce for this.
>
> Can you please test the patch below?

Yes, it works fine.
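For reference, a minimal user space sketch of the code path in question (the IRQ number and the mask below are placeholders and have to be adjusted so the mask contains no online CPU). With the patch applied, the write now fails with EINVAL instead of being silently replaced by an automatically chosen CPU:

/* Sketch only: IRQ number and mask value are placeholders. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* Pick a mask that covers no online CPU, e.g. "80" if CPU 7 is offline. */
	const char *path = "/proc/irq/30/smp_affinity";
	const char *mask = "80";
	int fd = open(path, O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, mask, strlen(mask)) < 0)
		printf("write failed: %s\n", strerror(errno)); /* expect EINVAL now */
	else
		printf("write succeeded\n");
	close(fd);
	return 0;
}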
>
> Thanks,
>
>         tglx
>
> 8<---------------
> diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
> index 3924fbe829d4..c9d8eb7f5c02 100644
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -128,8 +128,6 @@ static inline void unregister_handler_proc(unsigned int irq,
>  
>  extern bool irq_can_set_affinity_usr(unsigned int irq);
>  
> -extern int irq_select_affinity_usr(unsigned int irq);
> -
>  extern void irq_set_thread_affinity(struct irq_desc *desc);
>  
>  extern int irq_do_set_affinity(struct irq_data *data,
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 3089a60ea8f9..7eee98c38f25 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -481,23 +481,9 @@ int irq_setup_affinity(struct irq_desc *desc)
>  {
>  	return irq_select_affinity(irq_desc_get_irq(desc));
>  }
> -#endif
> +#endif /* CONFIG_AUTO_IRQ_AFFINITY */
> +#endif /* CONFIG_SMP */
>  
> -/*
> - * Called when a bogus affinity is set via /proc/irq
> - */
> -int irq_select_affinity_usr(unsigned int irq)
> -{
> -	struct irq_desc *desc = irq_to_desc(irq);
> -	unsigned long flags;
> -	int ret;
> -
> -	raw_spin_lock_irqsave(&desc->lock, flags);
> -	ret = irq_setup_affinity(desc);
> -	raw_spin_unlock_irqrestore(&desc->lock, flags);
> -	return ret;
> -}
> -#endif
>  
>  /**
>   * irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
> index 9e5783d98033..af488b037808 100644
> --- a/kernel/irq/proc.c
> +++ b/kernel/irq/proc.c
> @@ -111,6 +111,28 @@ static int irq_affinity_list_proc_show(struct seq_file *m, void *v)
>  	return show_irq_affinity(AFFINITY_LIST, m);
>  }
>  
> +#ifndef CONFIG_AUTO_IRQ_AFFINITY
> +static inline int irq_select_affinity_usr(unsigned int irq)
> +{
> +	/*
> +	 * If the interrupt is started up already then this fails. The
> +	 * interrupt is assigned to an online CPU already. There is no
> +	 * point to move it around randomly. Tell user space that the
> +	 * selected mask is bogus.
> +	 *
> +	 * If not then any change to the affinity is pointless because the
> +	 * startup code invokes irq_setup_affinity() which will select
> +	 * an online CPU anyway.
> +	 */
> +	return -EINVAL;
> +}
> +#else
> +/* ALPHA magic affinity auto selector. Keep it for historical reasons. */
> +static inline int irq_select_affinity_usr(unsigned int irq)
> +{
> +	return irq_select_affinity(irq);
> +}
> +#endif
>  
>  static ssize_t write_irq_affinity(int type, struct file *file,
>  				  const char __user *buffer, size_t count, loff_t *pos)
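For context, the call site of irq_select_affinity_usr() is write_irq_affinity() in kernel/irq/proc.c, which only reaches the helper when the parsed mask does not intersect cpu_online_mask. Roughly along these lines (abbreviated sketch, not the exact mainline code):

	if (!cpumask_intersects(new_value, cpu_online_mask)) {
		/*
		 * No online CPU in the requested mask. With the change above
		 * this now fails the write with -EINVAL, except on ALPHA
		 * where the auto selector still picks a target.
		 */
		err = irq_select_affinity_usr(irq) ? -EINVAL : count;
	} else {
		err = irq_set_affinity(irq, new_value);
		if (!err)
			err = count;
	}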