Evan,

Evan Green <evgreen@xxxxxxxxxxxx> writes:
> On Tue, Jan 28, 2020 at 6:38 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> The patch is only lightly tested, but so far it survived.
>>
>
> Hi Thomas,
> Thanks for the patch, I gave it a try. I get the following splat, then a hang:
>
> [ 62.238406] CPU0
> [ 62.241135] ----
> [ 62.243863] lock(vector_lock);
> [ 62.247467] lock(vector_lock);
> [ 62.251071]
> [ 62.251071] *** DEADLOCK ***
> [ 62.251071]
> [ 62.257687] May be due to missing lock nesting notation
> [ 62.257687]
> [ 62.265274] 2 locks held by migration/1/17:
> [ 62.269946] #0: 00000000cfa9d8c3 (&irq_desc_lock_class){-.-.}, at:
> irq_migrate_all_off_this_cpu+0x44/0x28f
> [ 62.280846] #1: 000000006885da2d (vector_lock){-.-.}, at:
> msi_set_affinity+0x13c/0x27b
> [ 62.289801]
> [ 62.289801] stack backtrace:
> [ 62.294669] CPU: 1 PID: 17 Comm: migration/1 Not tainted 4.19.96 #2
> [ 62.310713] Call Trace:
> [ 62.313446] dump_stack+0xac/0x11e
> [ 62.317255] __lock_acquire+0x64f/0x19bc
> [ 62.321646] ? find_held_lock+0x3d/0xb8
> [ 62.325936] ? pci_conf1_write+0x4f/0xdf
> [ 62.330320] lock_acquire+0x1b2/0x1fa
> [ 62.334413] ? apic_retrigger_irq+0x31/0x63
> [ 62.339097] _raw_spin_lock_irqsave+0x51/0x7d
> [ 62.343972] ? apic_retrigger_irq+0x31/0x63
> [ 62.348646] apic_retrigger_irq+0x31/0x63
> [ 62.353124] msi_set_affinity+0x25a/0x27b

Bah. I'm sure I looked at that call chain, noticed the double vector lock
and then forgot. Delta patch below.

Thanks,

        tglx

8<--------------
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -64,6 +64,7 @@ msi_set_affinity(struct irq_data *irqd,
 	struct irq_cfg old_cfg, *cfg = irqd_cfg(irqd);
 	struct irq_data *parent = irqd->parent_data;
 	unsigned int cpu;
+	bool pending;
 	int ret;
 
 	/* Save the current configuration */
@@ -147,9 +148,13 @@ msi_set_affinity(struct irq_data *irqd,
 	 * vector/CPU. Check whether the transition raced with a device
 	 * interrupt and is pending in the local APICs IRR.
 	 */
-	if (lapic_vector_set_in_irr(cfg->vector))
-		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+	pending = lapic_vector_set_in_irr(cfg->vector);
+	unlock_vector_lock();
+
+	if (pending)
+		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+
 	return ret;
 }
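
For anyone following along, the locking pattern behind the splat and the delta can be sketched in userspace. This is only an illustration, not kernel code: every demo_* name is hypothetical, and a POSIX error-checking mutex stands in for vector_lock so that the nested acquisition returns EDEADLK instead of hanging the way the real spinlock does.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these exist in the kernel. */
static pthread_mutex_t demo_lock;	/* plays the role of vector_lock */
static bool demo_pending;		/* plays the role of the IRR bit */
static int demo_retriggers;		/* counts successful retriggers */

/* Set up an error-checking mutex so a relock by the owner is reported. */
static void demo_init(bool pending)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&demo_lock, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	demo_pending = pending;
	demo_retriggers = 0;
}

/* Like the retrigger path in the trace: takes the lock on its own. */
static int demo_retrigger(void)
{
	int rc = pthread_mutex_lock(&demo_lock);

	if (rc)
		return rc;	/* EDEADLK if the caller still holds the lock */
	demo_retriggers++;
	pthread_mutex_unlock(&demo_lock);
	return 0;
}

/* Pattern from the splat: retrigger is called with the lock held. */
static int demo_set_affinity_nested(void)
{
	int rc = 0;

	pthread_mutex_lock(&demo_lock);
	if (demo_pending)
		rc = demo_retrigger();
	pthread_mutex_unlock(&demo_lock);
	return rc;
}

/* Pattern from the delta: sample under the lock, drop it, then retrigger. */
static int demo_set_affinity_split(void)
{
	bool pending;

	pthread_mutex_lock(&demo_lock);
	pending = demo_pending;
	pthread_mutex_unlock(&demo_lock);

	return pending ? demo_retrigger() : 0;
}
```

Calling demo_init(true) and then demo_set_affinity_nested() reports EDEADLK without retriggering, while demo_set_affinity_split() succeeds. The split variant works because the pending state is captured in a local before the lock is released, which is what the new "pending" variable in the delta does.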