Jiang Liu,

On Thu, Dec 17, 2015 at 07:40:33PM -0800, Jeremiah Mahler wrote:
> all,
>
> I just started getting these "No irq handler for vector" messages
> after upgrading to linux-next 20151217+.
>
>
> (from the first boot)
> ...
> [ 2.282652] [drm] Initialized drm 1.1.0 20060810
> [ 2.318806] AVX version of gcm_enc/dec engaged.
> [ 2.318810] AES CTR mode by8 optimization enabled
> [ 2.324446] do_IRQ: 0.35 No irq handler for vector
> [ 2.366146] iTCO_vendor_support: vendor-support=0
> [ 2.372762] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
> ...
> [ 9.249887] wlan0: associate with 2c:5d:93:09:50:48 (try 1/3)
> [ 9.265206] wlan0: RX AssocResp from 2c:5d:93:09:50:48 (capab=0x421 status=0 aid=8)
> [ 9.284088] wlan0: associated
> [ 10.453048] do_IRQ: 0.35 No irq handler for vector
> [ 10.457923] do_IRQ: 0.35 No irq handler for vector
> [ 10.457932] do_IRQ: 0.35 No irq handler for vector
> [ 10.501026] do_IRQ: 0.35 No irq handler for vector
> [ 10.501033] do_IRQ: 0.35 No irq handler for vector
> [ 10.513951] do_IRQ: 0.35 No irq handler for vector
> ...
>
>
> (second boot, and after a resume)
> ...
> [10527.998694] PM: noirq resume of devices complete after 21.488 msecs
> [10527.999578] PM: early resume of devices complete after 0.850 msecs
> [10528.000525] rtc_cmos 00:02: System wakeup disabled by ACPI
> [10528.005265] do_IRQ: 0.84 No irq handler for vector
> [10528.005450] sd 0:0:0:0: [sda] Starting disk
> [10528.021257] tpm_tis 00:05: TPM is disabled/deactivated (0x6)
> ...
> [10530.005541] PM: resume of devices complete after 2005.925 msecs
> [10530.005690] usb 3-1.4:1.0: rebind failed: -517
> [10530.005696] usb 3-1.4:1.1: rebind failed: -517
> [10530.006575] Restarting tasks ...
> [10530.008347] do_IRQ: 0.84 No irq handler for vector
> [10530.021258] done.
> [10530.042883] Bluetooth: hci0: BCM: chip id 63
> ...
> [10559.005603] mei_me 0000:00:16.0: timer: init clients timeout hbm_state = 1.
> [10559.005612] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000245 60000106
> [10559.009508] do_IRQ: 0.84 No irq handler for vector
> [10561.005639] mei_me 0000:00:16.0: wait hw ready failed
> [10561.005644] mei_me 0000:00:16.0: hw_start failed ret = -62
> ...
>
>
> I can test patches if anyone has any ideas :-)
>
> --
> - Jeremiah Mahler

I performed a bisect and found that the following patch introduced the
bug, which is still present in the latest linux-next 20151218+.

From 41c7518a5d14543fa4aa1b5b9994ac26b38c0406 Mon Sep 17 00:00:00 2001
From: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
Date: Mon, 30 Nov 2015 16:09:29 +0800
Subject: [PATCH] x86/irq: Fix a race condition between vector assigning and cleanup

Joe Lawrence reported an use after release issue related to x86 IRQ
management code. Please refer to the following link for more
information:
http://lkml.kernel.org/r/5653B688.4050809@xxxxxxxxxxx

Thomas pointed out that it's caused by a race condition between
__assign_irq_vector() and __send_cleanup_vector(). Based on Thomas'
draft patch, we solve this race condition by:
1) Use move_in_progress to signal that an IRQ cleanup IPI is needed
2) Use old_domain to save old CPU mask for IRQ cleanup
3) Use vector to protect move_in_progress and old_domain

This bugfix patch also helps to get rid of that atomic allocation in
__send_cleanup_vector().

Fixes: a782a7e46bb5 "x86/irq: Store irq descriptor in vector array"
Reported-and-tested-by: Joe Lawrence <joe.lawrence@xxxxxxxxxxx>
Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/1448870970-1461-4-git-send-email-jiang.liu@xxxxxxxxxxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
 arch/x86/kernel/apic/vector.c | 77 +++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 43 deletions(-)

...
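
In case it helps anyone compare kernels while testing, here is a minimal
Python sketch for tallying the "No irq handler for vector" messages in a
saved dmesg log. It assumes the two numbers in "do_IRQ: 0.35" are the CPU
and the vector, and the function name count_unhandled_vectors is only an
illustrative choice, not part of any existing tool.

```python
import re
from collections import Counter

# Matches lines such as "[ 10.453048] do_IRQ: 0.35 No irq handler for vector".
# Assumption: the "0.35" field is "<cpu>.<vector>".
PATTERN = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def count_unhandled_vectors(log_text):
    """Return a Counter keyed by (cpu, vector), one count per matching message."""
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        cpu = int(match.group(1))
        vector = int(match.group(2))
        counts[(cpu, vector)] += 1
    return counts
```

Feeding it the text of a dmesg dump from a good and a bad kernel should
make it easy to see whether a given vector stops showing up.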
--
- Jeremiah Mahler