On Sat, Mar 09, 2013 at 03:20:57PM -0700, Myron Stowe wrote:
> On Sat, Mar 9, 2013 at 1:49 PM, Neil Horman <nhorman@xxxxxxxxxxxxx> wrote:
> > On Mon, Mar 04, 2013 at 02:04:19PM -0500, Neil Horman wrote:
> >> A few years back Intel published a spec update:
> >> http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf
> >>
> >> for the 5520 and 5500 chipsets, which contained an erratum (specifically erratum
> >> 53) noting that these chipsets can't properly do interrupt remapping, and
> >> as a result they recommend that interrupt remapping be disabled in the BIOS.  While
> >> many vendors have a BIOS update to do exactly that, not all do, and of course
> >> not all users update their BIOS to a level that corrects the problem.  As a
> >> result, occasionally interrupts can arrive at a cpu even after affinity for that
> >> interrupt has been moved, leading to lost or spurious interrupts, usually
> >> characterized by the message:
> >> kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)
> >>
> >> There have been several incidents recently of people seeing this error, and
> >> investigation has shown that they have systems whose BIOS level is such
> >> that this feature was not properly turned off.  As such, it would be good to
> >> give them a reminder that their systems are vulnerable to this problem.
> >>
> >> Signed-off-by: Neil Horman <nhorman@xxxxxxxxxxxxx>
> >> CC: Prarit Bhargava <prarit@xxxxxxxxxx>
> >> CC: Don Zickus <dzickus@xxxxxxxxxx>
> >> CC: Don Dutile <ddutile@xxxxxxxxxx>
> >> CC: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> >> CC: Asit Mallick <asit.k.mallick@xxxxxxxxx>
> >> CC: linux-pci@xxxxxxxxxxxxxxx
> >>
> > Ping, anyone want to Ack/Nack this?
>
> Don's comment earlier seems to imply that this is a short term fix and
> that a more long term fix may be coming soon.  If that is the case
> wouldn't we want to wait for the long term fix and just pull that in?
>
> Myron
>
As Don and Prarit have mentioned, an alternate change is being worked on and
tested that may work around this issue, but we're not yet sure that it will, and
we're not sure of the time frame for this fix.  Normally I would agree that it
would be easier just to wait for the long term fix, but as Prarit noted, since
this hardware is in fact broken, I would rather do both.  It's fine with me if
this gets reverted tomorrow in favor of a longer term fix; it has just caused
enough problems already that I'd like to see it in place until the better
solution arrives.

Neil