--- On Sat, 8/28/10, Grant Grundler <grundler@xxxxxxxxxxxxxxxx> wrote: > From: Grant Grundler <grundler@xxxxxxxxxxxxxxxx> > Subject: Re: Linux mask_msi_irq() question > To: "Kanoj Sarcar" <kanojsarcar@xxxxxxxxx> > Cc: "Grant Grundler" <grundler@xxxxxxxxxxxxxxxx>, "Jesse Barnes" <jbarnes@xxxxxxxxxxxxxxxx>, linux-pci@xxxxxxxxxxxxxxx > Date: Saturday, August 28, 2010, 11:25 PM > On Fri, Aug 27, 2010 at 02:03:05AM > -0700, Kanoj Sarcar wrote: > .... > > > No. I agree that violates the portions of the > PCIE spec > > > already quoted in this thread. > > > > Just to be careful, I quoted what I think are relevant > parts > > of the spec, but also expressed my confusion regarding > whether > > those parts require an interrupt barrier to be > provided by > > devices or not. I don't have a conviction either way > on this; > > practically, I think there will be devices that do > provide > > barriers, and some that don't. > > Ah...I think I understand now. The device is require to > treat > the MMIO write as a barrier once it sees that write. The > "flush" > MMIO read is the mechanism used by the host to guarantee > the device > has in fact seen the MMIO write that masks the MSI. I'm not > aware > of any other operation which guarantees the device saw the > MMIO write. > > If we could prove the kernel can both tolerate and recover > from some > "spurious" MSI transactions, then it's probably safe to > remove > the "flush" MMIO read. That's really NOT an easy thing to > prove. If we can _not_ prove the kernel can withstand spurious interrupts, it is liable to crash/nmi with at least some (illbehaved?) devices. To prove that kernel can indeed tolerate spurious interrupts, I think all the platform maintainers need to agree, correct? Plus look at implications on interrupt migration etc. Kanoj > > hth, > grant > > > > > > > > > > > > > > ... > > > > In general, I agree. I am under the > assumption though > > > that > > > > Linux handles misbehaving devices that > generate > > > spurious > > > > interrupts by turning them off after a > while. > > > > > > MSI is the equivalent to "Edge Triggered". > Another > > > interrupt > > > should not be sent until the previous one is > ACK'd. > > > IIRC, the "spurious" interrupt code waits until > 100K > > > interrupts > > > have been sent (or something in that order of > magnitude). > > > > > > > The point I was trying to make is that there is code > to > > protect platform/kernel against errant devices, so > that > > crash, NMI's etc do not happen. > > > > > > > > > So, I think > > > > each port is taking care of handling such > interrupts > > > by > > > > potentially maintaining state whether the > vector > > > should > > > > trigger or not, which is checked before > invoking > > > driver isr. > > > > > > Ok - the generic interrupt code is providing > that. That > > > is exactly what I meant needs to be handled. > > > > Agreed. > > > > > > > > I was thinking the context is some random driver > is trying > > > to mask a device's MSI for some reason. > > > > > > > > > > > > > > > > > I suspect any code that mucks with > interrupt > > > state would > > > > > also depend > > > > > on the interrupt not triggering at that > moment. > > > > > > > > > > > > > At least from the part that I understand, I > agree to > > > the above. > > > > When x86 receives an msix, it masks the > entry and > > > reads the > > > > mask back on the interrupt cpu. The > interrupt is not > > > > triggering at this point. Which begs the > question: why > > > readback > > > > at this point? > > > > > > I suspect to limit the window a "spurious" > interrupt would > > > get sent by the device. If we are protecting from > that > > > case > > > anyway, I think you are probably right that it's > ok to > > > drop the flush. > > > > Ok. > > > > > > > > > > > > The part that I don't understand is how > intercpu > > > synchronization > > > > is achieved in irq rebalancing case, device > deinit > > > case, etc. > > > > > > It's been 5+ years since I've look at the IRQ > > > "rebalancing" > > > (interrupt migration) sequence. I don't recall > enough to be > > > useful. > > > > > > Device devinit should generate NO interrupts - > that's a bug > > > in > > > the device driver if the device is generating > interrupts > > > before > > > the IRQ or MSI is initialized and can handle > them. > > > > No, what I meant is that if driver is unloading, > driver does > > free_irq(), but a laggard device interrupt still > creeps in > > after free_irq() is complete. > > > > > > > > Device driver module unload is the other time > I've seen > > > ugly > > > race conditions (between interrupts and DMA > unmapping > > > mostly). > > > > > > > Does Linux currently take a strict approach > to > > > barriers in > > > > these cases, or is it a loose approach where > a laggard > > > instance > > > > of the interrupt can still creep to an > unintended cpu? > > > In the > > > > loose approach case, the readback does not > make much > > > sense to > > > > me. > > > > > > I don't know offhand. In general, I believe > stricter > > > approaches help > > > keep the systems more robust in the face of > other > > > issues. Race conditions > > > around interrupt handling are pretty hard to > debug. > > > > No arguments; the point here is whatever Linux does, > there > > are probably devices which will send out interrupts > when > > they shouldn't. Maybe .001% of the time under > specific > > loads. > > > > > Removing this > > > MMIO read needs more careful analysis than I'm > willing to > > > put into it. > > > Both to justify the effort (some measurable > performance > > > benefit) and > > > "prove" it will work under all conditions. > > > > I think removing a (unrequired?) pio read at every > interrupt > > should show measurable improvements (even with intr > > coalescing/moderation). You are right though, its a > significant > > effort, across various platform arches and device > drivers. > > > > I began the thread with a pointer to the patch that > introduced > > the pio read 3 years back; I was hoping the relevant > folks > > that pushed the patch would contribute to this > discussion. > > Unfortunately, for something this subtle, its much > harder to > > rationalize eliminating it after this long. > > > > Kanoj > > > > > > > > hth, > > > grant > > > > > > > > > > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html