On Fri, Aug 27, 2010 at 02:03:05AM -0700, Kanoj Sarcar wrote:
....
> > No. I agree that violates the portions of the PCIE spec
> > already quoted in this thread.
>
> Just to be careful, I quoted what I think are relevant parts
> of the spec, but also expressed my confusion regarding whether
> those parts require an interrupt barrier to be provided by
> devices or not. I don't have a conviction either way on this;
> practically, I think there will be devices that do provide
> barriers, and some that don't.

Ah...I think I understand now. The device is required to treat the
MMIO write as a barrier once it sees that write. The "flush" MMIO
read is the mechanism used by the host to guarantee the device has
in fact seen the MMIO write that masks the MSI. I'm not aware of any
other operation which guarantees the device saw the MMIO write.

If we could prove the kernel can both tolerate and recover from
some "spurious" MSI transactions, then it's probably safe to
remove the "flush" MMIO read. That's really NOT an easy thing to
prove.

hth,
grant

> > ...
> > > In general, I agree. I am under the assumption though that
> > > Linux handles misbehaving devices that generate spurious
> > > interrupts by turning them off after a while.
> >
> > MSI is the equivalent to "Edge Triggered". Another interrupt
> > should not be sent until the previous one is ACK'd.
> > IIRC, the "spurious" interrupt code waits until 100K interrupts
> > have been sent (or something in that order of magnitude).
>
> The point I was trying to make is that there is code to
> protect platform/kernel against errant devices, so that
> crash, NMI's etc do not happen.
>
> > > So, I think
> > > each port is taking care of handling such interrupts by
> > > potentially maintaining state whether the vector should
> > > trigger or not, which is checked before invoking driver isr.
> >
> > Ok - the generic interrupt code is providing that.
> > That is exactly what I meant needs to be handled.
>
> Agreed.
>
> > I was thinking the context is some random driver is trying
> > to mask a device's MSI for some reason.
> >
> > > > I suspect any code that mucks with interrupt state would
> > > > also depend on the interrupt not triggering at that moment.
> > >
> > > At least from the part that I understand, I agree to the above.
> > > When x86 receives an msix, it masks the entry and reads the
> > > mask back on the interrupt cpu. The interrupt is not
> > > triggering at this point. Which begs the question: why readback
> > > at this point?
> >
> > I suspect to limit the window a "spurious" interrupt would
> > get sent by the device. If we are protecting from that case
> > anyway, I think you are probably right that it's ok to
> > drop the flush.
>
> Ok.
>
> > > The part that I don't understand is how intercpu synchronization
> > > is achieved in irq rebalancing case, device deinit case, etc.
> >
> > It's been 5+ years since I've looked at the IRQ "rebalancing"
> > (interrupt migration) sequence. I don't recall enough to be
> > useful.
> >
> > Device devinit should generate NO interrupts - that's a bug in
> > the device driver if the device is generating interrupts before
> > the IRQ or MSI is initialized and can handle them.
>
> No, what I meant is that if driver is unloading, driver does
> free_irq(), but a laggard device interrupt still creeps in
> after free_irq() is complete.
>
> > Device driver module unload is the other time I've seen ugly
> > race conditions (between interrupts and DMA unmapping mostly).
> >
> > > Does Linux currently take a strict approach to barriers in
> > > these cases, or is it a loose approach where a laggard instance
> > > of the interrupt can still creep to an unintended cpu? In the
> > > loose approach case, the readback does not make much sense to
> > > me.
> >
> > I don't know offhand. In general, I believe stricter approaches
> > help keep the systems more robust in the face of other issues.
> > Race conditions around interrupt handling are pretty hard to debug.
>
> No arguments; the point here is whatever Linux does, there
> are probably devices which will send out interrupts when
> they shouldn't. Maybe .001% of the time under specific
> loads.
>
> > Removing this
> > MMIO read needs more careful analysis than I'm willing to
> > put into it. Both to justify the effort (some measurable
> > performance benefit) and "prove" it will work under all
> > conditions.
>
> I think removing a (unrequired?) pio read at every interrupt
> should show measurable improvements (even with intr
> coalescing/moderation). You are right though, it's a significant
> effort, across various platform arches and device drivers.
>
> I began the thread with a pointer to the patch that introduced
> the pio read 3 years back; I was hoping the relevant folks
> that pushed the patch would contribute to this discussion.
> Unfortunately, for something this subtle, it's much harder to
> rationalize eliminating it after this long.
>
> Kanoj
>
> > hth,
> > grant

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
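
For anyone following along, the mask-then-flush sequence discussed in
this thread can be sketched as a small user-space C model. This is not
kernel code. Every name in it (struct fake_dev, mmio_write_mask(),
mmio_read_mask(), the bit-0 mask layout) is a hypothetical stand-in
invented for the illustration. It also assumes, for simplicity, that a
posted write stays in flight until a read pushes it through; on real
hardware the write would arrive on its own after some delay the host
cannot observe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device behind a posted-write path. */
struct fake_dev {
    uint32_t mask_reg;       /* mask value the device has actually seen */
    uint32_t posted_val;     /* write still in flight toward the device */
    bool     posted_pending; /* true while that write has not landed */
};

/* Posted MMIO write: returns to the host before the device sees it. */
void mmio_write_mask(struct fake_dev *d, uint32_t val)
{
    d->posted_val = val;
    d->posted_pending = true;
}

/*
 * MMIO read: a read may not pass earlier posted writes, so any pending
 * write lands at the device before the read value is returned.
 */
uint32_t mmio_read_mask(struct fake_dev *d)
{
    if (d->posted_pending) {
        d->mask_reg = d->posted_val;
        d->posted_pending = false;
    }
    return d->mask_reg;
}

/* Device side: bit 0 set means "masked", so no MSI may be sent. */
bool dev_may_send_msi(const struct fake_dev *d)
{
    return (d->mask_reg & 1u) == 0;
}

/* Mask only: the host cannot tell whether the device has seen it. */
void mask_msi_noflush(struct fake_dev *d)
{
    mmio_write_mask(d, 1u);
}

/* Mask plus "flush" read: the device has seen the mask on return. */
void mask_msi_flush(struct fake_dev *d)
{
    mmio_write_mask(d, 1u);
    (void)mmio_read_mask(d);
}
```

In this model, dev_may_send_msi() is still true right after
mask_msi_noflush() and false right after mask_msi_flush(). That is the
window the readback closes, and the window the kernel would have to
tolerate "spurious" MSIs in if the read were dropped.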