--- On Thu, 8/26/10, Grant Grundler <grundler@xxxxxxxxxxxxxxxx> wrote:

> From: Grant Grundler <grundler@xxxxxxxxxxxxxxxx>
> Subject: Re: Linux mask_msi_irq() question
> To: "Kanoj Sarcar" <kanojsarcar@xxxxxxxxx>
> Cc: "Grant Grundler" <grundler@xxxxxxxxxxxxxxxx>, "Jesse Barnes" <jbarnes@xxxxxxxxxxxxxxxx>, linux-pci@xxxxxxxxxxxxxxx
> Date: Thursday, August 26, 2010, 10:20 PM
>
> On Wed, Aug 25, 2010 at 12:40:00AM -0700, Kanoj Sarcar wrote:
> ...
> > > Yes, it will make sure one can mask MSI and not drop the
> > > MSI signal.
> > > I don't believe it provides any sort of barrier. In-flight
> > > MSIs still need to be dealt with on the host side.
> >
> > Agreed about in-flight MSIs: chipset/platform-specific code
> > needs to synchronize across multiple CPUs if there is a
> > strict guarantee requirement.
> >
> > But is it acceptable to have the device send out an MSI-X
> > after having responded back to the host's entry mask read?
>
> No. I agree that violates the portions of the PCIe spec
> already quoted in this thread.

Just to be careful: I quoted what I think are the relevant parts of the spec, but I also expressed my confusion about whether those parts require devices to provide an interrupt barrier or not. I don't have a conviction either way; practically, I think there will be devices that do provide barriers and some that don't.

> > ...
> > In general, I agree. I am under the assumption, though, that
> > Linux handles misbehaving devices that generate spurious
> > interrupts by turning them off after a while.
>
> MSI is the equivalent of "edge triggered". Another interrupt
> should not be sent until the previous one is ACK'd.
> IIRC, the "spurious" interrupt code waits until 100K interrupts
> have been sent (or something in that order of magnitude).

The point I was trying to make is that there is code to protect the platform/kernel against errant devices, so that crashes, NMIs, etc. do not happen.

> > So, I think each port is taking care of handling such
> > interrupts by potentially maintaining state on whether the
> > vector should trigger or not, which is checked before
> > invoking the driver ISR.
>
> Ok - the generic interrupt code is providing that. That is
> exactly what I meant needs to be handled.

Agreed.

> I was thinking the context is some random driver trying to
> mask a device's MSI for some reason.
>
> > > I suspect any code that mucks with interrupt state would
> > > also depend on the interrupt not triggering at that moment.
> >
> > At least from the part that I understand, I agree with the
> > above. When x86 receives an MSI-X, it masks the entry and
> > reads the mask back on the interrupt CPU. The interrupt is
> > not triggering at this point. Which raises the question: why
> > the readback at this point?
>
> I suspect to limit the window in which a "spurious" interrupt
> would get sent by the device. If we are protecting from that
> case anyway, I think you are probably right that it's ok to
> drop the flush.

Ok.
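To make the sequence we are debating concrete, here is a rough sketch of the mask-plus-readback pattern. The helper name and the way the MSI-X table is mapped are invented for illustration; the real code lives in drivers/pci/msi.c:

#include <linux/io.h>
#include <linux/types.h>

/* Bit 0 of an MSI-X table entry's Vector Control dword (PCI spec). */
#define MSIX_VECTOR_CTRL_MASKBIT        0x1

/* Sketch only: "vector_ctrl" is an ioremap'ed pointer to one entry's
 * Vector Control dword in the device's MSI-X table. */
static void msix_mask_and_flush(void __iomem *vector_ctrl)
{
        u32 ctrl = readl(vector_ctrl);

        writel(ctrl | MSIX_VECTOR_CTRL_MASKBIT, vector_ctrl);

        /* This readback is the per-interrupt MMIO read in question:
         * it flushes the posted mask write to the device before the
         * caller goes on.  Dropping it saves one round trip per
         * interrupt, at the cost of not knowing when the mask has
         * actually taken effect at the device. */
        readl(vector_ctrl);
}

Whether that last read buys anything, given that the generic code already tolerates a stray interrupt, is exactly the open question.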
> > The part that I don't understand is how inter-CPU
> > synchronization is achieved in the irq rebalancing case, the
> > device deinit case, etc.
>
> It's been 5+ years since I've looked at the IRQ "rebalancing"
> (interrupt migration) sequence. I don't recall enough to be
> useful.
>
> Device deinit should generate NO interrupts - that's a bug in
> the device driver if the device is generating interrupts
> before the IRQ or MSI is initialized and can handle them.

No, what I meant is the case where the driver is unloading: the driver does free_irq(), but a laggard device interrupt still creeps in after free_irq() has completed (see the teardown sketch at the end of this note).

> Device driver module unload is the other time I've seen ugly
> race conditions (between interrupts and DMA unmapping mostly).
>
> > Does Linux currently take a strict approach to barriers in
> > these cases, or is it a loose approach where a laggard
> > instance of the interrupt can still creep to an unintended
> > CPU? In the loose-approach case, the readback does not make
> > much sense to me.
>
> I don't know offhand. In general, I believe stricter
> approaches help keep the systems more robust in the face of
> other issues. Race conditions around interrupt handling are
> pretty hard to debug.

No arguments; the point here is that whatever Linux does, there are probably devices that will send out interrupts when they shouldn't - maybe 0.001% of the time, under specific loads.

> Removing this MMIO read needs more careful analysis than I'm
> willing to put into it. Both to justify the effort (some
> measurable performance benefit) and "prove" it will work
> under all conditions.

I think removing an (unrequired?) PIO read on every interrupt should show measurable improvements (even with interrupt coalescing/moderation). You are right, though, that it's a significant effort across the various platform architectures and device drivers.

I began the thread with a pointer to the patch that introduced the PIO read three years back; I was hoping the relevant folks who pushed that patch would contribute to this discussion. Unfortunately, for something this subtle, it's much harder to rationalize eliminating it after this long.

Kanoj

> hth,
> grant
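P.S. For reference, here is the sort of unload ordering that avoids the laggard-interrupt race described above. The register offset and structure are invented for illustration; only the ordering matters - quiesce the device, flush the posted write, then tear down the vector:

#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/pci.h>

#define MYDEV_REG_IRQ_DISABLE   0x10    /* hypothetical device register */

struct mydev {
        struct pci_dev *pdev;
        void __iomem *bar;
        int irq;                        /* vector obtained at probe time */
};

static void mydev_shutdown_irq(struct mydev *dev)
{
        /* 1. Tell the device to stop generating this interrupt. */
        writel(1, dev->bar + MYDEV_REG_IRQ_DISABLE);

        /* 2. Flush the posted write so the disable really reaches
         *    the device. */
        readl(dev->bar + MYDEV_REG_IRQ_DISABLE);

        /* 3. free_irq() waits for any handler still running on
         *    another CPU, so after this the vector can be torn down
         *    safely - provided the device really has stopped raising
         *    the interrupt. */
        free_irq(dev->irq, dev);
        pci_disable_msix(dev->pdev);
}

If the device does not provide an interrupt barrier, step 2 narrows the window but cannot close it completely - which is the 0.001% case I am worried about.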