--- On Mon, 8/23/10, Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx> wrote:

> From: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> Subject: Re: Linux mask_msi_irq() question
> To: "Kanoj Sarcar" <kanojsarcar@xxxxxxxxx>
> Cc: linux-pci@xxxxxxxxxxxxxxx
> Date: Monday, August 23, 2010, 9:48 AM
>
> > > From: Kanoj Sarcar <kanojsarcar@xxxxxxxxx>
> > > Subject: Linux mask_msi_irq() question
> > > To: mitch.a.williams@xxxxxxxxx, tom.l.nguyen@xxxxxxxxx,
> > >     mingo@xxxxxxxxxx
> > > Cc: kanojsarcar@xxxxxxxxx
> > > Date: Friday, August 13, 2010, 12:30 AM
> > >
> > > Hello,
> > >
> > > I have a question on msix vector masking, and was hoping one of
> > > you could answer, instead of posting this question on one of
> > > the lists.
> > >
> > > mask_msi_irq() is doing a readback of the vector mask after
> > > masking an entry. I tried to dig up the history on this, and
> > > came across Mitch's patch from Mar 2007 against 2.6.21 where he
> > > implemented the readback/flush during enable/disable operations:
> > > http://marc.info/?l=linux-kernel&m=117459742025894&w=2
> > >
> > > In 2.6.30, I see that even mask/unmask is doing the flush
> > > (arch/x86/kernel/apic/io_apic.c chip handlers use the function).
> > >
> > > Now the question: is it truly guaranteed from PCI/PCIE and/or
> > > MSIX specs that the memory read/flush indeed will provide a
> > > strong interrupt reception barrier? Or is it that some specific
> > > devices end up providing this guarantee above and beyond
> > > PCI/MSIX specs?
> > >
> > > Thank you for any responses.
>
> Hi Kanoj, it's been awhile!  (Assuming you're the same Kanoj Sarcar I
> knew at SGI who did some of the early Origin/Itanium/NUMA work.)
>
> My memory of the spec is that this ordering *is* guaranteed, but that
> some boxes violate it (e.g. Altix & Origin).
> We jumped through some hoops to make sure the readX functions did
> flush out interrupts by adding a barriered DMA read operation to the
> non-relaxed variants.
>
> Unfortunately I just put away my Mindshare book for a move this week
> so I don't have it handy, maybe someone else can look up the
> appropriate section and make sure.
>
> However, there is some ordering like PIO vs MMIO that's not
> guaranteed at all in the spec.  I think Ben is running into this on
> one of his platforms right now.
>
> --
> Jesse Barnes, Intel Open Source Technology Center

Hi Jesse,

Good to hear back from you! Yes, it's been a while since SGI.

I did go over the specs some; two points of reference:

a. PCIE base spec rev 3.0, version 0.71, released May 25, 2010:
   section 2.4.1, table entry D2a, ensures that a read completion
   issued by the device (such as the read completion for the MSIX
   entry mask) does not pass a posted write issued earlier by the
   device (such as an msix write).

b. MSIX ECN comments (section 6.8): "An MSI-X vector is masked when
   its associated MSI-X Table entry Mask bit or the MSI-X Function
   Mask bit is set. While a vector is masked, the function is
   prohibited from sending the associated message, and the function
   must set the associated Pending bit whenever the function would
   otherwise send the message."

Given these two, roughly the host action of "write entry mask";
"read entry mask" _should_ apparently provide an interrupt barrier.

But if you wanted to play the devil's advocate, in comment b. above,
"While a vector is masked" is not clearly defined; i.e., if the host
does a pio write to mask, then reads back the mask (which is to a
certain extent orthogonal to the device noticing and acting on the
mask change), does the MSIX spec actually require the device to
provide an interrupt barrier?

Like you mention, there are already chipset issues anyway.

Also, what happens if the flushing read is deleted?
Since this is being invoked on every MSIX reception on the host (at
least on x86[_64]), it is a rather costly operation. That is, if the
read is removed and an interrupt does creep in, what are the
problems? Kernel panic, NMI, etc.? Does some piece of kernel code
(irq rebalance etc.) actually rely on the barrier?

Thanks.

Kanoj
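
For reference, the host-side sequence under discussion ("write entry
mask", then "read entry mask") can be sketched roughly as below. This
is a userspace illustration only, not the kernel's mask_msi_irq():
fake_table is a hypothetical in-memory stand-in for the ioremapped
MSI-X table, and vector_ctrl() and msix_set_mask() are names made up
for this sketch. The 16-byte entry size, the Vector Control offset of
12, and mask bit 0 follow the MSI-X table layout.

```c
#include <assert.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE         16    /* bytes per MSI-X table entry */
#define MSIX_ENTRY_VECTOR_CTRL  12    /* Vector Control dword offset */
#define MSIX_ENTRY_CTRL_MASKBIT 0x1u  /* per-vector Mask bit */
#define NR_ENTRIES              4     /* assumption: small demo table */

/* Hypothetical stand-in for the device's memory-mapped MSI-X table. */
static volatile uint32_t fake_table[NR_ENTRIES * MSIX_ENTRY_SIZE / 4];

/* Locate the Vector Control dword of one table entry. */
static volatile uint32_t *vector_ctrl(unsigned int entry)
{
    assert(entry < NR_ENTRIES);
    return &fake_table[(entry * MSIX_ENTRY_SIZE +
                        MSIX_ENTRY_VECTOR_CTRL) / 4];
}

/*
 * Set or clear the per-vector mask bit, then read the same register
 * back. On real hardware the write is posted; the read cannot
 * complete until the write has reached the device, which is the
 * "flush" being asked about. Returns the value read back.
 */
uint32_t msix_set_mask(unsigned int entry, int masked)
{
    volatile uint32_t *ctrl = vector_ctrl(entry);
    uint32_t v = *ctrl;

    if (masked)
        v |= MSIX_ENTRY_CTRL_MASKBIT;
    else
        v &= ~MSIX_ENTRY_CTRL_MASKBIT;

    *ctrl = v;      /* "write entry mask" (posted write) */
    return *ctrl;   /* "read entry mask" (flushing read) */
}
```

Calling msix_set_mask(2, 1) returns a value with bit 0 set, and
msix_set_mask(2, 0) returns it with bit 0 cleared. Dropping the final
read leaves only the posted write, which is the variant whose
consequences are being asked about above.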