--- On Thu, 8/26/10, Grant Grundler <grundler@xxxxxxxxxxxxxxxx> wrote:

> From: Grant Grundler <grundler@xxxxxxxxxxxxxxxx>
> Subject: Re: Linux mask_msi_irq() question
> To: "Kanoj Sarcar" <kanojsarcar@xxxxxxxxx>
> Cc: "Grant Grundler" <grundler@xxxxxxxxxxxxxxxx>, "Jesse Barnes" <jbarnes@xxxxxxxxxxxxxxxx>, linux-pci@xxxxxxxxxxxxxxx
> Date: Thursday, August 26, 2010, 10:20 PM
>
> On Wed, Aug 25, 2010 at 12:40:00AM -0700, Kanoj Sarcar wrote:
> ...
> > > Yes, it will make sure one can mask MSI and not drop the
> > > MSI signal.
> > > I don't believe it provides any sort of barrier. In-flight
> > > MSIs still need to be dealt with on the host side.
> >
> > Agreed about in-flight MSIs: chipset/platform-specific code
> > needs to synchronize across multiple CPUs if there is a
> > strict guarantee requirement.
> >
> > But is it acceptable to have the device send out an MSI-X
> > after having responded back to the host's entry mask read?
>
> No. I agree that violates the portions of the PCIe spec
> already quoted in this thread.

Just to be careful: I quoted what I think are the relevant parts of the spec, but I also expressed my confusion about whether those parts require devices to provide an interrupt barrier or not. I don't have a conviction either way; practically, I think there will be devices that do provide barriers and some that don't.

> > ...
> > In general, I agree. I am under the assumption, though, that
> > Linux handles misbehaving devices that generate spurious
> > interrupts by turning them off after a while.
>
> MSI is the equivalent of "edge triggered". Another interrupt
> should not be sent until the previous one is ACK'd.
> IIRC, the "spurious" interrupt code waits until 100K interrupts
> have been sent (or something in that order of magnitude).

The point I was trying to make is that there is code to protect the platform/kernel against errant devices, so that crashes, NMIs, etc. do not happen.

> > So, I think each port is taking care of handling such
> > interrupts by potentially maintaining state on whether the
> > vector should trigger or not, which is checked before
> > invoking the driver ISR.
>
> Ok - the generic interrupt code is providing that. That is
> exactly what I meant needs to be handled.

Agreed.

> I was thinking the context is some random driver trying to
> mask a device's MSI for some reason.
>
> > > I suspect any code that mucks with interrupt state would
> > > also depend on the interrupt not triggering at that moment.
> >
> > At least from the part that I understand, I agree with the
> > above. When x86 receives an MSI-X, it masks the entry and
> > reads the mask back on the interrupt CPU. The interrupt is
> > not triggering at this point. Which raises the question: why
> > the readback at this point?
>
> I suspect to limit the window in which a "spurious" interrupt
> would get sent by the device. If we are protecting from that
> case anyway, I think you are probably right that it's ok to
> drop the flush.

Ok.
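To make the sequence we are debating concrete, here is a rough sketch of the mask-plus-readback pattern. The helper name and the way the MSI-X table is mapped are invented for illustration; the real code lives in drivers/pci/msi.c:

#include <linux/io.h>
#include <linux/types.h>

/* Bit 0 of an MSI-X table entry's Vector Control dword (PCI spec). */
#define MSIX_VECTOR_CTRL_MASKBIT        0x1

/* Sketch only: "vector_ctrl" is an ioremap'ed pointer to one entry's
 * Vector Control dword in the device's MSI-X table. */
static void msix_mask_and_flush(void __iomem *vector_ctrl)
{
        u32 ctrl = readl(vector_ctrl);

        writel(ctrl | MSIX_VECTOR_CTRL_MASKBIT, vector_ctrl);

        /* This readback is the per-interrupt MMIO read in question:
         * it flushes the posted mask write to the device before the
         * caller goes on.  Dropping it saves one round trip per
         * interrupt, at the cost of not knowing when the mask has
         * actually taken effect at the device. */
        readl(vector_ctrl);
}

Whether that last read buys anything, given that the generic code already tolerates a stray interrupt, is exactly the open question.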
> > The part that I don't understand is how inter-CPU
> > synchronization is achieved in the irq rebalancing case, the
> > device deinit case, etc.
>
> It's been 5+ years since I've looked at the IRQ "rebalancing"
> (interrupt migration) sequence. I don't recall enough to be
> useful.
>
> Device deinit should generate NO interrupts - that's a bug in
> the device driver if the device is generating interrupts
> before the IRQ or MSI is initialized and can handle them.

No, what I meant is the case where the driver is unloading: the driver does free_irq(), but a laggard device interrupt still creeps in after free_irq() has completed (see the teardown sketch at the end of this note).

> Device driver module unload is the other time I've seen ugly
> race conditions (between interrupts and DMA unmapping mostly).
>
> > Does Linux currently take a strict approach to barriers in
> > these cases, or is it a loose approach where a laggard
> > instance of the interrupt can still creep to an unintended
> > CPU? In the loose-approach case, the readback does not make
> > much sense to me.
>
> I don't know offhand. In general, I believe stricter
> approaches help keep the systems more robust in the face of
> other issues. Race conditions around interrupt handling are
> pretty hard to debug.

No arguments; the point here is that whatever Linux does, there are probably devices that will send out interrupts when they shouldn't - maybe 0.001% of the time, under specific loads.

> Removing this MMIO read needs more careful analysis than I'm
> willing to put into it. Both to justify the effort (some
> measurable performance benefit) and "prove" it will work
> under all conditions.

I think removing an (unrequired?) PIO read on every interrupt should show measurable improvements (even with interrupt coalescing/moderation). You are right, though, that it's a significant effort across the various platform architectures and device drivers.

I began the thread with a pointer to the patch that introduced the PIO read three years back; I was hoping the relevant folks who pushed that patch would contribute to this discussion. Unfortunately, for something this subtle, it's much harder to rationalize eliminating it after this long.

Kanoj

> hth,
> grant
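P.S. For reference, here is the sort of unload ordering that avoids the laggard-interrupt race described above. The register offset and structure are invented for illustration; only the ordering matters - quiesce the device, flush the posted write, then tear down the vector:

#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/pci.h>

#define MYDEV_REG_IRQ_DISABLE   0x10    /* hypothetical device register */

struct mydev {
        struct pci_dev *pdev;
        void __iomem *bar;
        int irq;                        /* vector obtained at probe time */
};

static void mydev_shutdown_irq(struct mydev *dev)
{
        /* 1. Tell the device to stop generating this interrupt. */
        writel(1, dev->bar + MYDEV_REG_IRQ_DISABLE);

        /* 2. Flush the posted write so the disable really reaches
         *    the device. */
        readl(dev->bar + MYDEV_REG_IRQ_DISABLE);

        /* 3. free_irq() waits for any handler still running on
         *    another CPU, so after this the vector can be torn down
         *    safely - provided the device really has stopped raising
         *    the interrupt. */
        free_irq(dev->irq, dev);
        pci_disable_msix(dev->pdev);
}

If the device does not provide an interrupt barrier, step 2 narrows the window but cannot close it completely - which is the 0.001% case I am worried about.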