Re: OMAP3430 spurious interrupts

* Woodruff, Richard <r-woodruff2@xxxxxx> [080115 08:39]:
>  
> Hi,
> 
> > From: Tony Lindgren [mailto:tony@xxxxxxxxxxx] 
> 
> > Richard, can you also describe the purpose of the spurious interrupt?
> > Is it just an error on accessing the registers too soon 
> > before interrupt priority sorting is done, or something like that?
> 
> Huh? I don't think there is a 'spurious' interrupt vector as such in
> the hardware.  I didn't fully resync after the last interrupt reorg so
> perhaps they snuck in some new software term.

Well, I was wondering if the spurious interrupts had some meaning besides
being errors, such as a wake-up event instead of a real interrupt :)

> So, to me all that is saying is you get an interrupt at your L1 PIC, you
> go to the device and nothing is there to clear (or you get an interrupt
> and nothing is showing active at your L1 PIC).

OK, thanks that clarifies it.

> Generally this is indicative of a bad device or incorrect device irq
> programming.  If you get too many of these things your system will shut
> down with a continuous flow of interrupts.  Hence the kernel thinks they
> are serious and might shut down your vector if it starts taking what it
> feels are too many.  Bypassing the safety check can cause you to miss
> problems.
> 
> -a- What I was referring to with 'posting', is as the memory attribute
> is marked now, if you clear your isr at the device near the return from
> the isr, the cpu might unmask the irq before the actual write or the
> effect of the write has occurred at the device.  This will result in an
> IRQ request at unmask time, but when sources are checked there will be
> none.
> 
> * Previously, to synchronize better, we had to put the barriers at the
> writes to the PIC as it by default had a buffered type.  However, other
> devices were strongly ordered, thus they were safer.  If you recall,
> Catalin also asked for patches to controllers to fix this on ARM11.
> Last time this spurious issue flared up it was because these barriers
> were dropped during the open source resync.  The barriers just make sure
> the data has left the ARM.  However, that doesn't account for the rest
> of the path (device maps).  This is where having the correct attribute
> for devices is good.  Those devices then have to acknowledge back to the
> bus per their protocol.  We had internal mails on this and it turns out
> the ARM to OCP bridge's protocol conversion bits complicate things so
> it's not so intuitive.  Strongly ordered is the closest to what you
> might guess should happen.
>
> In the above, the interrupt request will persist for a small amount of
> time until the line is finally cleared.  Ignoring these types of bursts
> may be harmless, but it depends a bit on whether some irq handler is
> called, and that handler must not do anything bad to device state.  It
> surely wastes some cycles.

Sounds like it should still be dealt with for ARM Linux in general. It
may be worth checking whether Catalin already has some patches for that.
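
To make the race in -a- concrete, here is a minimal sketch as a plain C
simulation. It is a toy model under stated assumptions, not kernel code:
the toy_* names and the single status register are hypothetical and do not
correspond to real OMAP3430 registers. It only shows why a posted clear
followed by an unmask can yield an IRQ with no source left to find.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names are real OMAP3430 registers. */
struct toy_dev {
    uint32_t status;        /* non-zero means the level IRQ line is asserted */
    uint32_t inflight_bits; /* status bits a posted write clears on arrival */
    bool     inflight;      /* a posted write has left the CPU, not landed */
};

/* Clear status bits. A posted (buffered) write returns to the CPU at
 * once; a non-posted one lands before the CPU continues. */
static void toy_clear_status(struct toy_dev *d, uint32_t bits, bool posted)
{
    if (posted) {
        d->inflight_bits = bits;
        d->inflight = true;
    } else {
        d->status &= ~bits;
    }
}

/* The interconnect finally delivers any write still in flight. */
static void toy_bus_drain(struct toy_dev *d)
{
    if (d->inflight) {
        d->status &= ~d->inflight_bits;
        d->inflight = false;
    }
}

/* Returns true if the IRQ fires again at unmask time on ISR exit. */
static bool toy_isr_exit_refires(bool posted)
{
    struct toy_dev dev = { .status = 0x1 };
    bool masked = true;                  /* IRQ is masked while the ISR runs */

    toy_clear_status(&dev, 0x1, posted); /* clear near the end of the ISR */
    masked = false;                      /* return path unmasks the IRQ */
    bool refired = !masked && dev.status != 0;

    toy_bus_drain(&dev);                 /* the write lands a little later */
    assert(dev.status == 0);             /* by now no source is left to find */
    return refired;
}
```

Calling toy_isr_exit_refires(true) models the buffered mapping and reports
a re-fired IRQ; toy_isr_exit_refires(false) models a write that completes
before the unmask and does not. Real hardware is messier, since the bridge
behaviour mentioned above sits between the two cases.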

> -b- The other thing which is clear in the TRM is the bit about a false
> interrupt at priority sorting time. If you monkey with masks during
> sorting time you might get a false isr.  As all of the sources are level
> asserted at the pic, a 2nd gratuitous ACK of the vector number would be
> a hack way of handling that case.  I'm not sure it happens that much in
> practice.  The recommended programming model is to MASK all ISRs, handle
> the source, then ACK, and unmask.  The Linux code path doesn't do this,
> however: it only masks the 1 irq in play, acks the irq, then unmasks
> all the rest, then handles the device.  In messing with the mask around
> the ack without dropping the source, this small sorting window opens up
> (assuming there are more irqs coming in).

I guess you mean mask each irq bank? Maybe we should try and see what
happens after solving -a- above first, then see if -b- is needed.
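
The two orderings in -b- can be compared with a small C sketch. This is
only a toy model of the sequences as Richard describes them, not of the
real interrupt controller or of the actual Linux irq flow code; the step
names and the "window" rule (an unmask after an ACK issued while the
source is still asserted) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step names for the two sequences described in -b-. */
enum toy_step { MASK_ALL, MASK_ONE, HANDLE_SOURCE, ACK, UNMASK_REST, UNMASK_ALL };

/* Recommended model: mask everything, handle the source, ACK, unmask. */
static const enum toy_step recommended_order[] = {
    MASK_ALL, HANDLE_SOURCE, ACK, UNMASK_ALL
};

/* Linux path as described in the mail: mask one, ACK, unmask, handle. */
static const enum toy_step linux_order_as_described[] = {
    MASK_ONE, ACK, UNMASK_REST, HANDLE_SOURCE
};

#define TOY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Returns true if the sequence changes masks after an ACK that was
 * issued while the device source was still asserted. */
static bool toy_sorting_window_opens(const enum toy_step *seq, size_t n)
{
    bool source_asserted = true;       /* device is driving its level IRQ */
    bool acked_with_source_up = false; /* ACK seen before the source dropped */
    bool window = false;

    for (size_t i = 0; i < n; i++) {
        switch (seq[i]) {
        case HANDLE_SOURCE:
            source_asserted = false;   /* handler clears the device */
            break;
        case ACK:
            acked_with_source_up = source_asserted;
            break;
        case UNMASK_REST:
        case UNMASK_ALL:
            if (acked_with_source_up && source_asserted)
                window = true;         /* mask change with source still up */
            break;
        case MASK_ALL:
        case MASK_ONE:
            break;                     /* masking alone opens no window here */
        }
    }
    return window;
}
```

With these definitions, toy_sorting_window_opens(linux_order_as_described,
TOY_COUNT(linux_order_as_described)) reports a window, while the same call
on recommended_order does not. Whether that window matters in practice is
exactly the open question above.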

> So far interactions with the ISP camera driver seem to have caused the
> most spurious interrupts to occur.  However, you do see it with other
> drivers.  As that code has matured in our trees issues have been
> dropping off.

I guess Lauri was getting them in the serial driver.

Tony