Re: [PATCH] i2c: omap: fix IRQ storms

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am Thu, 27 Feb 2025 08:20:55 -0600
schrieb Nishanth Menon <nm@xxxxxx>:

> On 10:08-20250220, Andreas Kemnade wrote:
> > Am Wed, 19 Feb 2025 20:22:13 +0100
> > schrieb Andi Shyti <andi.shyti@xxxxxxxxxx>:
> >   
> > > Hi,
> > > 
> > > On Fri, Feb 07, 2025 at 07:54:35PM +0100, Andreas Kemnade wrote:  
> > > > On the GTA04A5 writing a reset command to the gyroscope causes IRQ
> > > > storms because NACK IRQs are enabled and therefore triggered but not
> > > > acked.
> > > > 
> > > > Sending a reset command to the gyroscope by
> > > > i2cset 1 0x69 0x14 0xb6
> > > > with an additional debug print in the ISR (not the thread) itself
> > > > causes
> > > > 
> > > > [ 363.353515] i2c i2c-1: ioctl, cmd=0x720, arg=0xbe801b00
> > > > [ 363.359039] omap_i2c 48072000.i2c: addr: 0x0069, len: 2, flags: 0x0, stop: 1
> > > > [ 363.366180] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x1110)
> > > > [ 363.371673] omap_i2c 48072000.i2c: IRQ (ISR = 0x0010)
> > > > [ 363.376892] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > [ 363.382263] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > [ 363.387664] omap_i2c 48072000.i2c: IRQ LL (ISR = 0x0102)
> > > > repeating till infinity
> > > > [...]
> > > > (0x2 = NACK, 0x100 = Bus free, which is not enabled)
> > > > Apparently no other IRQ bit gets set, so this stalls.
> > > > 
> > > > Do not ignore enabled interrupts and make sure they are acked.
> > > > If the NACK IRQ is not needed, it should simply not enabled, but
> > > > according to the above log, caring about it is necessary unless
> > > > the Bus free IRQ is enabled and handled. The assumption that is
> > > > will always come with a ARDY IRQ, which was the idea behind
> > > > ignoring it, proves wrong.
> > > > It is true for simple reads from an unused address.
> > > > 
> > > > So revert
> > > > commit c770657bd261 ("i2c: omap: Fix standard mode false ACK readings").
> > > > 
> > > > The offending commit was used to reduce the false detections in
> > > > i2cdetect. i2cdetect warns for confusing the I2C bus, so having some
> > > > rare false detections (I have never seen such on my systems) is the
> > > > lesser devil than having basically the system hanging completely.
> > > > 
> > > > No more details came to light in the corresponding email thread since
> > > > several months:
> > > > https://lore.kernel.org/linux-omap/20230426194956.689756-1-reidt@xxxxxx/
> > > > so no better fix to solve both problems can be developed right now.    
> > > 
> > > I need someone from TI or someone who can test to ack here.
> > > 
> > > Can someone help?
> > >  
> > The original (IMHO minor) problem which should be fixed by c770657bd261
> > is hard to test, I have never seen that on any system (and as a
> > platform maintainer have a bunch of them) I have access to.
> > There is not much description anywhere about the system in which the
> > original system occured, and no reaction since several months from the
> > author, so I do not see anything which can be done.
> > Maybe it was just faulty hardware.
> > 
> > As said in the commit message, reverting it should be the lesser devil.
> > And that state was tested for many years.  
> 
> Can we not handle this slightly differently? leave the fix based on
> compatible? we know that the i2c controller changed over time. the
> i2cdetect bug fixed by c770657bd261 esp hard to find and fix.
> 
looking a bit more deeper in:
Why do we have omap_i2c_isr at all? Can there any case that
stat & mask == 0 there (without c770657bd261 applied)?

I looked at omap_i2c_xfer_data() and nothing interesting seems to
happen without other bits besides OMAP_I2C_STAT_NACK. 
Looking again, things get interesting when that loop is left.

Maybe just acking NACK, setting cmd_err and return -EAGAIN if no other
bits are set. That should not cause changes to scenarios where NACK
comes with other bits set. Lets check whether that fixes the
mess I see here. Well, everything is better then having that IRQ going
mad.

For reference, the sensor involved was the BMG160. Because it is not
enabled in omap2plus_defconfig, the issue did not show up early.

[Index of Archives]     [Linux GPIO]     [Linux SPI]     [Linux Hardward Monitoring]     [LM Sensors]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux