Re: [PATCH 1/2] iio: hrtimer-trigger: Mark hrtimer to expire in hard interrupt context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 14 Aug 2020 11:30:08 +0100
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

> On Fri, 14 Aug 2020 07:24:52 +0200
> Lars-Peter Clausen <lars@xxxxxxxxxx> wrote:
> 
> > On 8/13/20 4:55 PM, Jonathan Cameron wrote:  
> > > On Thu, 13 Aug 2020 14:19:57 +0200
> > > Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > >    
> > >> Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> writes:    
> > >>> On 2020-08-13 11:46:30 [+0200], Lars-Peter Clausen wrote:    
> > >>>> If you are running with forced IRQ threads the only thing that will then
> > >>>> happen in the actual hard IRQ context is the launching of the IRQ threads.
> > >>>> Th e IRQ handler of the device driver will run in a threaded IRQ.    
> > >>> So if it is really just the wakeup of the IRQ-thread then it should be
> > >>> okay.
> > >>> One thing: iio_trigger_poll() may invoke iio_trigger_notify_done(). This
> > >>> would invoke trig->ops->try_reenable callback if available.
> > >>> I grepped and found
> > >>> - bma180_trig_try_reen()
> > >>>    It appears to perform i2c_smbus_read_byte_data() and smbus sounds
> > >>>    sleeping. I don't know if it attempts to acquire any spinlock_t but it
> > >>>    will be wrong on RT.    
> > >> It's wrong even on !RT. i2c reads cannot be invoked from hard interrupt
> > >> context.
> > >>    
> > > We would hit this (and resulting warnings) all the time if it actually
> > > happened, so my suspicion is that it doesn't.
> > >
> > > I think the path doesn't actually exist although it looks at first glance like it does.
> > >
> > > The interrupt can only be enabled if there is someone using the trigger.
> > > Thus usecount will be non zero and for at least one element
> > > trig->subirq[i].enabled == true
> > >
> > > So we will decrement trig->usecount in the call to iio_trigger_notify_done
> > > but never reach 0 thus the call to trig->ops->try_reenable never happens
> > > in the hard interrupt context.    
> > 
> > I think there is a race condition here. If a consumer is disabled 
> > concurrently with iio_trigger_poll() there is a chance that `enabled` is 
> > false for all consumers.
> > 
> > The odds of this happening are very low, but there is nothing that 
> > prevents it.   
> 
> A consumer disable should have previously resulted in the trigger stopping
> (or do we have he ordering wrong there?)  Might be worth checking that, but
> see below, as I think we can stop it having harmful effects anyway.
> 
> >   
> > >
> > > It does happen later when which ever driver we triggered finishes the
> > > threaded part of it's handler and calls iio_trigger_notify_done, but that
> > > is fine.
> > >
> > > Assuming people agree with my analysis it would be good to make it explicit
> > > that we cannot hit the problem path.
> > >
> > > Perhaps call a new iio_trigger_notify_no_needed() that simply does
> > > the decrement without test, or does it with test and spits out a
> > > warning if we hit 0.    
> > 
> > I think we need to re-think the whole try_reenable() functionality. 
> > Looking at it I think there are more issues here.
> > 
> > For example lets say we call iio_trigger_notify_done() from the threaded 
> > handler and try_reenable() returns true. We'd now call 
> > iio_trigger_poll() from the threaded context, which is wrong.  
> 
> The history behind that functionality was some early devices which were using
> level interrupts tied to edge interrupts.  One of our earliest drivers had to
> do a dance with checking a gpio to avoid stalling.   That went away a long
> time ago when we basically concluded that platform was too broken to support
> interrupts on.
> 
> Probably would have helped if try_reenable returned a bool as then people wouldn't
> have returned error codes via it which is completely wrong as it stands (though
> handy given what I suggest below)
> 
> From a quick look.
> bma180, bmc150, kxcjk1013, mxc4005m bmg180, kmx61, bmc150
> all only fail if an error occurs in an bus access. That's a bug.
> 
> at91 never returns anything other than 0.
> 
> So I think we can probably rip the iio_trigger_poll call out entirely
> and replace it with an error message, or push an error message into
> the drivers and drop the check of the value of try_reenable.
> 
> > 
> > There is also the issue that iio_trigger_poll() effectively can end up 
> > calling itself recursively indefinitely via the 
> > iio_trigger_notify_done() chain, which might not be the best thing in 
> > hard IRQ context.
> >   
> 
> :) Indeed unwise.  I think the only driver that could in theory do this
> that we had a long time back had a counter in the try_reen to give
> up if it happened more than a few times.
>

A quick update on this one. I've sent out a fix for the try_reenable()
patch.  Short version is that we no longer let it fail as no one correctly
used that path anyway anymore.

https://lore.kernel.org/linux-iio/20200920132548.196452-2-jic23@xxxxxxxxxx/T/#u

My understanding is that resolves the open question on this series.
I'd like to get both issues resolved, so if people have a chance to look
that would be great.

Thanks,

Jonathan
  
> Jonathan
> 
> 




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux