Re: [PATCH] iio: imu: st_lsm6dsx: fix edge-trigger interrupts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Mon, 2 Nov 2020 11:15:21 +0100
> Lorenzo Bianconi <lorenzo@xxxxxxxxxx> wrote:
> 
> > > On Thu, 22 Oct 2020 11:26:53 +0200
> > > Lorenzo Bianconi <lorenzo@xxxxxxxxxx> wrote:
> > >   
> > > > If the device is configured to trigger edge interrupts it is possible to
> > > > miss samples since the sensor can generate an interrupt while the driver
> > > > is still processing the previous one.
> > > > Poll FIFO status register to process all pending interrupts.
> > > > Configure IRQF_ONESHOT only for level interrupts.  
> > >   
> > 
> > Hi Jonathan,
> > 
> > thx for the review :)
> > 
> > > Hmm. This sort of case is often extremely prone to race conditions.
> > > I'd like to see more explanation of why we don't have one after this
> > > fix.  Edge interrupts for FIFOs are horrible!
> > > 
> > > Dropping IRQF_ONESHOT should mean we enter the threaded handler with
> > > interrupts enabled, but if another one happens we still have to wait
> > > for the thread to finish before we schedule it again.
> > > We should only do that if we disabled the interrupt in the top half,
> > > which we haven't done here (you are working around the warnings
> > > that would be printed with the otherwise pointless top half).  
> > 
> > looking at handle_edge_irq (please correct me if I am wrong) IRQF_ONESHOT
> > takes effect only for level interrupts while for edge-sensitive interrupts
> > the irq handler runs with the line unmasked. In fact the IRQF_ONESHOT part of
> > the patch seems not relevant for fixing the issue, I just aligned the code to
> > st_sensor general handling in st_sensors_allocate_trigger()
> > (https://elixir.bootlin.com/linux/v5.9.3/source/drivers/iio/common/st_sensors/st_sensors_trigger.c#L182).
> > I think the issue is a new interrupt can fire while we are still processing
> > the previous one if watermark is low (e.g. 1) and the sensor is running at high
> > ODR (e.g. 833Hz). Reading again the status register in st_lsm6dsx_handler_thread()
> > fixes the issue in my tests.
> > I guess we can just drop the IRQF_ONESHOT chunk and keep the while loop in
> > st_lsm6dsx_handler_thread(). What do you think?
> 
> I'd do that.

ack, I will post a v2 with only this change.

> 
> > 
> > > 
> > > I 'assume' that the interrupts are latched.  So we won't get a new
> > > interrupt until we have taken some action to clear it?  In this
> > > case that action is removing items from the fifo?  
> > 
> > I do not know :). Adding stm folks.
> > @mario, denis, armando: any pointer for this?
> > 
> > > 
> > > IIRC, if we get an interrupt whilst it is masked due to IRQF_ONESHOT
> > > then it is left pending until we exit the thread.  So that should
> > > be sufficient to close a potential edge condition where we clear
> > > the fifo, and it immediately fires again.  This pending behaviour
> > > is necessary to avoid the race that would happen in any normal handler.  
> > 
> > I did not get you on this point.
> 
> If an interrupt occurs, even whilst we have it masked, we shouldn't
> loose it.  If we did so then any normal handler that clears the interrupt
> at the end of doing whatever it needs to do would race against a new interrupt.
> 
> So my suspicion is that you aren't actually missing an interrupt, but rather the
> drop time is too short to be detected (or effectively not there at all).

I guess since edge interrupts run with the line unmasked a new interrupt can fire
while the irq thread is still running (so wake_up_process() will just return) but
the driver has already read fifo_status register and so it will not read new
sample. This case should be fixed reading again the fifo_status register.

Regards,
Lorenzo

> 
> > 
> > > 
> > > 
> > > Hmm. Having had a look at one of the datasheets, I'm far from convinced these
> > > parts truely support edge interrupts.  I can't see anything about minimum
> > > off periods etc that you need for true edge interrupts. Otherwise they are
> > > going to be prone to races.  
> > 
> > @mario, denis, armando: any pointer for this?
> > 
> > > 
> > > So I think the following can happen.
> > > 
> > > A) We drain the fifo and it stays under the limit. Hence once that
> > >    is crossed in future we will interrupt as normal.
> > > 
> > > B) We drain the fifo but it either has a very low watermark, or is
> > >    filling very fast.   We manage to drain enough to get the interrupt
> > >    to fire again, so all is fine if less than ideal.  With you loop we
> > >    may up entering the interrupt handler when we don't actually need to.
> > >    If you want to avoid that you would need to disable the interrupt,
> > >    then drain the fifo and finally do a dance to successfully reenable
> > >    the interrupt, whilst ensuring no chance of missing by checking it
> > >    should not have fired (still below the threshold)
> > > 
> > > C) We try to drain the fifo, but it is actually filling fast enough that
> > >    we never get it under the limit, so no interrupt ever fires.
> > >    With new code, we'll keep spinning to 0 so might eventually drain it.
> > >    That needs a timeout so we just give up eventually.
> > > 
> > > D) watershed is one sample, we drain low enough to successfully get down
> > >    to zero at the moment of the read, but very very soon after that we get
> > >    one sample again. There is a window in which the interrupt line dropped
> > >    but analogue electronics etc being what they are, it may not have been
> > >    detectable.  Hence we miss an interrupt...  What you are doing is reducing
> > >    the chance of hitting this.  It is nasty, but you might be able to ensure
> > >    a reasonable period by widening this window.  Limit the watermark to 2
> > >    samples?  
> > > 
> > > Also needs a fixes tag :)  
> > 
> > ack, I will add them in v2
> > 
> > Regards,
> > Lorenzo
> > >   
> > > > 
> > > > Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> > > > ---
> > > >  drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 33 +++++++++++++++-----
> > > >  1 file changed, 25 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > index 5e584c6026f1..d43b08ceec01 100644
> > > > --- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > +++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > @@ -2457,22 +2457,36 @@ st_lsm6dsx_report_motion_event(struct st_lsm6dsx_hw *hw)
> > > >  	return data & event_settings->wakeup_src_status_mask;
> > > >  }
> > > >  
> > > > +static irqreturn_t st_lsm6dsx_handler_irq(int irq, void *private)
> > > > +{
> > > > +	return IRQ_WAKE_THREAD;
> > > > +}
> > > > +
> > > >  static irqreturn_t st_lsm6dsx_handler_thread(int irq, void *private)
> > > >  {
> > > >  	struct st_lsm6dsx_hw *hw = private;
> > > > +	int fifo_len = 0, len = 0;
> > > >  	bool event;
> > > > -	int count;
> > > >  
> > > >  	event = st_lsm6dsx_report_motion_event(hw);
> > > >  
> > > >  	if (!hw->settings->fifo_ops.read_fifo)
> > > >  		return event ? IRQ_HANDLED : IRQ_NONE;
> > > >  
> > > > -	mutex_lock(&hw->fifo_lock);
> > > > -	count = hw->settings->fifo_ops.read_fifo(hw);
> > > > -	mutex_unlock(&hw->fifo_lock);
> > > > +	/*
> > > > +	 * If we are using edge IRQs, new samples can arrive while
> > > > +	 * processing current IRQ and those may be missed unless we
> > > > +	 * pick them here, so let's try read FIFO status again
> > > > +	 */
> > > > +	do {
> > > > +		mutex_lock(&hw->fifo_lock);
> > > > +		len = hw->settings->fifo_ops.read_fifo(hw);
> > > > +		mutex_unlock(&hw->fifo_lock);
> > > > +
> > > > +		fifo_len += len;
> > > > +	} while (len > 0);
> > > >  
> > > > -	return count || event ? IRQ_HANDLED : IRQ_NONE;
> > > > +	return fifo_len || event ? IRQ_HANDLED : IRQ_NONE;
> > > >  }
> > > >  
> > > >  static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > > @@ -2488,10 +2502,14 @@ static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > >  
> > > >  	switch (irq_type) {
> > > >  	case IRQF_TRIGGER_HIGH:
> > > > +		irq_type |= IRQF_ONESHOT;
> > > > +		fallthrough;
> > > >  	case IRQF_TRIGGER_RISING:
> > > >  		irq_active_low = false;
> > > >  		break;
> > > >  	case IRQF_TRIGGER_LOW:
> > > > +		irq_type |= IRQF_ONESHOT;
> > > > +		fallthrough;
> > > >  	case IRQF_TRIGGER_FALLING:
> > > >  		irq_active_low = true;
> > > >  		break;
> > > > @@ -2520,10 +2538,9 @@ static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > >  	}
> > > >  
> > > >  	err = devm_request_threaded_irq(hw->dev, hw->irq,
> > > > -					NULL,
> > > > +					st_lsm6dsx_handler_irq,
> > > >  					st_lsm6dsx_handler_thread,
> > > > -					irq_type | IRQF_ONESHOT,
> > > > -					"lsm6dsx", hw);
> > > > +					irq_type, "lsm6dsx", hw);
> > > >  	if (err) {
> > > >  		dev_err(hw->dev, "failed to request trigger irq %d\n",
> > > >  			hw->irq);  
> > >   
> > 
> 

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux