Re: [PATCH] iio: adc: ti_am335x_adc: fix fifo overrun recovery

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sam, 2017-03-11 at 18:07 +0000, Jonathan Cameron wrote:
> On 10/03/17 13:57, Michael Engl wrote:
> > 
> > The tiadc_irq_h(int irq, void *private) function is handling FIFO
> > overruns by clearing flags, disabling and enabling the ADC to
> > recover.
> > 
> > If the ADC is running in continuous mode a FIFO overrun happens
> > regularly. If the disabling of the ADC happens concurrently with
> > a new conversion. It might happen that the enabling of the ADC
> > is ignored by the hardware. This stops the ADC permanently. No
> > more interrupts are triggered.
> > 
> > According to the AM335x Reference Manual (SPRUH73H October 2011 -
> > Revised April 2013 - Chapter 12.4 and 12.5) it is necessary to
> > check the ADC FSM bits in REG_ADCFSM before enabling the ADC
> > again. Because the disabling of the ADC is done right after the
> > current conversion has been finished.
> The relevant comment in the latest version of that manual (rev O)
> is at the end of 12.3.5
> "The FIFO can also generate FIFOx_OVERRUN and FIFOx_UNDERFLOW
> interrupts. The user can mask
> these events by programming the IRQENABLE_CLR register. To clear a
> FIFO underflow or FIFO overrun
> interrupt, the user should write a ‘1’ to the status bit in the
> IRQSTS register. The TSC_ADC_SS does not
> recover from these conditions automatically. Therefore, the software
> will need to disable and then again
> enable the TSC_ADC_SS. Before the user can turn the module back on,
> the user must read the
> ADCSTAT register to check if the status of the ADC FSM is idle and
> the current step is the idle step."
> 
> 
> 
> > 
> > 
> > To trigger this bug it is necessary to run the ADC in continuous
> > mode. The ADC values of all channels need to be read in an endless
> > loop. The bug appears within the first 6 hours (~5.4 million
> > handled FIFO overruns). The user space application will hang on
> > reading new values from the character device.
> Impressive to track this one down, but it's the sort of nasty
> hardware race that we need to be very careful with when trying
> to get a fix in place.
> > 
> > 
> > Signed-off-by: Michael Engl <michael.engl@xxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/iio/adc/ti_am335x_adc.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iio/adc/ti_am335x_adc.c
> > b/drivers/iio/adc/ti_am335x_adc.c
> > index ad9dec3..4282cec 100644
> > --- a/drivers/iio/adc/ti_am335x_adc.c
> > +++ b/drivers/iio/adc/ti_am335x_adc.c
> > @@ -169,7 +169,9 @@ static irqreturn_t tiadc_irq_h(int irq, void
> > *private)
> >  {
> >  	struct iio_dev *indio_dev = private;
> >  	struct tiadc_device *adc_dev = iio_priv(indio_dev);
> > -	unsigned int status, config;
> > +	unsigned int status, config, adc_fsm;
> > +	unsigned short count = 0;
> > +
> >  	status = tiadc_readl(adc_dev, REG_IRQSTATUS);
> >  
> >  	/*
> > @@ -183,6 +185,15 @@ static irqreturn_t tiadc_irq_h(int irq, void
> > *private)
> >  		tiadc_writel(adc_dev, REG_CTRL, config);
> >  		tiadc_writel(adc_dev, REG_IRQSTATUS,
> > IRQENB_FIFO1OVRRUN
> >  				| IRQENB_FIFO1UNDRFLW |
> > IRQENB_FIFO1THRES);
> > +
> > +		/* wait for idle state.
> Standard multiline comment syntax preferred.
> > 
> > +		 * ADC needs to finish the current conversion
> > +		 * before disabling the module
> > +		 */
> > +		do {
> > +			adc_fsm = tiadc_readl(adc_dev,
> > REG_ADCFSM);
> > +		} while (adc_fsm != 0x10 && count++ < 100);
> What slightly bothers me is that we are papering over the
> problem,  yet
> this is far from guaranteed to fix it (I think!) as we might get a
> race
> as it starts the next conversion after we have checked, but before we
> disable...
> 
> Note I'm only looking at this fairly superficially so please point
> out if I have missed something!

Probably my comment is a bit confusing. The ADC has already been
disabled by writing the config some lines above the while loop. 

"
config &= ~(CNTRLREG_TSCSSENB);
tiadc_writel(adc_dev, REG_CTRL, config);
"
But as stated in the Reference Manual the ADC needs some time to finish
the current conversion to finally disable the ADC. 

Description of "Enable" Bit in the CTRL Register (table 12-14):
"TSC_ADC_SS module enable bit.
After programming all the steps and configuration registers, write a
1to this bit to turn on TSC_ADC_SS.
Writing a 0 will disable the module (after the current conversion)."

That means once the adc_fsm is equal to 0x10 (idle state) the ADC stays
disabled as long as we enable the ADC again. Which is done by the
original code one line below of this patch.

> 
> We could look for a transition, which would potentially reduce the
> chance
> of problems, but that's nasty as well and we'd need to disable
> interrupts
> probably to avoid any potential of a clash (for a non trivial amount
> of
> time...)
> > 
> > +
> >  		tiadc_writel(adc_dev, REG_CTRL, (config |
> > CNTRLREG_TSCSSENB));
> >  		return IRQ_HANDLED;
> >  	} else if (status & IRQENB_FIFO1THRES) {��.n��������+%������w��{.n�����{��(��)��jg��������ݢj����G�������j:+v���w�m������w�������h�����٥




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux