Re: [PATCH] i2c: imx: increase retries on arbitration loss

Francesco Dolcini <francesco@xxxxxxxxxx> · Fri, 16 Dec 2022 13:51:03 +0100

On Fri, Dec 16, 2022 at 01:23:29PM +0100, Primoz Fiser wrote:
> Hi all,
> 
> On 16. 12. 22 12:13, Uwe Kleine-König wrote:
> > On Fri, Dec 16, 2022 at 12:02:27PM +0100, Oleksij Rempel wrote:
> > > On Fri, Dec 16, 2022 at 11:41:08AM +0100, Primoz Fiser wrote:
> > > > Hi Marco,
> > > > 
> > > > On 16. 12. 22 10:45, Marco Felsch wrote:
> > > > > Hi Primoz,
> > > > > 
> > > > > On 22-12-16, Primoz Fiser wrote:
> > > > > > By default, retries value is set to 0 (no retries). Set retries to more
> > > > > > sensible value of 3 to allow i2c core to re-attempt transfer in case of
> > > > > > i2c arbitration loss (i2c-imx returns -EAGAIN errno is such case).
> > > > > 
> > > > > apart the fact that the number of retries vary a lot and so the client
> > > > > driver behaviour can vary a lot which is not good IMHO, why do you think
> > > > > that 3 is a sufficient number?
> > > > 
> > > > IMHO it is better than leaving it at 0 (no retries)?
> > > > 
> > > > Setting it to sensible value like 3 will at least attempt to make transfer
> > > > in case arbitration-loss occurs.
> > > > 
> > > > > 
> > > > > If an arbitration loss happen, why do you think that retrying it 3 times
> > > > > changes that?
> > > > 
> > > > I our case, setting retries to non-zero value solves issues with PMIC
> > > > shutdown on phyboard-mira which in some rare cases fails with "Failed to
> > > > shutdown (err =  -11)" (-EAGAIN).
> > > > 
> > > > To me it makes common sense retries is set to non-zero value especially for
> > > > such rare conditions/situations.
> > > 
> > > https://lore.kernel.org/all/Ys1bw9zuIwWS+bqw@shikoro/
> 
> Ohh I see.
> 
> Reading through the thread I guess we aren't getting this mainlined at all
> :)
>
> The only solid point in the thread seems to be that in that case we are not
> covering up the potential i2c hardware issues?

I believe that in this case we should just have a warning in the kernel.
The retry potentially work-around a transient issue and we do not hide any hardware
issue at the same time. It seems an easy win-win solution.

> Yeah fair point but on the other hand, goal of this patch would be to
> improve robustness in case of otherwise good performing hardware. From user
> perspective I just want it to work despite it retrying under the hood from
> time to time. I think Francesco had the same idea.

Unfortunately I was missing the exact background that made us do this
change, we just had it sitting in our fork for too long :-/
This is one of the reason I gave up on it.

Quoting Uwe [1]
> sometimes there is no practical way around such work arounds. I happens
> from time to time that the reason for problem is known, but fixing the
> hardware is no option, then you need such workrounds. (This applies to
> both, retrying the transfers and resetting the bus.) 

Francesco

[1] https://lore.kernel.org/all/20220715083400.q226rrwxsgt4eomp@xxxxxxxxxxxxxx/