[patch 2.6.23-rc6] lm75 should handle I/O errors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi David,

On Mon, 24 Sep 2007 13:09:00 -0700, David Brownell wrote:
> > On Tue, 18 Sep 2007 09:51:07 -0700, David Brownell wrote:
> > > When talking to a TMP75 chip with a flakey I2C adapter (it easily gets
> > > underrun errors on loaded systems), I happened to notice that the LM75
> > > driver holds the delusion that I/O calls can never fail.
> >
> > Almost all I2C chip drivers do...
> 
> Oops.  I suppose that's partially explained by how rude the fault model
> in those calls is ... such as
> 
>  - normally returning "-1" instead of a real "what went wrong" code;
>  - can't say how many RX bytes are valid, in error-after-N-bytes cases;
>  - can't say how many TX bytes were sent before slave NAK
>  - after slave NAKs a TX, can't control whether to issue the next i2c_msg
>  - ...
> 
> So it's less realistic than usual to expect drivers to check for faults.

I'm not happy with this situation either. In particular your first
point is a pity and would be pretty easy to fix. Feel free to submit
patches.

> > > This patch corrects that by retrying a few times after errors, and then
> > > by refusing to record error codes as register values.  The retries seem
> > > to resolve the problem under light system loads.  Clearly they won't be
> > > able to resolve it in all cases, so the second mechanism is also needed.
> >
> > The patch looks correct and could be applied, however... While I agree
> > that the error handling mechanism belongs to the lm75 driver, the retry
> > mechanism doesn't. Duplicating this retry mechanism in all chip drivers
> > doesn't look good. Don't you think it would be better implemented at
> > i2c adapter level, or even better at i2c core level?
> 
> Nope.  It's not OK to retry all messages.  In *this* case, reads have no
> side effects (the TMP75 had SMBALERT# cleared using that SMBus patch),
> and the writes are idempotent ... so it's safe.  That's not true in general.
> (Simple counter-example:  messages may update counters, explicitly or
> implicitly.  Retries would bork the counters...)

Let me play the devil's advocate here. We retry only if a transaction
failed. If the transaction failed, how could it have any side effect?

> It's message-specific whether retrying is safe ... and context-specific
> whether it's desirable.  And by "message" I mean an entire array of
> "struct i2c_msg" instances, not an individual "i2c_msg".

I did not suggest retrying individual i2c_msg, that wouldn't make any
sense. But retrying arrays of i2c_msgs (aka transactions) sounds rather
safe to me. Do you have a real-world example where it is not, or are
you playing it safe just in case?

I guess that your point of view is the reason why i2c-algo-bit and
i2c-algo-pcf only retry if the address byte wasn't acked, and not after
that.

> > We have i2c_adapter.retries for that already, however it seems that the
> > code to handle retries is missing in most bus drivers. Only
> > i2c-algo-bit, i2c-algo-pcf, i2c-pxa and i2c-s3c2410 appear to implement
> > it, and not even the same way: the former two only retry if the address
> > byte is not acked, while the latter two retry on any failure.
> 
> No wonder this "retry" thing has always seemed like a puzzle to me!
> 
> I do know that when I tripped over algo-bit retrying, it seemed more
> buglike than not.  (Eight retries to establish that the alert response
> message had done its job?  Puhleeeze!)

Admittedly, if you are doing reads and expect them to fail (which is
also the case when we use zero-byte messages to probe for chip
presence) the retry mechanism is not so desirable and only slows things
down.

> > What about moving the retry mechanism to i2c-core (in i2c_smbus_xfer
> > and i2c_transfer)?
> 
> Retry logic certainly doesn't belong in the lowlevel (adapter) code.
> 
> If it needs to exist at all, I'd rather see it controlled by whoever
> issues the call.  Maybe an I2C_M_RETRIES mask for a two-bit field,
> defaulting to no retries, which could be assigned by the caller iff
> they know it's safe to retry ... though the i2c_smbus_*() calls
> have no way to pass such stuff.

There is a way: we can make this an i2c_client attribute or flag. just
like I2C_CLIENT_TEN sets I2C_M_TEN, I2C_CLIENT_RETRIES (2-bit flag)
could set I2C_M_RETRIES.

> It'd be best IMO if the retry mechanism were just removed.

If you have strong arguments for that, just send a patch.

-- 
Jean Delvare




[Index of Archives]     [Linux Kernel]     [Linux Hardware Monitoring]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux