Re: i2c errors while updating sensors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Guenter,

Thanks for your reply!
I didn't get to implementing this before the Xmas holiday, but have picked it up today.
(btw. All the best for 2012 for all readers).

See below for some remarks.

2011/12/23 Guenter Roeck <guenter.roeck@xxxxxxxxxxxx>
Hi Frans,

On Fri, Dec 23, 2011 at 09:44:35AM -0500, Frans Meulenbroeks wrote:
> I was peeking at lm75/lm80 i2c error message handing.
>
> The problem I am facing is that I am using lm75 for temperature monitoring and
> lm80 for temperature and fan tacho monitoring.
> This is an embedded (powerpc) system using a 2.6.38 kernel, and accessing the
> sensors through the sysfs interface.
>
> The customer wants to be able to detect if someone accidently disconnects the
> fan (or the monitoring chip dies).
> I know i2c is not hot pluggable, but would have expected to be able to detect
> this error condition
>
> However, from peeking at the sources, it seems I am not.
>
>
> lm75.c says (function lm75_update_device)
>         for (i = 0; i < ARRAY_SIZE(data->temp); i++) {
>             int status;
>
>             status = lm75_read_value(client, LM75_REG_TEMP[i]);
>             if (status < 0)
>                 dev_dbg(&client->dev, "reg %d, err %d\n",
>                         LM75_REG_TEMP[i], status);
>             else
>                 data->temp[i] = status;
>         }
>         data->last_updated = jiffies;
>         data->valid = 1;
> So there is an error message if dbg is on, but a user will still get the old
> data as temp[i] is not updated.
>
> Wouldn't it be nice to e.g. add a (bool) attribute data-valid or or so and drop
> that to 0 on a read fail (or maybe 3 subsequent read fails)?

data->valid typically already exists.

> Or is there a better way do detect such a problem in user space (e.g. by
> putting an "error" value in the var)
>
> I've also peeked at lm80.c
> Here the situation seems to be a little bit worse.
> E.g. I see:
>
> static int lm80_read_value(struct i2c_client *client, u8 reg)
> {
>     return i2c_smbus_read_byte_data(client, reg);
> }
>
> ...
>
>         data->temp =
>             (lm80_read_value(client, LM80_REG_TEMP) << 8) |
>             (lm80_read_value(client, LM80_REG_RES) & 0xf0);
>
> but i2c_smbus_read_byte_data may return a negative value upon error.
> To me it seems this is not going to return a sensible value in case the read
> fails (and also not e.g. something like -1)
>
> Is this desired? Should this be fixed? Or is there a better way to do so?

No, yes, and yes.

Ok, I'll bite and give it a stab.

Note that returning -1 is not a good idea; one should pass on the error code.

> If needed, I can have a stab at the code to fix it, but in that case I would
> appreciate some guidance in the direction of the preferred solution (so my work
> is not rejected because it was felt a faulty solution).
>
There are multiple options. The easiest would be to set data->valid to 0 and return
the error from the _update_device() function. So you would have

       if (status < 0) {
               data->valid = 0;
               return status;
       }

in the update_device() function, and

       struct lm75_data *data = "">
       if (IS_ERR(data))
               return PTR_ERR(data);

       ...

in the _get() function.

Downside is that each read will return an error, even if the read failed on an attribute
which is not currently read. ltc4261.c is an example for this kind of handling.

Thanks. I've implemented this in the same way as ltc4261.c did.
I will supply a patch in a short while.
Not too sure what kernel git I should use to base this patch on.
I will be using git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git as that is listed as the hwmon staging tree. Hope that is ok.

I also noticed that ltc4261 is not setting data->valid to 0 in case of an error. This seems a good addition so I'll add a patch for that.

Another approach would be to store the actual error in the temp[] array, and make it an int
instead of u16/u8.

       int temp[3];

       ...

       data->temp[i] = lm75_read_value(client, LM75_REG_TEMP[i]);

       ...

       int temp = LM75_TEMP_FROM_REG(data->temp[attr->index]);

       if (temp < 0)
               return temp;
       return sprintf(buf, "%d\n", LM75_TEMP_FROM_REG(temp));

More effort and more code, but more accurately reports errors. With this approach,
you would have to make sure that u8/u16 <-> int conversions don't introduce errors;
the arithmetic in some drivers depends on the type of the storage variable.
Example for this kind of solution is max16065.c.

Which approach to use really depends on the chip and on the amount of effort one wants
to make. I typically select the first approach for simple chips, and use the second
approach if there is a possibility that the chip returns an error on one read but not
on others. The latter is mostly a concern for microcontroller based i2c devices.

For lm75, the first method should be ok. For lm80, you could use both, depending how much
time you want to spend on it. If there is a chance that one read returns an error while
other reads don't, the second approach would be much better.

As lm80 is not a micro based i2c dev, I would suspect that a fail apply to all register reads.
And if a register read fails, even if it is not the register you are interested in at that moment, it is probably interesting to know as there is definitely a problem with the device/device communication
 

If you want to start playing with the lm80 driver, it would be nice if you could also
fix the checkpatch errors and warnings. That should be a separate patch, though.

Will peek into that too. Will be after the lm80 error patch (as I already made that one it is more efficient to do the checkfile later)
 

Thanks,
Guenter

Best regards, Frans
_______________________________________________
lm-sensors mailing list
lm-sensors@xxxxxxxxxxxxxx
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

[Index of Archives]     [Linux Kernel]     [Linux Hardware Monitoring]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux