On Fri, Aug 27, 2010 at 11:24:03AM -0400, Jean Delvare wrote:

Hi Jean,

> Hi Guenter,
>
> On Fri, 27 Aug 2010 06:49:26 -0700, Guenter Roeck wrote:
> > Next question: lm90_update_device() currently does not return any errors.
> > In recent drivers, we pass i2c read errors up to userland. Before I introduce
> > the max6696 changes, does it make sense to add error checking/return
> > into the driver, similar to what I have done in the smm665 and jc42 drivers ?
>
> So far, most hwmon driver authors decided to ignore such errors, or
> limited their handling to logging the issue, mainly because the caching
> mechanism makes handling of such errors tough. Now I admit that the
> approach you took in the jc42 driver is interesting. I never considered
> having a single error value being returned by the update function the
> way you did.
>
> This has the obvious drawback that transient I/O errors cause _all_
> sensor values to be unavailable, which is discussable, especially for a
> device with many features. It's hard to justify that all values of a
> full-featured hardware monitoring chip could be unavailable because,
> for example, one of the temperature sensors is unreliable. So this
> approach is fine for your small jc42 driver, but I don't think it can be
> generalized.
>
On the plus side, though, a transient failure only causes a single read
operation to fail, since I don't update the timestamp nor the valid flag
in the error case. As a result, the next read will again try to update
all values. So it isn't really that bad. Only real drawback of my approach
is that a transient read failure on one sensor register will likely be
reported while trying to read data for another sensor.

Of course, you are right that a permanent error on a single register will
cause all sensor read operations to fail, which isn't really desirable.
I have no idea if that can happen in the real world, though. Seems to be
unlikely that a failing sensor would cause an I2C operation failure.
But who knows - maybe it does happen with some chips.

> In the general case, I think I am fine with pretty much anything which
> doesn't plain ignore error codes (as many drivers still do...) and
> doesn't block all readings on transient errors. This can mean returning
> 0 on error, or returning the previous last known value (definitely
> acceptable for transient errors, but not so for long-standing ones),

Basic reason for returning errors in the first place was that I was asked
to do so in review feedback for one of my drivers - specifically, that
I should not drop errors. So we would need some clear(er) guidelines
for new drivers if we want to go along that path.

> with or without logging. Or if you really want to pass error codes down
> to user-space, I think you have to rework the update() function and the
> per-device data structure altogether, to be able to store error codes
> in the data structure.
>
Seems to be a bit excessive, and it doesn't seem to be worth the effort
and added complexity.

> A different (and complementary) approach is to repeat the failing
> command and see if it helps. The w83l785ts driver does exactly this. If
> we want to generalize this, it would probably make sense to implement
> it at the i2c-core level (i.e. add a "retries" i2c_client
> attribute.)
>
Still doesn't solve the permanent error case, though. Question remains, then,
if it is likely that a single i2c register would return a permanent error
while others still work.

> I admit I have been ignoring the issue mainly so far, because it's not
> a big problem in practice (except on one board with the w83l785ts
> driver, thus the extra code in that driver), so adding complex or
> invasive code to deal with it isn't too appealing.
>
I'll take that as a hint and won't make any changes to lm90 driver
error handling.

Guenter
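
For illustration, the caching behaviour discussed in this thread (a single error value returned by the update function, with the timestamp and valid flag left untouched on failure) could be sketched in plain userspace C roughly as follows. This is not the jc42 or lm90 code: struct sensor_cache, sensor_update(), sensor_show_temp(), fake_read_reg(), fail_budget and CACHE_LIFETIME are all hypothetical names, time is a simple tick counter rather than jiffies, and reading into a scratch array before committing is a choice made for this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REGS       3
#define CACHE_LIFETIME 2 /* hypothetical: cached values stay fresh for 2 ticks */

/* Hypothetical per-device cache, loosely modelled on a hwmon data struct. */
struct sensor_cache {
	bool valid;                 /* true once a complete update succeeded */
	unsigned long last_updated; /* tick of the last successful update */
	int temp[NUM_REGS];         /* cached register values */
};

/* Stand-in for an I2C register read: value >= 0, or a negative errno. */
typedef int (*read_reg_fn)(int reg);

/* Fake bus for the demonstration: register 1 fails while fail_budget > 0. */
int fail_budget;

int fake_read_reg(int reg)
{
	if (reg == 1 && fail_budget > 0) {
		fail_budget--;
		return -EIO;
	}
	return 20 + reg;
}

/*
 * Refresh all registers if the cache is stale. On any read error, return
 * that error and leave valid/last_updated untouched, so the very next
 * call tries the whole update again.
 */
int sensor_update(struct sensor_cache *data, unsigned long now,
		  read_reg_fn read_reg)
{
	int fresh[NUM_REGS];
	int i;

	assert(data != NULL && read_reg != NULL);

	if (data->valid && now - data->last_updated < CACHE_LIFETIME)
		return 0; /* cache still fresh, no bus traffic */

	for (i = 0; i < NUM_REGS; i++) {
		int val = read_reg(i);

		if (val < 0)
			return val; /* one failing register fails the update */
		fresh[i] = val;
	}

	/* Commit only after every register was read successfully. */
	for (i = 0; i < NUM_REGS; i++)
		data->temp[i] = fresh[i];
	data->last_updated = now;
	data->valid = true;
	return 0;
}

/* Analogue of a sysfs show function: pass update errors to the caller. */
int sensor_show_temp(struct sensor_cache *data, unsigned long now,
		     read_reg_fn read_reg, int index, int *out)
{
	int err = sensor_update(data, now, read_reg);

	if (err)
		return err;
	*out = data->temp[index];
	return 0;
}
```

With fail_budget set to 1, a first sensor_show_temp() call for index 0 returns -EIO even though only register 1 failed, which is the drawback mentioned above; repeating the call succeeds because the failed attempt did not mark the cache as valid.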