Hi George!

On Monday, 25.03.2024 at 23:18 +0300, George Stark wrote:
> On 3/25/24 21:48, Harald Geyer wrote:
> > On Monday, 25.03.2024 at 19:54 +0300, George Stark wrote:
> > > Protocol parsing errors could happen due to several reasons like
> > > noise environment, heavy load on system etc. If to poll the
> > > sensor frequently and/or for a long period kernel log will become
> > > polluted with error messages if their log level is err (i.e. on
> > > by default).
> >
> > Yes, these error are often recoverable. (As are many other HW
> > errors, that typically are logged. Eg USB bus resets due to EMI)
> >
> > [...]
> >
> > The idea is, that these messages help users understand issues with
> > their HW (like too long cables, broken cables etc). But it is true,
> > that they will slowly accumulate in many real world scenarios
> > without anything being truly wrong.
>
> I agree with you that it's very convenient to just take a look to
> dmesg and see device connection problems at once. But unlike e.g. usb
> user has to actually start reading sensor to perform communication
> and read errors will be propagated to the userspace and could be
> noticed \ handled.

Not really. The log lines contain additional information useful for
understanding the problem with the setup.

> Anyway I believe we should use uniform approach for read errors -
> currently in the driver there're already dbg messages:
>
> "lost synchronisation at edge %d\n"
> "invalid checksum\n"

These errors are usually caused by EMI and there isn't much to do aside
from trying again until we find a time window with less interference.
They are not logged by default, because in some cases they might be
very frequent and can be handled by the user space client
programmatically anyway.

> I changed log level from err to dbg for the messages:
>
> "Only %d signal edges detected\n"

This mostly indicates a problem with the setup: long cable, dead
sensor, high (interrupt) load etc.
It's true that this can happen during normal operation, usually when
the system takes too long to enter the irq handler. But the primary
causes are:

1) Your wiring is broken. In this case, the message is immediately
   helpful and points you in the right direction. (Only if you
   understand the protocol though.)

2) Your sensor is dead or "crashed", which also warrants an error msg
   IMO.

The "crashed" case is a bit special. Some chips seem to randomly stop
working after a couple of hours and the only remedy is to power cycle
them. This could be done automatically: in my setup, I have the sensor
power supply pin on a GPIO and reset it from userspace.

Some years ago I tried to work on a version of the driver that would
optionally register with a regulator and manage sensor resets from
within the kernel driver. If this was actually implemented, we could
reduce the logging to cases where the reset didn't solve the problem.
I stopped working on this because, to be actually useful, it would have
required changes to the regulator framework, and the regulator
maintainers didn't seem too keen on them. However, if you want to pick
this up in an effort to reduce unnecessary error conditions and
messages, I certainly would be happy.

> "Don't know how to decode data: %d %d %d %d\n"

This would indicate a sensor that uses the same protocol but an
unsupported data format. This is a permanent error and therefore should
be logged IMO.

I guess, if you have a bad readout due to EMI but the checksum
accidentally matches, then you might get this message too. But this
should be a very rare case.

> They all are from a single callback and say the same thing -
> communication problem.

Not really. See above.

> If we make all those messages as errors it'd be great to have
> mechanism to disable them e.g. thru module parameter or somehow
> without rebuilding kernel.

No. What you try to change is cosmetic at best. It certainly doesn't
justify adding any complexity.
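
To illustrate what I mean above by handling transient read errors in
the user space client: a minimal sketch in C, assuming the sensor is
read through the IIO sysfs interface. The function name
read_attr_retry(), the retry count and the delay are made up for
illustration and are not part of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one integer from a sysfs-style attribute,
 * retrying when the read fails (e.g. with EIO or ETIMEDOUT).
 * Returns 0 and stores the value in *out, or the negative errno value
 * of the last failed attempt (-EINVAL if attempts <= 0).
 */
int read_attr_retry(const char *path, int attempts,
                    unsigned int delay_s, long *out)
{
        int err = -EINVAL;

        assert(path && out);

        for (int i = 0; i < attempts; i++) {
                char buf[32];
                ssize_t n;
                int fd;

                /* Wait before every retry, but not before the first try. */
                if (i > 0 && delay_s)
                        sleep(delay_s);

                fd = open(path, O_RDONLY);
                if (fd < 0) {
                        err = -errno;
                        continue;
                }

                n = read(fd, buf, sizeof(buf) - 1);
                /* Save the error before close() can overwrite errno. */
                err = n < 0 ? -errno : 0;
                close(fd);

                if (n > 0) {
                        buf[n] = '\0';
                        *out = strtol(buf, NULL, 10);
                        return 0;
                }
                if (n == 0)
                        err = -EIO;     /* empty read: treat as failure */
        }

        return err;
}
```

A client could call this as
read_attr_retry("/sys/bus/iio/devices/iio:device0/in_temp_input", 5, 2,
&val), where the device index 0 and the two second delay are only
assumptions about the setup. On success val holds the temperature in
millidegrees Celsius. A client that keeps getting errors after several
attempts probably has a setup problem rather than EMI.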

Since Jonathan deferred to my judgment:

As you can see, I carefully considered the trade-off between useful
diagnostics and spamming the log. So naturally I'm inclined to reject
your proposal unless it solves an actual problem.

Also, people still mail me directly with bogus bug reports about the
driver when really they have some issue with their setup. I fear that
reducing diagnostics will increase that noise.

So I reject your proposed changes if they are only for the sake of
unification. I'm willing to discuss what the most sensible trade-off
is, but it would need to actually add to the considerations I have
already made.

Best regards,
Harald

> Those errors can be bypassed by increasing read rate.
>
> >
> > I don't consider the dmesg buffer being rotated after a month or
> > two a bug. But I suppose this is a corner case. I'll happily accept
> > whatever Jonathan thinks is reasonable.
> >
> > Best regards,
> > Harald
> >
> >
> > > Signed-off-by: George Stark <gnstark@xxxxxxxxxxxxxxxxx>
> > > ---
> > > I use DHT22 sensor with Raspberry Pi Zero W as a simple home
> > > meteo station.
> > > Even if to poll the sensor once per tens of seconds after month
> > > or two dmesg may become full of useless parsing error messages.
> > > Anyway those errors are caught in the user software thru return
> > > values.
> > >
> > >  drivers/iio/humidity/dht11.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/iio/humidity/dht11.c b/drivers/iio/humidity/dht11.c
> > > index c97e25448772..e2cbc442177b 100644
> > > --- a/drivers/iio/humidity/dht11.c
> > > +++ b/drivers/iio/humidity/dht11.c
> > > @@ -156,7 +156,7 @@ static int dht11_decode(struct dht11 *dht11, int offset)
> > >                  dht11->temperature = temp_int * 1000;
> > >                  dht11->humidity = hum_int * 1000;
> > >          } else {
> > > -                dev_err(dht11->dev,
> > > +                dev_dbg(dht11->dev,
> > >                          "Don't know how to decode data: %d %d %d %d\n",
> > >                          hum_int, hum_dec, temp_int, temp_dec);
> > >                  return -EIO;
> > > @@ -239,7 +239,7 @@ static int dht11_read_raw(struct iio_dev *iio_dev,
> > >  #endif
> > >
> > >                  if (ret == 0 && dht11->num_edges < DHT11_EDGES_PER_READ - 1) {
> > > -                        dev_err(dht11->dev, "Only %d signal edges detected\n",
> > > +                        dev_dbg(dht11->dev, "Only %d signal edges detected\n",
> > >                                  dht11->num_edges);
> > >                          ret = -ETIMEDOUT;
> > >                  }
> > > --
> > > 2.25.1
> > >
> > >