On 3/12/23 16:43, Jonathan Cameron wrote:
On Mon, 06 Mar 2023 22:15:15 +0100
"Andres Heinloo" <andres@xxxxxxxxxxxxxx> wrote:
On Sun, 05 Mar 2023 03:05:01 +0100
"Andres Heinloo" <andres@xxxxxxxxxxxxxx> wrote:
On Sat, 4 Mar 2023 17:06:20 +0000
Jonathan Cameron <jic23@xxxxxxxxxx> wrote:
On Fri, 03 Mar 2023 12:10:00 +0100
"Andres Heinloo" <andres@xxxxxxxxxxxxxx> wrote:
Hello,
I've been struggling with the dps310 driver, which gives incorrect
pressure values and in particular different values than manufacturers
code (https://github.com/Infineon/RaspberryPi_DPS).
I think I've found where the problem is. Firstly, there is a mistake
in bit numbering at
https://github.com/torvalds/linux/blob/857f1268a591147f7be7509f249dbb3aba6fc65c/drivers/iio/pressure/dps310.c#L51
According to datasheet, correct is:
#define DPS310_INT_HL BIT(7)
#define DPS310_TMP_SHIFT_EN BIT(3)
#define DPS310_PRS_SHIFT_EN BIT(2)
#define DPS310_FIFO_EN BIT(1)
#define DPS310_SPI_EN BIT(0)
Eg., the current code is using wrong bit (4) for
DPS310_PRS_SHIFT_EN, which means that pressure shift is never
enabled.
Checking the datasheet, seems like you are right.
https://www.infineon.com/dgdl/Infineon-DPS310-DataSheet-v01_02-EN.pdf?fileId=5546d462576f34750157750826c42242
Section 7:
Though that's not the only bit that is wrong. Looks like FIFO
enable is as well.
So any fix should deal with that as well.
Yes, DPS310_PRS_SHIFT_EN, DPS310_FIFO_EN, DPS310_SPI_EN are all
wrong, but the latter 2 are not used by the driver.
The differences between the register map and the datasheet I'm
looking at make
me think that perhaps the driver was developed against a prototype
part.
The registers are in a different order for starters with the B0, B1
and B2
sets in reverse order. Any fix patch should tidy that up as well.
Yes, but that's just different naming. MSB is called B2 in the
datasheet and B0 in the driver.
Secondly, there is a problem with overflows starting at
https://github.com/torvalds/linux/blob/857f1268a591147f7be7509f249dbb3aba6fc65c/drivers/iio/pressure/dps310.c#L654
Since p is a 24-bit value,
nums[3] = p * p * p * (s64)data->c30;
can and does overflow.
Makes sense, though I can't immediately see a good solution as we
need
to maintain the remainder part.
I don't have a good solution either, but there must be other IIO
sensors that have something similar that could be possibly reused.
Second overflow problem is at
https://github.com/torvalds/linux/blob/857f1268a591147f7be7509f249dbb3aba6fc65c/drivers/iio/pressure/dps310.c#L684
In fact, I don't understand why 1000000000LL is needed. Since only 7
values are summed, using 10LL should give the same precision.
Whilst the existing value seems large - I'm not great with
precision calcs so could
you lay out why 10LL is sufficient?
Unless I overlooked something, the error of integer division (eg.,
discarding fractional part) is <1. In this case, the results of 7
integer divisions are summed, so the error is <7. When multiplying
numerators by 10LL, the error would be <0.7. Which is OK, since we
are interested only in the integer part.
Unfortunately it seems that there may be more problems with the
driver. I've noticed that my device has lost sample rate and precision
settings few times. When looking at the code, it seems the settings
are not restored when dps310_reset_reinit() is called at
https://github.com/torvalds/linux/blob/8ca09d5fa3549d142c2080a72a4c70ce389163cd/drivers/iio/pressure/dps310.c#L438
Agreed. Looks like that stuff is missing. This reset path is pretty horrible
in general, so I'm not surprised a few things got missed. Should be easy enough
to add a cache of those and and set them again if you want to try that.
One option would be adding sample_rate and oversampling_ratio to
dps310_data struct, but some questions remain.
I think doing a silent reset is not good, because it can cause bad
samples. When looking at data, you don't know if there was an actual
pressure spike or if it was caused by a reset. I think a better solution
would be immediately returning an error (-EIO?) to inform user space
about an intermittent failure.
(I suspect that in case of some errors, the IIO subsystem retries
internally and hides the error.)
The purpose of timeout_recovery_failed seems questionable, because if
you end up with timeout_recovery_failed == true, the only way to recover
from timeout is reloading the kernel module.
I'm afraid I'm not able to provide a patch to fix the problems soon.
Apparently I've also ended up with data->timeout_recovery_failed ==
true at
https://github.com/torvalds/linux/blob/8ca09d5fa3549d142c2080a72a4c70ce389163cd/drivers/iio/pressure/dps310.c#L443,
but the device worked fine after just reloading the kernel module.
This is difficult to reproduce, though.
Hmm. There are a few things that could cause that. Maybe something running slow, or
an intermittent write failure (noisy environment or bad wire / track maybe?)
(Off-topic) Write failures is a different problem that is very annoying.
I'm not sure if this could be the case with dps310, but I have an
ADXL355 device that is connected via SPI and often register writes fail
silently. I have a short cable that is well connected and SPI should be
running at max 1Mhz according to .../of_node/spi-max-frequency. I ended
up wrapping regmap functions to check register value after each write. I
haven't noticed errors when reading registers. (I'm using Raspberry Pi.)
Andres