Am 07.02.23 um 11:33 schrieb harald@xxxxxxxxx:
Thanks, these are indeed interresting results. If you want to move this forward, the next steps would be: 1) Sharing your test script - yes it's trivial, but still ...
That appears fairly easy, you find the script here: https://github.com/pelzvieh/banana_resources/tree/main/test_dht11
2) A theoretical analysis about possible regressions depending on timer resolution as mentioned in an earlier message.
This sounds as if you were doing such an analysis for the original version. Can you share this work so I can attempt to repeat it for the modified algorithm?
3) Ideally figuring out, why your version performs better then what we currently have. I have some suspicions, but better understanding might lead to a better approach. E.g. maybe recording the other edges isn't the problem so long as we ignore them during decoding? As I see it, the main thing we are losing with your current proposal is some diagnostic features. If we keep them as much as possible and have regressions understood and covered, I see no reason to reject your idea.
That's why I changed the script to separately count EIO and ETIMEDOUT. The latter indicates missed edges, the former failure to interpret the data read. What I see is that the patched driver's errors mostly result from missed IRQ (note in contrast to last results, I cut the number of reads): # real[s] user[s] sys[s] success EIO timeout err per succ 1 20.57 0.25 0.03 10 0 0 0 2 24.74 0.25 0.07 10 0 4 0,4 3 21.55 0.20 0.07 10 0 0 0 4 25.81 0.25 0.08 10 0 5 0,5 5 21.56 0.23 0.05 10 0 0 0 6 21.58 0.22 0.05 10 1 0 0,1 7 25.86 0.24 0.08 10 1 5 0,6 8 22.69 0.27 0.05 10 1 1 0,2 9 23.67 0.26 0.04 10 0 2 0,2 10 20.55 0.23 0.04 10 0 0 0 Whereas the original driver has more errors resulting from mis-interpreted data: # real[s] user[s] sys[s] success EIO timeout err per succ 1 24.88 0.26 0.07 10 5 4 0,9 2 25.91 0.26 0.07 10 4 5 0,9 3 31.27 0.31 0.10 10 6 10 1,6 4 29.17 0.32 0.11 10 7 8 1,5 5 22.73 0.24 0.08 10 4 2 0,6 6 46.46 0.35 0.25 10 19 24 4,3 7 23.79 0.23 0.09 10 3 3 0,6 8 30.17 0.27 0.11 10 6 9 1,5 9 23.77 0.26 0.06 10 3 2 0,5 10 20.58 0.24 0.06 10 1 0 0,1 I tried a variant that reads falling and rising edges and uses the redundany of information to eliminate some errors. This did not work out at all. It seems a relevant source of trouble is delayed call to the IRQ handler. The problem is that only then you try to find out if this IRQ is due to rising or falling edge by reading the current GPIO level. When you are to late, this might already have changed and you read a level, but for the edge of _this_ level you'll receive another IRQ a few us later. So the reason that this patch here is showing lower error rates seems to be the lower probability of such things happening by halving the IRQs to be handled, _plus_ the information from the hardware, that this IRQ was due to a falling edge. Yours, Andreas.