Re: [PATCH v1] wifi: mac80211: Initialize EWMA fail avg to 1

Benjamin Beichler <Benjamin.Beichler@xxxxxxxxxxxxxx> · Fri, 21 Apr 2023 12:23:12 +0200

Am 20.04.2023 um 15:15 schrieb Felix Fietkau:
On 20. Apr 2023, at 15:00, Benjamin Beichler<Benjamin.Beichler@xxxxxxxxxxxxxx>  wrote:

Am 20.04.2023 um 14:22 schrieb Felix Fietkau:
On 20.04.23 12:27, Johannes Berg wrote:
On Thu, 2023-04-20 at 11:30 +0200, Benjamin Beichler wrote:
To me, basically, I see two ways to solve this:

1) we have DECLARE_EWMA_ZERO_VALID() or something like that which
     *doesn't* treat 0 as an uninitialized value, and either has a
     separate "not initialized yet" bit (but that's iffy storage wise),
     or simply has another argument to _init() for the initial value or
     so.

2) you don't just don't use 0 and 100 but say 1 and 100, that results in
     basically the same behaviour, but avoids the special 0.

johannes
I also ran into that problem in the past, and reviewing it again with a
college, I think, this is a real bug in the EWMA implementation. I try
to provide a proper patch in the next days, but actually the EWMA
handles the internal value zero, always like in the initialization,
which is wrong, e.g., for positive/negative averaged values.
Yes, it's always wrong as long as you feed it something zero, or values
with different sign.

For a lot of use cases, however, that doesn't matter. Originally, it was
used e.g. for signal strength averaging, average packet lengths, etc.
where it really doesn't matter since you can never use 0 or values that
have different sign.

A quick research shows, this bug is since the first implementation of
the ewma in the code ...

Yeah, I'm aware of that, I was around for it ;-)

But see above, I'm not sure I'd even call it a bug, at least not
originally with the users that we had intended.

Hence I don't know if it's really good to fix this in general - for many
of these cases zero can still be treated specially (and like I mentioned
in my previous email, we can even here avoid 0), and then we don't spend
an extra byte (or likely 4) to hold a "first time" flag.

Dunno. Maybe it's not worth thinking about the extra memory space vs.
the extra maintenance cost. But maybe at least on 64-bit we could steal
a bit from the unsigned long? Not sure what all the users are..
We don't actually need a full bit. We can just add 1 to the internal
value for initialized values. How about this (completely untested):
https://nbd.name/p/69b00c5b

Nice hack, but even a bit more dirty than before :-D
I disagree. With my “hack” at least the implementation will do the right thing without the API user having to worry about 0 as a value. It could use some comments to clarify the details, but I do think it makes things easier.

I would like to cite Johannes here:

I mean, I guess I can see where this patch makes some sense like from a
code point of view, this is sort of the minimal code change you could
make to make the existing code work, but ... I'd argue you're optimising
to the wrong metric here, "minimal code changes to fix the bug" should
normally not be your metric, it should be "code changes that make this
clearer and avoid the problem", or something like that?
I agree that you solve the problem, but it is not clearer. And as a user I expect to have a classical EWMA, which I initialize with a proper value and then use it with normal inputs (where I include zero and negative numbers).
From my understanding, you shifted the problem only to one digit before zero, which may even not solve it for the mesh case here (but maybe I did not catch it right).

I may also argue, that you introduce 2 additions into the hot path to avoid a separate init function, which is mostly called once. And it even introduces non-intuitive behavior in corner cases.
However, even the mostly unused branch on the value of internal for initialization is suboptimal for the hot path and would be an indicator for a separate init function.

From my dig into the usages, most do not care for initialization or do it explicitly with bogus add calls, which can be simply replaced.

Benjamin