Hi Ulf, On Fri, May 12, 2023 at 9:54 AM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > On Thu, 11 May 2023 at 14:44, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote: > > On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote: > > > On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven > > > <geert+renesas@xxxxxxxxx> wrote: > > > > read_poll_timeout_atomic() uses ktime_get() to implement the timeout > > > > feature, just like its non-atomic counterpart. However, there are > > > > several issues with this, due to its use in atomic contexts: > > > > > > > > 1. When called in the s2ram path (as typically done by clock or PM > > > > domain drivers), timekeeping may be suspended, triggering the > > > > WARN_ON(timekeeping_suspended) in ktime_get(): > > > > > > > > WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78 > > > > > > > > Calling ktime_get_mono_fast_ns() instead of ktime_get() would get > > > > rid of that warning. However, that would break timeout handling, > > > > as (at least on systems with an ARM architectured timer), the time > > > > returned by ktime_get_mono_fast_ns() does not advance while > > > > timekeeping is suspended. > > > > Interestingly, (on the same ARM systems) the time returned by > > > > ktime_get() does advance while timekeeping is suspended, despite > > > > the warning. > > > > > > Interesting, looks like we should spend some time to further > > > investigate this behaviour. > > > > Probably, I was a bit surprised by this behavior, too. > > > > > > 2. Depending on the actual clock source, and especially before a > > > > high-resolution clocksource (e.g. the ARM architectured timer) > > > > becomes available, time may not advance in atomic contexts, thus > > > > breaking timeout handling. > > > > > > > > Fix this by abandoning the idea that one can rely on timekeeping to > > > > implement timeout handling in all atomic contexts, and switch from a > > > > global time-based to a locally-estimated timeout handling. In most > > > > (all?) cases the timeout condition is exceptional and an error > > > > condition, hence any additional delays due to underestimating wall clock > > > > time are irrelevant. > > > > > > I wonder if this isn't an oversimplification of the situation. Don't > > > we have timeout-error-conditions that we expected to happen quite > > > frequently? > > > > We may have some. But they definitely do not happen when time > > does not advance, or they would have been mitigated long ago > > (the loop would never terminate). > > Right, I was merely thinking of the case when ktime isn't suspended, > which of course is the most common case. > > > > > > If so, in these cases, we really don't want to continue looping longer > > > than actually needed, as then we will remain in the atomic context > > > longer than necessary. > > > > > > I guess some information about how big these additional delays could > > > be, would help to understand better. Of course, it's not entirely easy > > > to get that data, but did you run some tests to see how this changes? > > > > I did some timings (when timekeeping is available), and the differences > > are rather minor. The delay and timeout parameters are in µs, and > > 1 µs is already a few orders of magnitude larger than the cycle time > > of a contemporary CPU. > > Ohh, I was certainly expecting a bigger spread. If it's in that > ballpark we should certainly be fine. > > I will run some tests at my side too, as I am curious to see the > behaviour. I will let you know, whatever the result is, of course. > > > > > Under-estimates are due to the time spent in op() (depends on the > > user, typical use is a hardware device register read), udelay() > > (architecture/platform-dependent accuracy), and general loop overhead. > > Yes, you are right. My main concern is the accuracy of the udelay, but > I may be totally wrong here. > > > > > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx> > > > > --- > > > > Alternatively, one could use a mixed approach (use both > > > > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the > > > > earliest occasion), but I think that would complicate things without > > > > much gain. > > > > > > Another option could be to provide two different polling APIs for the > > > atomic use-case. > > > > > > One that keeps using ktime, which is more accurate and generally > > > favourable - and another, along the lines of what you propose, that > > > should be used by those that can't rely on timekeeping. > > > > At the risk of people picking the wrong one, leading to hard to > > find bugs? > > I agree, If we don't need two APIs, it's certainly better to stick with one. > > My main point is that we should not sacrifice "performance" for the > most common case, just to keep things simple, right? Most of these loops run just 1 or 2 cycles. Performance mostly kicks in when timing out, but note that not calling ktime_get() also reduces loop overhead... Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds