On 11/21/2024 07:15, Rafael J. Wysocki wrote:
From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
As stated by Len in [1], the extra delay added by msleep() to the
sleep time value passed to it can be significant, roughly between
1.5 ns on systems with HZ = 1000 and as much as 15 ms on systems with
HZ = 100, which is hardly acceptable, at least for small sleep time
values.
Address this by using usleep_range() in acpi_os_sleep() instead of
msleep(). For short sleep times this is a no-brainer, but even for
long sleeps usleep_range() should be preferred because timer wheel
timers are optimized for cancellation before they expire and this
particular timer is not going to be canceled.
Add at least 50 us on top of the requested sleep time in case the
timer can be subject to coalescing, which is consistent with what's
done in user space in this context [2], but for sleeps longer than 5 ms
use 1% of the requested sleep time for this purpose.
The rationale here is that longer sleeps don't need that much of a timer
precision as a rule and making the timer a more likely candidate for
coalescing in these cases is generally desirable. It starts at 5 ms so
that the delta between the requested sleep time and the effective
deadline is a contiuous function of the former.
Link: https://lore.kernel.org/linux-pm/c7db7e804c453629c116d508558eaf46477a2d73.1731708405.git.len.brown@xxxxxxxxx/ [1]
Link: https://lore.kernel.org/linux-pm/CAJvTdK=Q1kwWA6Wxn8Zcf0OicDEk6cHYFAvQVizgA47mXu63+g@xxxxxxxxxxxxxx/ [2]
Reported-by: Len Brown <lenb@xxxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
You probably should also pick up this tag from the earlier version.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216263
---
This is a follow-up to the discussion started by [1] above and since
the beginning of it I have changed my mind a bit, as you can see.
Given Arjan's feedback, I've concluded that using usleep_range() for
all sleep values is the right choice and that some slack should be
used there. I've taken 50 us as the minimum value of it because that's
what is used in user space FWICT and I'm not convinced that shorter
values would be suitable here.
The other part, using 1% of the sleep time as the slack for longer
sleeps, is likely more controversial. It is roughly based on the
observation that if one timer interrupt is sufficient for something,
then using two of them will be wasteful even if this is just somewhat.
Anyway, please let me know what you think. I'd rather do whatever
the majority of you are comfortable with.
Generally I'm fine with this.
I'm about to head on US holiday, but I will forward this to folks that
aren't and get some testing input on it to bring back later when I'm back.
---
drivers/acpi/osl.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -607,7 +607,27 @@ acpi_status acpi_os_remove_interrupt_han
void acpi_os_sleep(u64 ms)
{
- msleep(ms);
+ u64 usec = ms * USEC_PER_MSEC, delta_us = 50;
+
+ /*
+ * Use a hrtimer because the timer wheel timers are optimized for
+ * cancellation before they expire and this timer is not going to be
+ * canceled.
+ *
+ * Set the delta between the requested sleep time and the effective
+ * deadline to at least 50 us in case there is an opportunity for timer
+ * coalescing.
+ *
+ * Moreover, longer sleeps can be assumed to need somewhat less timer
+ * precision, so sacrifice some of it for making the timer a more likely
+ * candidate for coalescing by setting the delta to 1% of the sleep time
+ * if it is above 5 ms (this value is chosen so that the delta is a
+ * continuous function of the sleep time).
+ */
+ if (ms > 5)
+ delta_us = (USEC_PER_MSEC / 100) * ms;
+
+ usleep_range(usec, usec + delta_us);
}
void acpi_os_stall(u32 us)