From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> As stated by Len in [1], the extra delay added by msleep() to the sleep time value passed to it can be significant, roughly between 1.5 ns on systems with HZ = 1000 and as much as 15 ms on systems with HZ = 100, which is hardly acceptable, at least for small sleep time values. Address this by using usleep_range() in acpi_os_sleep() instead of msleep(). For short sleep times this is a no-brainer, but even for long sleeps usleep_range() should be preferred because timer wheel timers are optimized for cancellation before they expire and this particular timer is not going to be canceled. Add at least 50 us on top of the requested sleep time in case the timer can be subject to coalescing, which is consistent with what's done in user space in this context [2], but for sleeps longer than 5 ms use 1% of the requested sleep time for this purpose. The rationale here is that longer sleeps don't need that much of a timer precision as a rule and making the timer a more likely candidate for coalescing in these cases is generally desirable. It starts at 5 ms so that the delta between the requested sleep time and the effective deadline is a contiuous function of the former. Link: https://lore.kernel.org/linux-pm/c7db7e804c453629c116d508558eaf46477a2d73.1731708405.git.len.brown@xxxxxxxxx/ [1] Link: https://lore.kernel.org/linux-pm/CAJvTdK=Q1kwWA6Wxn8Zcf0OicDEk6cHYFAvQVizgA47mXu63+g@xxxxxxxxxxxxxx/ [2] Reported-by: Len Brown <lenb@xxxxxxxxxx> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> --- This is a follow-up to the discussion started by [1] above and since the beginning of it I have changed my mind a bit, as you can see. Given Arjan's feedback, I've concluded that using usleep_range() for all sleep values is the right choice and that some slack should be used there. I've taken 50 us as the minimum value of it because that's what is used in user space FWICT and I'm not convinced that shorter values would be suitable here. The other part, using 1% of the sleep time as the slack for longer sleeps, is likely more controversial. It is roughly based on the observation that if one timer interrupt is sufficient for something, then using two of them will be wasteful even if this is just somewhat. Anyway, please let me know what you think. I'd rather do whatever the majority of you are comfortable with. --- drivers/acpi/osl.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) Index: linux-pm/drivers/acpi/osl.c =================================================================== --- linux-pm.orig/drivers/acpi/osl.c +++ linux-pm/drivers/acpi/osl.c @@ -607,7 +607,27 @@ acpi_status acpi_os_remove_interrupt_han void acpi_os_sleep(u64 ms) { - msleep(ms); + u64 usec = ms * USEC_PER_MSEC, delta_us = 50; + + /* + * Use a hrtimer because the timer wheel timers are optimized for + * cancellation before they expire and this timer is not going to be + * canceled. + * + * Set the delta between the requested sleep time and the effective + * deadline to at least 50 us in case there is an opportunity for timer + * coalescing. + * + * Moreover, longer sleeps can be assumed to need somewhat less timer + * precision, so sacrifice some of it for making the timer a more likely + * candidate for coalescing by setting the delta to 1% of the sleep time + * if it is above 5 ms (this value is chosen so that the delta is a + * continuous function of the sleep time). + */ + if (ms > 5) + delta_us = (USEC_PER_MSEC / 100) * ms; + + usleep_range(usec, usec + delta_us); } void acpi_os_stall(u32 us)