From: Mihai Carabas <mihai.carabas@xxxxxxxxxx>

The inner loop in poll_idle() polls up to POLL_IDLE_RELAX_COUNT times,
checking to see if the thread has the TIF_NEED_RESCHED bit set. The
loop exits once the condition is met, or if the poll time limit has
been exceeded.

To minimize the number of instructions executed each iteration, the
time check is done only infrequently (once every POLL_IDLE_RELAX_COUNT
iterations). In addition, each loop iteration executes cpu_relax(),
which on certain platforms provides a hint to the pipeline that the
loop is busy-waiting, thus allowing the processor to reduce power
consumption.

However, cpu_relax() is defined optimally only on x86. On arm64, for
instance, it is implemented as a YIELD, which only serves as a hint to
the CPU that it should prioritize a different hardware thread if one is
available. arm64, however, does expose a more optimal polling mechanism
via smp_cond_load_relaxed(), which uses LDXR, WFE to wait for a store
to a specified region.

So restructure the loop, folding both checks into
smp_cond_load_relaxed(). Also, move the time check to the head of the
loop, allowing it to exit straight away once TIF_NEED_RESCHED is set.
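As an illustration only, the restructured loop can be modelled in
userspace with C11 atomics. This is a sketch, not the kernel code:
NEED_RESCHED_BIT, RELAX_COUNT, cond_load_relaxed(), fake_clock() and
poll_until_resched() are hypothetical stand-ins for _TIF_NEED_RESCHED,
POLL_IDLE_RELAX_COUNT, smp_cond_load_relaxed(), local_clock_noinstr()
and the body of poll_idle(). The stand-in always spins; it does not
model the arm64 LDXR/WFE wait.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel names used in the patch. */
#define NEED_RESCHED_BIT	(1u << 0)	/* models _TIF_NEED_RESCHED */
#define RELAX_COUNT		200u		/* models POLL_IDLE_RELAX_COUNT */

/*
 * Models smp_cond_load_relaxed() for this one condition: reload *flags
 * until the resched bit is set or the spin budget is used up. Returns
 * the last value loaded.
 */
static unsigned int cond_load_relaxed(_Atomic unsigned int *flags,
				      unsigned int *spins)
{
	unsigned int val;

	for (;;) {
		val = atomic_load_explicit(flags, memory_order_relaxed);
		if ((val & NEED_RESCHED_BIT) || (*spins)++ >= RELAX_COUNT)
			break;
		/* The kernel's generic fallback calls cpu_relax() here. */
	}
	return val;
}

/* Hypothetical clock for the sketch: advances one tick per call. */
static uint64_t fake_clock(void)
{
	static uint64_t ticks;

	return ticks++;
}

/*
 * Models the restructured loop: time check at the head, then a bounded
 * conditional wait. Returns true if the poll time limit was exceeded,
 * false if the resched bit was observed.
 */
static bool poll_until_resched(_Atomic unsigned int *flags,
			       uint64_t (*now)(void), uint64_t limit)
{
	uint64_t start = now();

	while (!(atomic_load_explicit(flags, memory_order_relaxed) &
		 NEED_RESCHED_BIT)) {
		unsigned int spins = 0;

		if (now() - start > limit)
			return true;

		cond_load_relaxed(flags, &spins);
	}
	return false;
}
```

With the bit already set, poll_until_resched(&flags, fake_clock, 10)
returns false without reading the clock a second time. With the bit
clear, it returns true once more than 10 ticks have elapsed, having
performed one bounded wait per time check.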
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Mihai Carabas <mihai.carabas@xxxxxxxxxx>
Reviewed-by: Christoph Lameter <cl@xxxxxxxxx>
Reviewed-by: Misono Tomohiro <misono.tomohiro@xxxxxxxxxxx>
Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
---
 drivers/cpuidle/poll_state.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..fc1204426158 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
 
 	raw_local_irq_enable();
 	if (!current_set_polling_and_test()) {
-		unsigned int loop_count = 0;
 		u64 limit;
 
 		limit = cpuidle_poll_time(drv, dev);
 
 		while (!need_resched()) {
-			cpu_relax();
-			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
-				continue;
-
-			loop_count = 0;
+			unsigned int loop_count = 0;
 			if (local_clock_noinstr() - time_start > limit) {
 				dev->poll_time_limit = true;
 				break;
 			}
+
+			smp_cond_load_relaxed(&current_thread_info()->flags,
+					      VAL & _TIF_NEED_RESCHED ||
+					      loop_count++ >= POLL_IDLE_RELAX_COUNT);
 		}
 	}
 	raw_local_irq_disable();
-- 
2.43.5