Christoph Lameter (Ampere) <cl@xxxxxxxxxx> writes:

> On Thu, 7 Nov 2024, Ankur Arora wrote:
>
>> > Calling the clock retrieval function repeatedly should be fine and is
>> > typically done in user space as well as in kernel space for functions that
>> > need to wait short time periods.
>>
>> The problem is that you might have multiple CPUs polling in idle
>> for prolonged periods of time. And, so you want to minimize
>> your power/thermal envelope.
>
> On ARM that maps to YIELD which does not do anything for the power
> envelope AFAICT. It switches to the other hyperthread.

Agreed. For arm64 patch-5 adds a specialized version. For the fallback
case when we don't have an event stream, the arm64 version does use the
same cpu_relax() loop but that's not a production thing.

>> For instance see commit 4dc2375c1a4e "cpuidle: poll_state: Avoid
>> invoking local_clock() too often" which originally added a similar
>> rate limit to poll_idle() where they saw exactly that issue.
>
> Looping w/o calling local_clock may increase the wait period etc.

Yeah. I don't think that's a real problem for the poll_idle() case as
the only thing waiting on the other side of the possibly delayed timer
is a deeper idle state.

But, for any other potential users the looping duration might be too
long (the generated code for x86 will execute around 200 * 7 instructions
before checking the timer, so a worst case delay of say around 1-2us.)

I'll note that in the comment around smp_cond_time_check_count just to
warn any future users.

> For power saving most arches have special instructions like ARMS
> WFE/WFET. These are then causing more accurate wait times than the looping
> thing?

Definitely true for WFET. The WFE can still overshoot because the
event stream has a period of 100us.

--
ankur
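
For readers following the thread, the rate-limited timer check discussed
above can be sketched in user-space C. This is only an illustration of
the general pattern, not the kernel's poll_idle() or the code in the
patch series: SPIN_LIMIT, relax(), now_ns() and poll_flag_timeout() are
hypothetical names, clock_gettime() stands in for local_clock(), and the
value 200 is simply borrowed from the figure quoted in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical spin budget between clock reads (assumption, see lead-in). */
#define SPIN_LIMIT 200

/* Monotonic time in nanoseconds; a user-space stand-in for local_clock(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Portable placeholder for cpu_relax(): a compiler barrier only. */
static inline void relax(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/*
 * Spin until *flag becomes nonzero or timeout_ns elapses.
 * The clock is read once up front and then only after every SPIN_LIMIT
 * iterations, so the timeout can overshoot by up to SPIN_LIMIT spins.
 * *clock_reads reports how many times the clock was actually read.
 */
static bool poll_flag_timeout(const volatile int *flag, uint64_t timeout_ns,
			      unsigned long *clock_reads)
{
	uint64_t deadline;
	unsigned int spins = 0;

	assert(flag && clock_reads);

	deadline = now_ns() + timeout_ns;
	*clock_reads = 1;

	for (;;) {
		if (*flag)
			return true;

		relax();

		if (++spins < SPIN_LIMIT)
			continue;	/* skip the clock read this time */

		spins = 0;
		(*clock_reads)++;
		if (now_ns() >= deadline)
			return false;
	}
}
```

The trade-off is the one raised in the thread: fewer clock reads per
unit of time spent polling, at the cost of a timeout that may be
observed up to SPIN_LIMIT iterations late.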