On Fri, Nov 08, 2024 at 11:41:08AM -0800, Christoph Lameter (Ampere) wrote:
> On Thu, 7 Nov 2024, Ankur Arora wrote:
> > > Calling the clock retrieval function repeatedly should be fine and is
> > > typically done in user space as well as in kernel space for functions that
> > > need to wait short time periods.
> >
> > The problem is that you might have multiple CPUs polling in idle
> > for prolonged periods of time. And, so you want to minimize
> > your power/thermal envelope.
>
> On ARM that maps to YIELD which does not do anything for the power
> envelope AFAICT. It switches to the other hyperthread.

The issue is not necessarily arm64 but poll_idle() on other
architectures like x86 where, at the end of this series, they still call
cpu_relax() in a loop and check local_clock() every 200 or so
iterations. So I wouldn't want to revert the improvement in 4dc2375c1a4e
("cpuidle: poll_state: Avoid invoking local_clock() too often").

I agree that the 200 iterations here is pretty random. It was something
made up for poll_idle() specifically, and it could increase the wait
period in other situations (or on other architectures).

OTOH, I'm not sure we want to make this API too complex if the only user
for a while would be poll_idle(). We could add a comment that the
timeout granularity can be pretty coarse and architecture dependent (200
cpu_relax() calls in one deployment, 100us on arm64 with WFE).

-- 
Catalin
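
For reference, the pattern discussed above (spin with cpu_relax() and
only look at the clock every 200 or so iterations) can be sketched in
user space roughly as below. This is an illustration only, not the
kernel code: now_ns(), relax(), poll_until() and RELAX_COUNT are made-up
stand-ins for local_clock(), cpu_relax(), the poll_idle() loop and its
iteration count, and relax() here is just a compiler barrier.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the "200 or so iterations" in poll_idle(). */
#define RELAX_COUNT 200

/* User-space stand-in for local_clock(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Stand-in for cpu_relax(): a compiler barrier, no architecture hint. */
static inline void relax(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/*
 * Spin until *flag is set or more than limit_ns has elapsed.
 * Returns true if the flag was observed set, false on timeout.
 *
 * The clock is read only once per RELAX_COUNT iterations, so the
 * timeout can overshoot by the cost of up to RELAX_COUNT relax()
 * calls. That is the coarse, architecture-dependent granularity
 * mentioned above.
 */
static bool poll_until(const volatile bool *flag, uint64_t limit_ns)
{
	uint64_t start = now_ns();
	unsigned int loops = 0;

	while (!*flag) {
		relax();
		if (++loops < RELAX_COUNT)
			continue;

		loops = 0;
		if (now_ns() - start > limit_ns)
			return false;
	}

	return true;
}
```

Calling poll_until(&flag, 1000000) would spin for at least 1ms if nobody
sets the flag. How far past 1ms it runs depends on how long RELAX_COUNT
relax() calls take on the machine in question.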