Catalin Marinas <catalin.marinas@xxxxxxx> writes:

> On Fri, Nov 08, 2024 at 11:41:08AM -0800, Christoph Lameter (Ampere) wrote:
>> On Thu, 7 Nov 2024, Ankur Arora wrote:
>> > > Calling the clock retrieval function repeatedly should be fine and is
>> > > typically done in user space as well as in kernel space for functions that
>> > > need to wait short time periods.
>> >
>> > The problem is that you might have multiple CPUs polling in idle
>> > for prolonged periods of time. And, so you want to minimize
>> > your power/thermal envelope.
>>
>> On ARM that maps to YIELD which does not do anything for the power
>> envelope AFAICT. It switches to the other hyperthread.
>
> The issue is not necessarily arm64 but poll_idle() on other
> architectures like x86 where, at the end of this series, they still call
> cpu_relax() in a loop and check local_clock() every 200 times or so
> iterations. So I wouldn't want to revert the improvement in 4dc2375c1a4e
> ("cpuidle: poll_state: Avoid invoking local_clock() too often").
>
> I agree that the 200 iterations here it's pretty random and it was
> something made up for poll_idle() specifically and it could increase the
> wait period in other situations (or other architectures).
>
> OTOH, I'm not sure we want to make this API too complex if the only
> user for a while would be poll_idle(). We could add a comment that the
> timeout granularity can be pretty coarse and architecture dependent (200
> cpu_relax() calls in one deployment, 100us on arm64 with WFE).

Yeah, agreed. Not worth over engineering this interface at least not until
there are other users. For now I'll just add a comment mentioning that
the time-check is only coarse grained and architecture dependent.

--
ankur
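
For illustration only, the coarse-grained time check discussed above can be sketched in userspace C. This is not the kernel's poll_idle() code nor the API proposed in this series: spin_until(), relax_hint(), now_ns(), flag_is_set() and SPIN_CHECK_INTERVAL are hypothetical names, the compiler barrier merely stands in for cpu_relax(), and clock_gettime() stands in for local_clock(). The value 200 mirrors the iteration count mentioned in the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical constant: read the clock only once per this many spins. */
#define SPIN_CHECK_INTERVAL 200u

/* Stand-in for cpu_relax(): a compiler barrier only, no CPU hint. */
static inline void relax_hint(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Stand-in for local_clock(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin until cond(arg) returns true or at least timeout_ns has elapsed.
 * The clock is consulted only every SPIN_CHECK_INTERVAL iterations, so
 * the effective timeout is coarse: it can overshoot by the time those
 * iterations take on the machine at hand.
 *
 * Returns true if the condition was observed, false on timeout.
 */
bool spin_until(bool (*cond)(void *), void *arg, uint64_t timeout_ns)
{
	uint64_t start = now_ns();
	unsigned int spins = 0;

	assert(cond != NULL);

	for (;;) {
		if (cond(arg))
			return true;

		relax_hint();

		if (++spins < SPIN_CHECK_INTERVAL)
			continue;

		/* Only now pay for a clock read. */
		spins = 0;
		if (now_ns() - start >= timeout_ns)
			return false;
	}
}

/* Example condition: true once the int pointed to by arg is non-zero. */
bool flag_is_set(void *arg)
{
	return *(volatile int *)arg != 0;
}
```

A caller would pass a condition and a budget, e.g. spin_until(flag_is_set, &flag, 100000) for roughly 100us. Because the deadline is checked only once per SPIN_CHECK_INTERVAL iterations, even a zero timeout spins that many times before giving up, which is the coarse, machine-dependent granularity the comment mentioned above would need to document.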