Re: [PATCH v8 01/11] cpuidle/poll_state: poll via smp_cond_load_relaxed()

On Tue, 2024-10-15 at 13:04 +0100, Catalin Marinas wrote:
> On Wed, Sep 25, 2024 at 04:24:15PM -0700, Ankur Arora wrote:
> > diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
> > index 9b6d90a72601..fc1204426158 100644
> > --- a/drivers/cpuidle/poll_state.c
> > +++ b/drivers/cpuidle/poll_state.c
> > @@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
> > 
> >       raw_local_irq_enable();
> >       if (!current_set_polling_and_test()) {
> > -             unsigned int loop_count = 0;
> >               u64 limit;
> > 
> >               limit = cpuidle_poll_time(drv, dev);
> > 
> >               while (!need_resched()) {
> > -                     cpu_relax();
> > -                     if (loop_count++ < POLL_IDLE_RELAX_COUNT)
> > -                             continue;
> > -
> > -                     loop_count = 0;
> > +                     unsigned int loop_count = 0;
> >                       if (local_clock_noinstr() - time_start > limit) {
> >                               dev->poll_time_limit = true;
> >                               break;
> >                       }
> > +
> > +                     smp_cond_load_relaxed(&current_thread_info()->flags,
> > +                                           VAL & _TIF_NEED_RESCHED ||
> > +                                           loop_count++ >= POLL_IDLE_RELAX_COUNT);
> 
> The above is not guaranteed to make progress if _TIF_NEED_RESCHED is
> never set. With the event stream enabled on arm64, the WFE will
> eventually be woken up, loop_count will be incremented, and the
> condition will become true. However, the smp_cond_load_relaxed()
> semantics require that a different agent updates the variable being
> waited on, not that the waiting CPU updates it itself. Also note that
> the event stream can be disabled on arm64 via the kernel command line.

Alternatively, could we condition arch_haltpoll_want() on
arch_timer_evtstrm_available(), as v7 did? A rough sketch of what I mean
is below.
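Something like this (illustrative only -- the exact signature of
arch_haltpoll_want() in this series is assumed here, not quoted from the
patches):

	/*
	 * Sketch only: gate haltpoll on the event stream, as v7 did.
	 * The signature of arch_haltpoll_want() is an assumption.
	 */
	bool arch_haltpoll_want(void)
	{
		/*
		 * Without the event stream there is no guaranteed wakeup
		 * for the WFE inside smp_cond_load_relaxed(), so don't
		 * offer haltpoll at all.
		 */
		return arch_timer_evtstrm_available();
	}

That keeps poll_idle() unchanged and simply keeps haltpoll from being
used on arm64 systems where WFE has no periodic wakeup.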

> 
> Does the code above break any other architecture? I'd say if you want
> something like this, it would be better to introduce a new
> smp_cond_load_timeout() API. The above looks like a hack that may only
> work on arm64 when the event stream is enabled.
> 
> A generic option is udelay() (on arm64 it would use WFE/WFET by
> default). I'm not sure how important it is for poll_idle(), but the
> downside of udelay() is that it won't be able to also poll
> need_resched() while waiting for the timeout. If this matters, you
> could instead make smaller udelay() calls. Yet another problem: I don't
> know how energy efficient udelay() is on x86 vs cpu_relax().
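
If it helps, the chunked-udelay() variant could be as simple as this
(sketch only; the 10us chunk size is an arbitrary assumption):

	while (!need_resched()) {
		if (local_clock_noinstr() - time_start > limit) {
			dev->poll_time_limit = true;
			break;
		}
		/* Small delay chunks so need_resched() is re-checked often. */
		udelay(10);
	}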
> 
> So maybe an smp_cond_load_timeout() would be better, implemented
> generically with cpu_relax(), while arm64 would use LDXR/WFE and rely
> on the event stream (or fall back to cpu_relax() if the event stream is
> disabled).
> 
> --
> Catalin
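
For reference, the generic fallback of such an smp_cond_load_timeout()
could look roughly like this (a sketch only -- the name, the argument
order, and the time_check_expr hook are assumptions, not an existing
API):

	/*
	 * Sketch: generic fallback built on cpu_relax(), along the lines
	 * suggested above. Name and arguments are assumptions.
	 */
	#define smp_cond_load_timeout(ptr, cond_expr, time_check_expr)	\
	({								\
		typeof(ptr) __PTR = (ptr);				\
		__unqual_scalar_typeof(*ptr) VAL;			\
		for (;;) {						\
			VAL = READ_ONCE(*__PTR);			\
			if (cond_expr)					\
				break;					\
			cpu_relax();					\
			if (time_check_expr)				\
				break;					\
		}							\
		(typeof(*ptr))VAL;					\
	})

poll_idle() could then wait with something like:

	smp_cond_load_timeout(&current_thread_info()->flags,
			      VAL & _TIF_NEED_RESCHED,
			      local_clock_noinstr() - time_start > limit);

with the time check repeated after the wait to set dev->poll_time_limit,
and an arm64 variant could use LDXR/WFE and the event stream, falling
back to the cpu_relax() loop above when the event stream is disabled.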




