Re: [patch 0/3] cpuidle-haltpoll driver (v2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 10, 2019 at 5:00 PM Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>
> On Fri, Jun 07, 2019 at 11:49:51AM +0200, Rafael J. Wysocki wrote:
> > On 6/4/2019 12:52 AM, Marcelo Tosatti wrote:
> > >The cpuidle-haltpoll driver allows the guest vcpus to poll for a specified
> > >amount of time before halting. This provides the following benefits
> > >to host side polling:
> > >
> > >         1) The POLL flag is set while polling is performed, which allows
> > >            a remote vCPU to avoid sending an IPI (and the associated
> > >            cost of handling the IPI) when performing a wakeup.
> > >
> > >         2) The HLT VM-exit cost can be avoided.
> > >
> > >The downside of guest side polling is that polling is performed
> > >even with other runnable tasks in the host.
> > >
> > >Results comparing halt_poll_ns and server/client application
> > >where a small packet is ping-ponged:
> > >
> > >host                                        --> 31.33
> > >halt_poll_ns=300000 / no guest busy spin    --> 33.40   (93.8%)
> > >halt_poll_ns=0 / guest_halt_poll_ns=300000  --> 32.73   (95.7%)
> > >
> > >For the SAP HANA benchmarks (where idle_spin is a parameter
> > >of the previous version of the patch, results should be the
> > >same):
> > >
> > >hpns == halt_poll_ns
> > >
> > >                           idle_spin=0/   idle_spin=800/    idle_spin=0/
> > >                           hpns=200000    hpns=0            hpns=800000
> > >DeleteC06T03 (100 thread) 1.76           1.71 (-3%)        1.78   (+1%)
> > >InsertC16T02 (100 thread) 2.14           2.07 (-3%)        2.18   (+1.8%)
> > >DeleteC00T01 (1 thread)   1.34           1.28 (-4.5%)           1.29   (-3.7%)
> > >UpdateC00T03 (1 thread)   4.72           4.18 (-12%)    4.53   (-5%)
> > >
> > >V2:
> > >
> > >- Move from x86 to generic code (Paolo/Christian).
> > >- Add auto-tuning logic (Paolo).
> > >- Add MSR to disable host side polling (Paolo).
> > >
> > >
> > >
> > First of all, please CC power management patches (including cpuidle,
> > cpufreq etc) to linux-pm@xxxxxxxxxxxxxxx (there are people on that
> > list who may want to see your changes before they go in) and CC
> > cpuidle material (in particular) to Peter Zijlstra.
> >
> > Second, I'm not a big fan of this approach to be honest, as it kind
> > of is a driver trying to play the role of a governor.
> >
> > We have a "polling state" already that could be used here in
> > principle so I wonder what would be wrong with that.  Also note that
> > there seems to be at least some code duplication between your code
> > and the "polling state" implementation, so maybe it would be
> > possible to do some things in a common way?
>
> Hi Rafael,
>
> After modifying poll_state.c to use a generic "poll time" driver
> callback [1] (since using a variable "target_residency" for that
> looks really ugly), would need a governor which does:
>
> haltpoll_governor_select_next_state()
>         if (prev_state was poll and evt happened on prev poll window) -> POLL.
>         if (prev_state == HLT)  -> POLL
>         otherwise               -> HLT
>
> And a "default_idle" cpuidle driver that:
>
> defaultidle_idle()
>         if (current_clr_polling_and_test()) {
>                 local_irq_enable();
>                 return index;
>         }
>         default_idle();
>         return
>
> Using such governor with any other cpuidle driver would
> be pointless (since it would enter the first state only
> and therefore not save power).
>
> Not certain about using the default_idle driver with
> other governors: one would rather use a driver that
> supports all states on a given machine.
>
> This combination of governor/driver pair, for the sake
> of sharing the idle loop, seems awkward to me.
> And fails the governor/driver separation: one will use the
> pair in practice.
>
> But i have no problem with it, so i'll proceed with that.
>
> Let me know otherwise.

If my understanding of your argumentation is correct, it is only
necessary to take the default_idle_call() branch of
cpuidle_idle_call() in the VM case, so it should be sufficient to
provide a suitable default_idle_call() which is what you seem to be
trying to do.

I might have been confused by the terminology used in the patch series
if that's the case.

Also, if that's the case, this is not cpuidle matter really.  It is a
matter of providing a better default_idle_call() for the arch at hand.

Thanks,
Rafael



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux