Re: [patch 0/3] cpuidle-haltpoll driver (v2)

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Fri, 7 Jun 2019 14:16:47 -0300

On Fri, Jun 07, 2019 at 11:49:51AM +0200, Rafael J. Wysocki wrote:
> On 6/4/2019 12:52 AM, Marcelo Tosatti wrote:
> >The cpuidle-haltpoll driver allows the guest vcpus to poll for a specified
> >amount of time before halting. This provides the following benefits
> >to host side polling:
> >
> >         1) The POLL flag is set while polling is performed, which allows
> >            a remote vCPU to avoid sending an IPI (and the associated
> >            cost of handling the IPI) when performing a wakeup.
> >
> >         2) The HLT VM-exit cost can be avoided.
> >
> >The downside of guest side polling is that polling is performed
> >even with other runnable tasks in the host.
> >
> >Results comparing halt_poll_ns and server/client application
> >where a small packet is ping-ponged:
> >
> >host                                        --> 31.33
> >halt_poll_ns=300000 / no guest busy spin    --> 33.40   (93.8%)
> >halt_poll_ns=0 / guest_halt_poll_ns=300000  --> 32.73   (95.7%)
> >
> >For the SAP HANA benchmarks (where idle_spin is a parameter
> >of the previous version of the patch, results should be the
> >same):
> >
> >hpns == halt_poll_ns
> >
> >                           idle_spin=0/   idle_spin=800/    idle_spin=0/
> >                           hpns=200000    hpns=0            hpns=800000
> >DeleteC06T03 (100 thread) 1.76           1.71 (-3%)        1.78   (+1%)
> >InsertC16T02 (100 thread) 2.14           2.07 (-3%)        2.18   (+1.8%)
> >DeleteC00T01 (1 thread)   1.34           1.28 (-4.5%)	   1.29   (-3.7%)
> >UpdateC00T03 (1 thread)   4.72           4.18 (-12%)	   4.53   (-5%)
> >
> >V2:
> >
> >- Move from x86 to generic code (Paolo/Christian).
> >- Add auto-tuning logic (Paolo).
> >- Add MSR to disable host side polling (Paolo).
> >
> >
> >
> First of all, please CC power management patches (including cpuidle,
> cpufreq etc) to linux-pm@xxxxxxxxxxxxxxx (there are people on that
> list who may want to see your changes before they go in) and CC
> cpuidle material (in particular) to Peter Zijlstra.

Ok, Peter is CC'ed, will include linux-pm@vger in the next patches.

> Second, I'm not a big fan of this approach to be honest, as it kind
> of is a driver trying to play the role of a governor.

Well, its not trying to choose which idle state to enter, because
there is only one idle state to enter when virtualized (HLT).

> We have a "polling state" already that could be used here in
> principle so I wonder what would be wrong with that.  

There is no "target residency" concept in the virtualized use-case 
(which is what poll_state.c uses to calculate the poll time).

Moreover the cpuidle-haltpoll driver uses an adaptive logic to
tune poll time (which appparently does not make sense for poll_state).

The only thing they share is the main loop structure:

"while (!need_resched()) {
	cpu_relax();
	now = ktime_get();
}"

> Also note that
> there seems to be at least some code duplication between your code
> and the "polling state" implementation, so maybe it would be
> possible to do some things in a common way?

Again, its just the main loop structure that is shared:
there is no target residency in the virtualized case, 
and we want an adaptive scheme.

Lets think about deduplication: you would have a cpuidle driver,
with a fake "target residency". 

Now, it makes no sense to use a governor for the virtualized case
(again, there is only one idle state: HLT, the host governor is
used for the actual idle state decision in the host).

So i fail to see how i would go about integrating these two 
and what are the advantages of doing so ?