On 6/4/2019 12:52 AM, Marcelo Tosatti wrote:
The cpuidle-haltpoll driver allows the guest vcpus to poll for a specified
amount of time before halting. This provides the following benefits
over host side polling:
1) The POLL flag is set while polling is performed, which allows
a remote vCPU to avoid sending an IPI (and the associated
cost of handling the IPI) when performing a wakeup.
2) The HLT VM-exit cost can be avoided.
The downside of guest side polling is that polling is performed
even with other runnable tasks in the host.
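
To illustrate the two benefits and the downside listed above, here is a minimal user-space sketch in Python. It is not the driver code from the patch. The class VCpu, its fields, and the on_poll hook are hypothetical names introduced only for this illustration, and the HLT exit is merely counted rather than executed.

```python
import time


class VCpu:
    """Toy model of a guest vCPU that polls before halting (illustration only)."""

    def __init__(self):
        self.polling = False      # stands in for the POLL flag
        self.need_wakeup = False  # set by a remote vCPU that has work for us
        self.halts = 0            # counts simulated HLT VM-exits

    def wake(self):
        """Remote wakeup. Returns True if an IPI would have been needed."""
        self.need_wakeup = True
        # A polling target notices need_wakeup by itself, so no IPI is sent.
        return not self.polling

    def idle(self, poll_ns, on_poll=None):
        """Poll for up to poll_ns nanoseconds, then halt if nothing arrived."""
        self.polling = True
        deadline = time.monotonic_ns() + poll_ns
        try:
            while time.monotonic_ns() < deadline:
                if on_poll is not None:
                    on_poll(self)  # hypothetical hook to inject a wakeup
                if self.need_wakeup:
                    self.need_wakeup = False
                    return "polled"  # woken while polling: no HLT exit
        finally:
            self.polling = False
        self.halts += 1  # poll window expired: pay for the HLT VM-exit
        return "halted"
```

Note that the loop spins for the whole poll window regardless of what else the host could be running, which is the downside mentioned above.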
Results comparing host side polling (halt_poll_ns) and guest side
polling (guest_halt_poll_ns) for a server/client application where a
small packet is ping-ponged:
host --> 31.33
halt_poll_ns=300000 / no guest busy spin --> 33.40 (93.8%)
halt_poll_ns=0 / guest_halt_poll_ns=300000 --> 32.73 (95.7%)
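
The percentages in parentheses are consistent with the host figure divided by each guest figure. That reading is an assumption, since the posting does not state the formula or the unit. A short Python check, where relative_to_host is a hypothetical helper name:

```python
HOST = 31.33  # bare-metal figure from the results above


def relative_to_host(measured, baseline=HOST):
    """Return the baseline as a percentage of the measured value, to one decimal."""
    return round(100.0 * baseline / measured, 1)


host_side_polling = relative_to_host(33.40)   # halt_poll_ns=300000, no guest spin
guest_side_polling = relative_to_host(32.73)  # halt_poll_ns=0, guest_halt_poll_ns=300000
print(host_side_polling, guest_side_polling)
```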
For the SAP HANA benchmarks (where idle_spin is a parameter
of the previous version of the patch, results should be the
same):
hpns == halt_poll_ns

                            idle_spin=0/   idle_spin=800/   idle_spin=0/
                            hpns=200000    hpns=0           hpns=800000
DeleteC06T03 (100 thread)   1.76           1.71 (-3%)       1.78 (+1%)
InsertC16T02 (100 thread)   2.14           2.07 (-3%)       2.18 (+1.8%)
DeleteC00T01 (1 thread)     1.34           1.28 (-4.5%)     1.29 (-3.7%)
UpdateC00T03 (1 thread)     4.72           4.18 (-12%)      4.53 (-5%)
V2:
- Move from x86 to generic code (Paolo/Christian).
- Add auto-tuning logic (Paolo).
- Add MSR to disable host side polling (Paolo).
First of all, please CC power management patches (including cpuidle,
cpufreq etc) to linux-pm@xxxxxxxxxxxxxxx (there are people on that list
who may want to see your changes before they go in) and CC cpuidle
material (in particular) to Peter Zijlstra.
Second, I'm not a big fan of this approach to be honest, as it kind of
is a driver trying to play the role of a governor.
We have a "polling state" already that could be used here in principle
so I wonder what would be wrong with that. Also note that there seems
to be at least some code duplication between your code and the "polling
state" implementation, so maybe it would be possible to do some things
in a common way?