On 04/07/19 10:32, Rafael J. Wysocki wrote:
> On Thu, Jul 4, 2019 at 1:59 AM Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>>
>> (rebased against queue branch of kvm.git tree)
>>
>> The cpuidle-haltpoll driver with haltpoll governor allows the guest
>> vcpus to poll for a specified amount of time before halting.
>> This provides the following benefits to host side polling:
>>
>> 1) The POLL flag is set while polling is performed, which allows
>>    a remote vCPU to avoid sending an IPI (and the associated
>>    cost of handling the IPI) when performing a wakeup.
>>
>> 2) The VM-exit cost can be avoided.
>>
>> The downside of guest side polling is that polling is performed
>> even with other runnable tasks in the host.
>>
>> Results comparing halt_poll_ns and server/client application
>> where a small packet is ping-ponged:
>>
>> host                                        --> 31.33
>> halt_poll_ns=300000 / no guest busy spin    --> 33.40 (93.8%)
>> halt_poll_ns=0 / guest_halt_poll_ns=300000  --> 32.73 (95.7%)
>>
>> For the SAP HANA benchmarks (where idle_spin is a parameter
>> of the previous version of the patch, results should be the
>> same):
>>
>> hpns == halt_poll_ns
>>
>>                            idle_spin=0/   idle_spin=800/  idle_spin=0/
>>                            hpns=200000    hpns=0          hpns=800000
>> DeleteC06T03 (100 thread)  1.76           1.71 (-3%)      1.78 (+1%)
>> InsertC16T02 (100 thread)  2.14           2.07 (-3%)      2.18 (+1.8%)
>> DeleteC00T01 (1 thread)    1.34           1.28 (-4.5%)    1.29 (-3.7%)
>> UpdateC00T03 (1 thread)    4.72           4.18 (-12%)     4.53 (-5%)
>>
>> V2:
>>
>> - Move from x86 to generic code (Paolo/Christian)
>> - Add auto-tuning logic (Paolo)
>> - Add MSR to disable host side polling (Paolo)
>>
>> V3:
>>
>> - Do not be specific about HLT VM-exit in the documentation (Ankur Arora)
>> - Mark tuning parameters static and __read_mostly (Andrea Arcangeli)
>> - Add WARN_ON if host does not support poll control (Joao Martins)
>> - Use sched_clock and cleanup haltpoll_enter_idle (Peter Zijlstra)
>> - Mark certain functions in kvm.c as static (kernel test robot)
>> - Remove tracepoints as they use RCU from extended quiescent state (kernel
>>   test robot)
>>
>> V4:
>> - Use a haltpoll governor, use poll_state.c poll code (Rafael J. Wysocki)
>>
>> V5:
>> - Take latency requirement into consideration (Rafael J. Wysocki)
>> - Set target_residency/exit_latency to 1 (Rafael J. Wysocki)
>> - Do not load cpuidle driver if not virtualized (Rafael J. Wysocki)
>>
>> V6:
>> - Switch from callback to poll_limit_ns variable in cpuidle device structure
>>   (Rafael J. Wysocki)
>> - Move last_used_idx to cpuidle device structure (Rafael J. Wysocki)
>> - Drop per-cpu device structure in haltpoll governor (Rafael J. Wysocki)
>
> It looks good to me now, but I have some cpuidle changes in the work
> that will clash in some changes in this series if not rebased on top
> of it, so IMO it would make sense for me to get patches [1-4/5] at
> least into my queue. I can expose an immutable branch with them for
> the KVM tree to consume. I can take the last patch in the series as
> well if I get an ACK for it.
>
> Would that work for everybody?

Rafael, please take the whole series in your tree.

Thanks!

Paolo