Re: [PATCH] x86: add cpuidle_kvm driver to allow guest side halt polling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 17/05/19 19:48, Marcelo Tosatti wrote:
> 
> The cpuidle_kvm driver allows the guest vcpus to poll for a specified
> amount of time before halting. This provides the following benefits
> to host side polling:
> 
> 	1) The POLL flag is set while polling is performed, which allows
> 	   a remote vCPU to avoid sending an IPI (and the associated
>  	   cost of handling the IPI) when performing a wakeup.
> 
> 	2) The HLT VM-exit cost can be avoided.
> 
> The downside of guest side polling is that polling is performed
> even with other runnable tasks in the host.
> 
> Results comparing halt_poll_ns and server/client application
> where a small packet is ping-ponged:
> 
> host                                        --> 31.33	
> halt_poll_ns=300000 / no guest busy spin    --> 33.40	(93.8%)
> halt_poll_ns=0 / guest_halt_poll_ns=300000  --> 32.73	(95.7%)
> 
> For the SAP HANA benchmarks (where idle_spin is a parameter 
> of the previous version of the patch, results should be the
> same):
> 
> hpns == halt_poll_ns
> 
>                           idle_spin=0/   idle_spin=800/	   idle_spin=0/
> 			  hpns=200000    hpns=0            hpns=800000
> DeleteC06T03 (100 thread) 1.76           1.71 (-3%)        1.78	  (+1%)
> InsertC16T02 (100 thread) 2.14     	 2.07 (-3%)        2.18   (+1.8%)
> DeleteC00T01 (1 thread)   1.34 		 1.28 (-4.5%)	   1.29   (-3.7%)
> UpdateC00T03 (1 thread)	  4.72		 4.18 (-12%)	   4.53   (-5%)

Hi Marcelo,

some quick observations:

1) This is actually not KVM-specific, so the name and placement of the
docs should be adjusted.

2) Regarding KVM-specific code, however, we could add an MSR so that KVM
disables halt_poll_ns for this VM when this is active in the guest?

3) The spin time could use the same adaptive algorithm that KVM uses in
the host.

Thanks,

Paolo


> ---
>  Documentation/virtual/kvm/guest-halt-polling.txt |   39 ++++++++
>  arch/x86/Kconfig                                 |    9 +
>  arch/x86/kernel/Makefile                         |    1 
>  arch/x86/kernel/cpuidle_kvm.c                    |  105 +++++++++++++++++++++++
>  arch/x86/kernel/process.c                        |    2 
>  5 files changed, 155 insertions(+), 1 deletion(-)
> 
> Index: linux-2.6.git/arch/x86/Kconfig
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/Kconfig	2019-04-22 13:49:42.858303265 -0300
> +++ linux-2.6.git/arch/x86/Kconfig	2019-05-16 14:18:41.254852745 -0300
> @@ -805,6 +805,15 @@
>  	  underlying device model, the host provides the guest with
>  	  timing infrastructure such as time of day, and system time
>  
> +config KVM_CPUIDLE
> +	tristate "KVM cpuidle driver"
> +	depends on KVM_GUEST
> +	default y
> +	help
> +	  This option enables KVM cpuidle driver, which allows to poll
> +	  before halting in the guest (more efficient than polling in the
> +	  host via halt_poll_ns for some scenarios).
> +
>  config PVH
>  	bool "Support for running PVH guests"
>  	---help---
> Index: linux-2.6.git/arch/x86/kernel/Makefile
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/Makefile	2019-04-22 13:49:42.869303331 -0300
> +++ linux-2.6.git/arch/x86/kernel/Makefile	2019-05-17 12:59:51.673274881 -0300
> @@ -112,6 +112,7 @@
>  obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
>  
>  obj-$(CONFIG_KVM_GUEST)		+= kvm.o kvmclock.o
> +obj-$(CONFIG_KVM_CPUIDLE)	+= cpuidle_kvm.o
>  obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
>  obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
>  obj-$(CONFIG_PARAVIRT_CLOCK)	+= pvclock.o
> Index: linux-2.6.git/arch/x86/kernel/process.c
> ===================================================================
> --- linux-2.6.git.orig/arch/x86/kernel/process.c	2019-04-22 13:49:42.876303374 -0300
> +++ linux-2.6.git/arch/x86/kernel/process.c	2019-05-17 13:19:18.055435117 -0300
> @@ -580,7 +580,7 @@
>  	safe_halt();
>  	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
>  }
> -#ifdef CONFIG_APM_MODULE
> +#if defined(CONFIG_APM_MODULE) || defined(CONFIG_KVM_CPUIDLE_MODULE)
>  EXPORT_SYMBOL(default_idle);
>  #endif
>  
> Index: linux-2.6.git/arch/x86/kernel/cpuidle_kvm.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.git/arch/x86/kernel/cpuidle_kvm.c	2019-05-17 13:38:02.553941356 -0300
> @@ -0,0 +1,105 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * cpuidle driver for KVM guests.
> + *
> + * Copyright 2019 Red Hat, Inc. and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + * Authors: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> + */
> +
> +#include <linux/init.h>
> +#include <linux/cpuidle.h>
> +#include <linux/module.h>
> +#include <linux/timekeeping.h>
> +#include <linux/sched/idle.h>
> +
> +unsigned int guest_halt_poll_ns;
> +module_param(guest_halt_poll_ns, uint, 0644);
> +
> +static int kvm_enter_idle(struct cpuidle_device *dev,
> +			  struct cpuidle_driver *drv, int index)
> +{
> +	int do_halt = 0;
> +
> +	/* No polling */
> +	if (guest_halt_poll_ns == 0) {
> +		if (current_clr_polling_and_test()) {
> +			local_irq_enable();
> +			return index;
> +		}
> +		default_idle();
> +		return index;
> +	}
> +
> +	local_irq_enable();
> +	if (!current_set_polling_and_test()) {
> +		ktime_t now, end_spin;
> +
> +		now = ktime_get();
> +		end_spin = ktime_add_ns(now, guest_halt_poll_ns);
> +
> +		while (!need_resched()) {
> +			cpu_relax();
> +			now = ktime_get();
> +
> +			if (!ktime_before(now, end_spin)) {
> +				do_halt = 1;
> +				break;
> +			}
> +		}
> +	}
> +
> +	if (do_halt) {
> +		/*
> +		 * No events while busy spin window passed,
> +		 * halt.
> +		 */
> +		local_irq_disable();
> +		if (current_clr_polling_and_test()) {
> +			local_irq_enable();
> +			return index;
> +		}
> +		default_idle();
> +	} else {
> +		current_clr_polling();
> +	}
> +
> +	return index;
> +}
> +
> +static struct cpuidle_driver kvm_idle_driver = {
> +	.name = "kvm_idle",
> +	.owner = THIS_MODULE,
> +	.states = {
> +		{ /* entry 0 is for polling */ },
> +		{
> +			.enter			= kvm_enter_idle,
> +			.exit_latency		= 0,
> +			.target_residency	= 0,
> +			.power_usage		= -1,
> +			.name			= "KVM",
> +			.desc			= "KVM idle",
> +		},
> +	},
> +	.safe_state_index = 0,
> +	.state_count = 2,
> +};
> +
> +static int __init kvm_cpuidle_init(void)
> +{
> +	return cpuidle_register(&kvm_idle_driver, NULL);
> +}
> +
> +static void __exit kvm_cpuidle_exit(void)
> +{
> +	cpuidle_unregister(&kvm_idle_driver);
> +}
> +
> +module_init(kvm_cpuidle_init);
> +module_exit(kvm_cpuidle_exit);
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Marcelo Tosatti <mtosatti@xxxxxxxxxx>");
> +
> Index: linux-2.6.git/Documentation/virtual/kvm/guest-halt-polling.txt
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.git/Documentation/virtual/kvm/guest-halt-polling.txt	2019-05-17 13:36:39.274703710 -0300
> @@ -0,0 +1,39 @@
> +KVM guest halt polling
> +======================
> +
> +The cpuidle_kvm driver allows the guest vcpus to poll for a specified
> +amount of time before halting. This provides the following benefits
> +to host side polling:
> +
> +	1) The POLL flag is set while polling is performed, which allows
> +	   a remote vCPU to avoid sending an IPI (and the associated
> + 	   cost of handling the IPI) when performing a wakeup.
> +
> +	2) The HLT VM-exit cost can be avoided.
> +
> +The downside of guest side polling is that polling is performed
> +even with other runnable tasks in the host.
> +
> +Module Parameters
> +=================
> +
> +The cpuidle_kvm module has 1 tuneable module parameter: guest_halt_poll_ns,
> +the amount of time, in nanoseconds, that polling is performed before
> +halting.
> +
> +This module parameter can be set from the debugfs files in:
> +
> +	/sys/module/cpuidle_kvm/parameters/
> +
> +Further Notes
> +=============
> +
> +- Care should be taken when setting the guest_halt_poll_ns parameter as a
> +large value has the potential to drive the cpu usage to 100% on a machine which
> +would be almost entirely idle otherwise.
> +
> +- The effective amount of time that polling is performed is the host poll
> +value (see halt-polling.txt) plus guest_halt_poll_ns. If all guests
> +on a host system support and have properly configured guest_halt_poll_ns,
> +then setting halt_poll_ns to 0 in the host is probably the best choice.
> +
> 




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux