Re: SVM: vmload/vmsave-free VM exits?

On 2015-04-05 19:12, Valentine Sinitsyn wrote:
> Hi Jan,
> 
> On 05.04.2015 13:31, Jan Kiszka wrote:
>> studying the VM exit logic of Jailhouse, I was wondering when AMD's
>> vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
>> these instructions unconditionally. However, I think both only need
>> GS.base, i.e. the per-cpu base address, to be saved and restored if no
>> user space exit or no CPU migration is involved (both are always true for
>> Jailhouse). Xen avoids vmload/vmsave on lightweight exits but it also
>> still uses rsp-based per-cpu variables.
>>
>> So the question boils down to what is generally faster:
>>
>> A) vmload
>>     vmrun
>>     vmsave
>>
>> B) wrmsrl(MSR_GS_BASE, guest_gs_base)
>>     vmrun
>>     rdmsrl(MSR_GS_BASE, guest_gs_base)
>>
>> Of course, KVM also has to take into account that heavyweight exits
>> still require vmload/vmsave, thus become more expensive with B) due to
>> the additional MSR accesses.
>>
>> Any thoughts or results of previous experiments?
> That's a good question, I also thought about it when I was finalizing
> Jailhouse AMD port. I tried "lightweight exits" with apic-demo but it
> didn't seem to affect the latency in any noticeable way. That's why I
> decided not to push the patch (in fact, I was even unable to find it now).
> 
> Note, however, that how AMD chips store host state during VM switches is
> implementation-specific. I did my quick experiments on one CPU only, so
> your mileage may vary.
> 
> Regarding your question, I feel B will be faster anyway, but again I'm
> afraid that the gain could be within the statistical error of the experiment.

It is faster, by at least 160 cycles with hot caches on an AMD A6-5200 APU,
and closer to 600 if they are colder (I added a usleep to each loop
iteration in the test).

I've tested via vmmcall from guest userspace under Jailhouse. KVM should
be adjustable in a similar way. The benchmark is attached; the patch will
be in the Jailhouse next branch soon. We need to check more CPU types, though.
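
For reference, the colder-cache variant amounts to something like the
sketch below. It is not the attached benchmark: measure_avg(),
read_ticks() and null_op() are hypothetical names made up for
illustration, null_op() merely stands in for the actual vmmcall, and the
clock_gettime() fallback is an assumption added only so that the sketch
builds on non-x86 machines.

```c
#define _DEFAULT_SOURCE		/* for usleep() and clock_gettime() on glibc */

#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Timestamp source: TSC on x86-64, a nanosecond clock elsewhere. */
static inline uint64_t read_ticks(void)
{
#if defined(__x86_64__)
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
#else
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/*
 * Hypothetical helper: average cost of op() over 'loops' runs. Before each
 * run it sleeps for 'idle_us' microseconds so that caches can cool down;
 * idle_us == 0 gives the hot-cache variant.
 */
static uint64_t measure_avg(void (*op)(void), unsigned int loops,
			    unsigned int idle_us)
{
	uint64_t start, sum = 0;
	unsigned int n;

	if (loops == 0)
		return 0;

	for (n = 0; n < loops; n++) {
		if (idle_us)
			usleep(idle_us);	/* kept outside the timed window */
		start = read_ticks();
		op();
		sum += read_ticks() - start;
	}
	return sum / loops;
}

/* Placeholder operation; the real test issues the hypercall here. */
static void null_op(void)
{
	__asm__ __volatile__("" : : : "memory");
}
```

Calling measure_avg(null_op, loops, 0) and then measure_avg(null_op,
loops, idle_us) with some non-zero idle_us gives the hot and cold numbers
for the same operation. The sleep sits outside the timed window, so only
its effect on cache state shows up in the result.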

Jan

/*
 * VM exit benchmark using a hypercall
 *
 * Copyright (c) Siemens AG, 2015
 *
 * Authors:
 *  Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
 *
 * This work is licensed under the terms of the GNU GPL, version 2.  See
 * the COPYING file in the top-level directory.
 */

#ifndef __x86_64__
#error only x86-64 supported
#endif

#include <stdbool.h>
#include <stdio.h>

#define LOOPS			1000000

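/* CPUID leaf 1, ECX bit 5: Intel VMX is available */
#define X86_FEATURE_VMX		(1UL << 5)

static inline unsigned long cpuid_ecx(void)
{
	unsigned long eax = 1, val;

	/* cpuid overwrites eax, ebx, ecx and edx, so eax must be in/out */
	asm volatile("cpuid" : "+a" (eax), "=c" (val) : : "ebx", "edx");
	return val;
}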

static inline __attribute__((always_inline)) unsigned long long read_tsc(void)
{
	unsigned long long hi, lo;

	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return (hi << 32) | lo;
}

int main(void)
{
	/* Intel CPUs exit via vmcall, AMD CPUs via vmmcall */
	bool use_vmcall = !!(cpuid_ecx() & X86_FEATURE_VMX);
	unsigned long long start, sum = 0;
	unsigned long ret;
	unsigned int n;

	for (n = 0; n < LOOPS; n++) {
		/* rax carries the hypercall code in and may be overwritten */
		if (use_vmcall) {
			start = read_tsc();
			asm volatile("vmcall" : "=a" (ret) : "a" (-1) : "memory");
			sum += read_tsc() - start;
		} else {
			start = read_tsc();
			asm volatile("vmmcall" : "=a" (ret) : "a" (-1) : "memory");
			sum += read_tsc() - start;
		}
	}
	(void)ret;	/* the result is not evaluated */
	printf("Null hypercall: %llu cycles\n", sum / LOOPS);

	return 0;
}
