On 04/13/2015 08:41 PM, Avi Kivity wrote:
On 04/13/2015 08:35 PM, Jan Kiszka wrote:
On 2015-04-13 19:29, Avi Kivity wrote:
On 04/13/2015 10:01 AM, Jan Kiszka wrote:
On 2015-04-07 07:43, Jan Kiszka wrote:
On 2015-04-05 19:12, Valentine Sinitsyn wrote:
Hi Jan,
On 05.04.2015 13:31, Jan Kiszka wrote:
studying the VM exit logic of Jailhouse, I was wondering when AMD's
vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
these instructions unconditionally. However, I think both only need
GS.base, i.e. the per-cpu base address, to be saved and restored if
neither a user space exit nor a CPU migration is involved (both
conditions always hold for Jailhouse). Xen avoids vmload/vmsave on
lightweight exits, but it also still uses rsp-based per-cpu variables.
So the question boils down to what is generally faster:
A) vmload
   vmrun
   vmsave
B) wrmsrl(MSR_GS_BASE, guest_gs_base)
   vmrun
   rdmsrl(MSR_GS_BASE, guest_gs_base)
Of course, KVM also has to take into account that heavyweight exits
still require vmload/vmsave and thus become more expensive with B) due
to the additional MSR accesses.
Any thoughts or results of previous experiments?
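To make the trade-off between A) and B) concrete, here is a minimal model in C. It does not execute any privileged instruction: vmload, vmsave, vmrun and the MSR accesses are replaced by counters, and all names (struct vcpu_model, run_a, run_b) are hypothetical and taken from neither Jailhouse nor KVM. How the host gets its own GS.base back after the exit is deliberately left out. The sketch only shows how many operations each variant issues when some exits are heavyweight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counters standing in for the real (privileged) operations. */
struct op_counts {
    unsigned vmload, vmsave, vmrun, wrmsr_gs, rdmsr_gs;
};

/* Hypothetical per-vCPU state for this model only. */
struct vcpu_model {
    uint64_t guest_gs_base; /* value that B) swaps via MSR_GS_BASE */
    uint64_t hw_gs_base;    /* models the live GS.base register */
    struct op_counts ops;
};

/* Variant A: vmload/vmsave around every vmrun, regardless of exit type. */
void run_a(struct vcpu_model *v)
{
    assert(v != NULL);
    v->ops.vmload++;
    v->ops.vmrun++;
    v->ops.vmsave++;
}

/*
 * Variant B: only GS.base is swapped around vmrun. A heavyweight exit
 * (assumed here to mean an exit to user space) still pays for one
 * vmsave/vmload pair on top of the MSR accesses.
 */
void run_b(struct vcpu_model *v, bool heavyweight)
{
    assert(v != NULL);
    v->hw_gs_base = v->guest_gs_base;   /* wrmsrl(MSR_GS_BASE, ...) */
    v->ops.wrmsr_gs++;
    v->ops.vmrun++;
    v->guest_gs_base = v->hw_gs_base;   /* rdmsrl(MSR_GS_BASE, ...) */
    v->ops.rdmsr_gs++;
    if (heavyweight) {
        v->ops.vmsave++;
        v->ops.vmload++;
    }
}
```

With three lightweight exits and one heavyweight exit, the model issues four vmload/vmsave pairs for A) but only one pair plus eight MSR accesses for B). Whether that is a net win depends on the real instruction costs, which the model does not capture.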
That's a good question, I also thought about it when I was finalizing
the Jailhouse AMD port. I tried "lightweight exits" with apic-demo, but
it didn't seem to affect the latency in any noticeable way. That's why
I decided not to push the patch (in fact, I was even unable to find it
now).
Note, however, that how AMD chips store host state during VM switches
is implementation-specific. I did my quick experiments on one CPU only,
so your mileage may vary.
Regarding your question, I feel B will be faster anyway, but again I'm
afraid that the gain could be within the statistical error of the
experiment.
It is, by at least 160 cycles with hot caches on an AMD A6-5200 APU, and
closer to 600 if they are colder (I added some usleep to each loop in
the test).
I've tested via vmmcall from guest userspace under Jailhouse. KVM should
be adjustable in a similar way. Attached the benchmark, patch will be in
the Jailhouse next branch soon. We need to check more CPU types, though.
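The attached benchmark is not reproduced in this thread. As a rough illustration of the hot versus cold cache measurement described above, a user-space timing loop could look like the sketch below. It is an assumption-laden stand-in: it measures nanoseconds via clock_gettime() instead of cycles, reports the minimum over all iterations (my choice, the thread does not say which statistic was used), and times a hypothetical placeholder_op() where the real test issued vmmcall from the guest.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() and nanosleep() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Counts calls to the placeholder; the real test would issue vmmcall. */
volatile unsigned placeholder_calls;

void placeholder_op(void)
{
    placeholder_calls++;
}

/* Monotonic timestamp in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Call op() iters times and return the smallest observed duration in ns.
 * A non-zero pause_us sleeps before each call so that caches have a
 * chance to cool down, similar in spirit to the usleep in the test.
 */
uint64_t min_latency_ns(void (*op)(void), unsigned iters, unsigned pause_us)
{
    uint64_t best = UINT64_MAX;

    assert(op != NULL && iters > 0);
    for (unsigned i = 0; i < iters; i++) {
        if (pause_us) {
            struct timespec pause;

            pause.tv_sec = pause_us / 1000000u;
            pause.tv_nsec = (long)(pause_us % 1000000u) * 1000L;
            nanosleep(&pause, NULL);
        }
        uint64_t start = now_ns();
        op();
        uint64_t elapsed = now_ns() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Calling min_latency_ns(placeholder_op, 1000, 0) approximates the hot-cache case, while a non-zero pause_us approximates the colder one. Absolute numbers from such a loop are not comparable to the cycle counts quoted above.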
Avi, I found some preparatory patches of yours from 2010 [1]. Do you
happen to remember if it was never completed for a technical reason?
IIRC, I came to the conclusion that it was impossible. Something about
TR.size not receiving a reasonable value. Let me see.
To my understanding, TR doesn't play a role until we leave ring 0 again.
Or what could make the CPU look for any of the fields in the 64-bit TSS
before that?
Exceptions that utilize the IST. I found a writeup [17] that
describes this, but I think it's even more impossible than that
writeup implies.
I think that Xen does (or did) something along the lines of disabling
IST usage (by playing with the descriptors in the IDT) and then
re-enabling them when exiting to userspace.
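As an illustration of what "playing with the descriptors in the IDT" could mean, the sketch below clears and later restores the 3-bit IST index of 64-bit IDT gate descriptors. This is not Xen's code, and I have not checked how Xen actually does it. The helper names are hypothetical, and the functions operate on an ordinary in-memory array rather than on the live IDT.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit mode IDT gate descriptor, 16 bytes. */
struct idt_gate {
    uint16_t offset_low;
    uint16_t selector;
    uint8_t  ist;        /* bits 0-2: IST index, 0 means IST is not used */
    uint8_t  type_attr;
    uint16_t offset_mid;
    uint32_t offset_high;
    uint32_t reserved;
};

_Static_assert(sizeof(struct idt_gate) == 16, "gate must be 16 bytes");

#define IST_MASK    0x7u
#define NUM_VECTORS 256

/* Clear the IST index of every gate, remembering old values in saved[]. */
void idt_disable_ist(struct idt_gate *idt, uint8_t *saved)
{
    for (int i = 0; i < NUM_VECTORS; i++) {
        saved[i] = idt[i].ist & IST_MASK;
        idt[i].ist &= (uint8_t)~IST_MASK;
    }
}

/* Put back the IST indices recorded by idt_disable_ist(). */
void idt_restore_ist(struct idt_gate *idt, const uint8_t *saved)
{
    for (int i = 0; i < NUM_VECTORS; i++)
        idt[i].ist = (uint8_t)((idt[i].ist & ~IST_MASK) |
                               (saved[i] & IST_MASK));
}
```

In a real hypervisor the first helper would presumably run before entering the lightweight path and the second before exiting to userspace, with the usual care about exceptions arriving while the table is being modified.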
[17] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/26712/
Jan