On 2015-04-13 19:48, Avi Kivity wrote:
> On 04/13/2015 08:41 PM, Avi Kivity wrote:
>> On 04/13/2015 08:35 PM, Jan Kiszka wrote:
>>> On 2015-04-13 19:29, Avi Kivity wrote:
>>>> On 04/13/2015 10:01 AM, Jan Kiszka wrote:
>>>>> On 2015-04-07 07:43, Jan Kiszka wrote:
>>>>>> On 2015-04-05 19:12, Valentine Sinitsyn wrote:
>>>>>>> Hi Jan,
>>>>>>>
>>>>>>> On 05.04.2015 13:31, Jan Kiszka wrote:
>>>>>>>> studying the VM exit logic of Jailhouse, I was wondering when AMD's
>>>>>>>> vmload/vmsave can be avoided. Jailhouse as well as KVM currently use
>>>>>>>> these instructions unconditionally. However, I think both only need
>>>>>>>> GS.base, i.e. the per-cpu base address, to be saved and restored if no
>>>>>>>> user space exit or no CPU migration is involved (both is always true
>>>>>>>> for Jailhouse). Xen avoids vmload/vmsave on lightweight exits but it
>>>>>>>> also still uses rsp-based per-cpu variables.
>>>>>>>>
>>>>>>>> So the question boils down to what is generally faster:
>>>>>>>>
>>>>>>>> A) vmload
>>>>>>>>    vmrun
>>>>>>>>    vmsave
>>>>>>>>
>>>>>>>> B) wrmsrl(MSR_GS_BASE, guest_gs_base)
>>>>>>>>    vmrun
>>>>>>>>    rdmsrl(MSR_GS_BASE, guest_gs_base)
>>>>>>>>
>>>>>>>> Of course, KVM also has to take into account that heavyweight exits
>>>>>>>> still require vmload/vmsave, thus become more expensive with B) due to
>>>>>>>> the additional MSR accesses.
>>>>>>>>
>>>>>>>> Any thoughts or results of previous experiments?
>>>>>>> That's a good question, I also thought about it when I was finalizing
>>>>>>> Jailhouse AMD port. I tried "lightweight exits" with apic-demo but it
>>>>>>> didn't seem to affect the latency in any noticeable way. That's why I
>>>>>>> decided not to push the patch (in fact, I was even unable to find it
>>>>>>> now).
>>>>>>>
>>>>>>> Note however that how AMD chips store host state during VM switches are
>>>>>>> implementation-specific. I did my quick experiments on one CPU only, so
>>>>>>> your mileage may vary.
>>>>>>>
>>>>>>> Regarding your question, I feel B will be faster anyways but again I'm
>>>>>>> afraid that the gain could be within statistical error of the
>>>>>>> experiment.
>>>>>> It is, at least 160 cycles with hot caches on an AMD A6-5200 APU, more
>>>>>> towards 600 if they are colder (added some usleep to each loop in the
>>>>>> test).
>>>>>>
>>>>>> I've tested via vmmcall from guest userspace under Jailhouse. KVM should
>>>>>> be adjustable in a similar way. Attached the benchmark, patch will be in
>>>>>> the Jailhouse next branch soon. We need to check more CPU types, though.
>>>>> Avi, I found some preparatory patches of yours from 2010 [1]. Do you
>>>>> happen to remember if it was never completed for a technical reason?
>>>> IIRC, I came to the conclusion that it was impossible. Something about
>>>> TR.size not receiving a reasonable value. Let me see.
>>> To my understanding, TR doesn't play a role until we leave ring 0 again.
>>> Or what could make the CPU look for any of the fields in the 64-bit TSS
>>> before that?
>>
>> Exceptions that utilize the IST. I found a writeup [17] that
>> describes this, but I think it's even more impossible than that
>> writeup implies.
>>
>
> I think that Xen does (or did) something along the lines of disabling
> IST usage (by playing with the descriptors in the IDT) and then
> re-enabling them when exiting to userspace.

So we would reuse that active stack for the current IST users until
then. But I bet there are subtle details that prevent a simple switch at
IDT level. Hmm, no low-hanging fruit it seems...

>
>
>> [17] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/26712/

That thread proposed the complete IST removal. But, given that we still
have it 7 years later, I suppose that was not very welcome in general.

Thanks,
Jan

PS: For the Jailhouse readers: we don't use IST.
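
As a rough illustration of the heavyweight-exit trade-off raised at the
start of the thread, here is a minimal C sketch of a cost model. It is only
a sketch under stated assumptions: the struct and function names are made
up for illustration, the 160-cycle saving is the hot-cache figure quoted
above for a single CPU type, and the per-heavyweight-exit penalty of B) is
a placeholder that nobody in this thread has measured.

```c
#include <assert.h>

/*
 * Hypothetical cost model, not taken from Jailhouse or KVM.
 * light_saving:  cycles saved per lightweight exit by variant B)
 *                (e.g. ~160 with hot caches on the A6-5200 quoted above).
 * heavy_penalty: extra cycles per heavyweight exit under B), caused by the
 *                additional MSR accesses; an unmeasured assumption.
 */
struct exit_costs {
    double light_saving;
    double heavy_penalty;
};

/*
 * Average net cycles saved per exit by B) over A), given the fraction
 * of exits that are heavyweight (0.0 = none, 1.0 = all).
 * A negative result means B) is slower overall.
 */
double net_saving_per_exit(const struct exit_costs *c, double heavy_fraction)
{
    assert(heavy_fraction >= 0.0 && heavy_fraction <= 1.0);
    return (1.0 - heavy_fraction) * c->light_saving
           - heavy_fraction * c->heavy_penalty;
}

/*
 * Fraction of heavyweight exits at which A) and B) cost the same.
 * Below this fraction B) wins on average, above it A) wins.
 */
double break_even_heavy_fraction(const struct exit_costs *c)
{
    assert(c->light_saving >= 0.0 && c->heavy_penalty >= 0.0);
    assert(c->light_saving + c->heavy_penalty > 0.0);
    return c->light_saving / (c->light_saving + c->heavy_penalty);
}
```

With no heavyweight exits at all (the Jailhouse case, where neither a user
space exit nor a CPU migration occurs), heavy_fraction is 0 and the full
per-exit saving applies. For KVM, B) would only pay off while the share of
heavyweight exits stays below the break-even fraction, so both inputs would
have to be measured per CPU type before drawing conclusions.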
-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux