On Wed, Feb 15, 2017 at 9:37 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: > > > On 15/02/2017 17:58, Jim Mattson wrote: >> Flushing the VMCS cache on L2->userspace exit solves (2), but it means >> that L2->userspace exit is not transparent to the guest. Is this >> better than (3)? > > Yes, I think it is better. The really nasty part of > "KVM_GET_NESTED_STATE is not transparent to the guest" is that > KVM_GET_NESTED_STATE typically happens while the guest is quiescent and > userspace doesn't expect guest memory to change. (2) is a special case > of this. > > Instead, flushing the VMCS on L2->userspace exit happens during KVM_RUN, > which should not break any invariants that userspace relies on. It is certainly within the bounds of the Intel specification to flush the VMCS cache at any time, but I think it would be better if the VCPU behaved more deterministically (with respect to guest-visible events). >> There is some redundancy in the VMCS cache state currently saved by >> GET_NESTED_STATE and the VCPU state saved by GET_SREGS and others, but >> it does simplify the vmcs02 initialization. (As an aside, this might >> be a lot easier if we eliminated the vmcs02 entirely, as I've >> suggested in the past.) > > Do you have patches to eliminate the vmcs02 or was it just a proposal? Just a proposal. I think a separate vmcs02 has the potential for faster emulated VM-entry/VM-exit, but at the cost of greater code complexity. A unified vmcs0X could certainly be as fast as the existing implementation and a lot simpler. > > As things stand now, almost all of the vmcs01 has to be reloaded anyway > from the vmcs12's host state. But as I said earlier we could be > different if we compared vmcs01 and vmcs12's state (especially the > descriptor caches) and optimize load_vmcs12_host_state consequently. > Somebody needs to write the code and benchmark it with vmexit.flat... > >> To be honest, I'm not sure what it is that you are trying to optimize >> for. Is your concern the extra 4K of VCPU state? I would argue that >> the VMCS cache is part of the physical CPU state (though on a physical >> CPU, not every field is cached), and that it therefore makes sense to >> include the VMCS cache as part of the VCPU state, apart from guest >> physical memory. > > It's not really the extra 4K, but rather the ugly "blob" aspect of it > and having to maintain compatibility for it. Regarding the blobbiness, > see for example KVM_GET/SET_XSAVE, where you can actually marshal and > unmarshal the contents of the XSAVE area into the registers. Regarding > compatibility, I'm worried that once someone looks at SVM more closely, > we will have to store the VMCB in GET_NESTED_STATE too. Instead of a "blob," we could pass it out to userspace as a 'struct vmcs12,' padded out to 4k for compatibility purposes. Then userspace could marshal and unmarshal the contents, and it would be a little easier than marshalling and unmarshalling the same contents in guest physical memory. I don't know what to say about SVM. I think we should be prepared to store the VMCB at some point, even if we don't now. BTW, eventually I'm going to want to have two entries in the VMCS cache to improve the performance of virtualized VMCS shadowing. > > Paolo > >> On Wed, Feb 15, 2017 at 8:14 AM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: >>> >>> >>> On 15/02/2017 17:06, Jim Mattson wrote: >>>> The VMCS cache can be safely flushed to guest memory at any time. >>>> However, I think your proposal has some unfortunate consequences: >>>> >>>> 1. If KVM_SET_NESTED_STATE is asynchronous, then any subsequent set >>>> operations (e.g. KVM_SET_SREGS) may be overridden on the next KVM_RUN. >>>> 2. Guest memory (at least the cached VMCS page(s)) has to be saved >>>> after KVM_GET_NESTED_STATE. >>>> 3. KVM_GET_NESTED_STATE is not transparent to the guest. >>> >>> I can't choose which is the worst of the three. :) >>> >>> A better one perhaps is to flush the VMCS cache on L2->userspace exit, >>> since that should be pretty rare (suggested by David). I think that >>> would help at least with (2) and (3). >>> >>> As to (1), after KVM_SET_NESTED_STATE sets the in-guest-mode flag you >>> don't really need to reload all of the vmcs12 into vmcs02. Only the >>> host state needs to be reloaded, while the guest state is set with >>> KVM_SET_SREGS and others. >>> >>> Paolo