> From: Nadav Har'El > Sent: Tuesday, May 17, 2011 3:54 AM > > This patch implements nested_vmx_vmexit(), called when the nested L2 guest > exits and we want to run its L1 parent and let it handle this exit. > > Note that this will not necessarily be called on every L2 exit. L0 may decide > to handle a particular exit on its own, without L1's involvement; In that > case, L0 will handle the exit, and resume running L2, without running L1 and > without calling nested_vmx_vmexit(). The logic for deciding whether to handle > a particular exit in L1 or in L0, i.e., whether to call nested_vmx_vmexit(), > will appear in a separate patch below. > > Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx> > --- > arch/x86/kvm/vmx.c | 257 > +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 257 insertions(+) > > --- .before/arch/x86/kvm/vmx.c 2011-05-16 22:36:49.000000000 +0300 > +++ .after/arch/x86/kvm/vmx.c 2011-05-16 22:36:49.000000000 +0300 > @@ -6203,6 +6203,263 @@ static int nested_vmx_run(struct kvm_vcp > return 1; > } > > +/* > + * On a nested exit from L2 to L1, vmcs12.guest_cr0 might not be up-to-date > + * because L2 may have changed some cr0 bits directly (see > CRO_GUEST_HOST_MASK) > + * without L0 trapping the change and updating vmcs12. > + * This function returns the value we should put in vmcs12.guest_cr0. It's not > + * enough to just return the current (vmcs02) GUEST_CR0 - that may not be > the > + * guest cr0 that L1 thought it was giving its L2 guest; It is possible that > + * L1 wished to allow its guest to set some cr0 bit directly, but we (L0) asked > + * to trap this change and instead set just the read shadow bit. If this is the > + * case, we need to copy these read-shadow bits back to vmcs12.guest_cr0, > where > + * L1 believes they already are. > + */ > +static inline unsigned long > +vmcs12_guest_cr0(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) > +{ > + /* > + * As explained above, we take a bit from GUEST_CR0 if we allowed the > + * guest to modify it untrapped (vcpu->arch.cr0_guest_owned_bits), or > + * if we did trap it - if we did so because L1 asked to trap this bit > + * (vmcs12->cr0_guest_host_mask). Otherwise (bits we trapped but L1 > + * didn't expect us to trap) we read from CR0_READ_SHADOW. > + */ > + unsigned long guest_cr0_bits = > + vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask; > + return (vmcs_readl(GUEST_CR0) & guest_cr0_bits) | > + (vmcs_readl(CR0_READ_SHADOW) & ~guest_cr0_bits); > +} Hi, Nadav, Not sure whether I get above operation wrong. But it looks not exactly correct to me in a glimpse. Say a bit set both in L0/L1's cr0_guest_host_mask. In such case that bit from vmcs12_GUEST_CR0 resides in vmcs02_CR0_READ_SHADOW, however above operation will make vmcs02_GUEST_CR0 bit returned instead. Instead of constructing vmcs12_GUEST_CR0 completely from vmcs02_GUEST_CR0, why not just updating bits which can be altered while keeping the rest bits from vmcs12_GUEST_CR0? Say something like: vmcs12->guest_cr0 &= vmcs12->cr0_guest_host_mask; /* keep unchanged bits */ vmcs12->guest_cr0 |= (vmcs_readl(GUEST_CR0) & vcpu->arch.cr0_guest_owned_bits) | (vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask)) Thanks Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html