On Tue, May 24, 2011, Tian, Kevin wrote about "RE: [PATCH 20/31] nVMX: Exiting from L2 to L1": > > +vmcs12_guest_cr0(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) > > +{ > > + /* > > + * As explained above, we take a bit from GUEST_CR0 if we allowed the > > + * guest to modify it untrapped (vcpu->arch.cr0_guest_owned_bits), or > > + * if we did trap it - if we did so because L1 asked to trap this bit > > + * (vmcs12->cr0_guest_host_mask). Otherwise (bits we trapped but L1 > > + * didn't expect us to trap) we read from CR0_READ_SHADOW. > > + */ > > + unsigned long guest_cr0_bits = > > + vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask; > > + return (vmcs_readl(GUEST_CR0) & guest_cr0_bits) | > > + (vmcs_readl(CR0_READ_SHADOW) & ~guest_cr0_bits); > > +} > > Hi, Nadav, > > Not sure whether I get above operation wrong. This is one of the trickiest functions in nested VMX, which is why I added 15 lines of comments (!) on just two statements of code. > But it looks not exactly correct to me > in a glimpse. Say a bit set both in L0/L1's cr0_guest_host_mask. In such case that > bit from vmcs12_GUEST_CR0 resides in vmcs02_CR0_READ_SHADOW, however above > operation will make vmcs02_GUEST_CR0 bit returned instead. This behavior is correct: If a bit is set in L1's cr0_guest_host_mask (and in particular, if it is set in both L0's and L1's), we always exit to L1 when L2 changes this bit, and this bit cannot change while L2 is running, so naturally after the run vmcs02.guest_cr0 and vmcs12.guest_cr0 are still identical in that be. Copying that bit from vmcs02_CR0_READ_SHADOW, like you suggested, would be completely wrong in this case: When L1 set a bit in cr0_guest_host_mask, the vmcs02->cr0_read_shadow is vmcs12->cr0_read_shadow (see nested_read_cr0), and is just a pretense that L1 set up for L2 - it is NOT the real bit of guest_cr0, so copying it into guest_cr0 would be wrong. Note that this function is completely different from nested_read_cr0 (the new name), which behaves similar to what you suggested but serves a completely different (and in some respect, opposite) function. I think my comments in the code are clearer than what I just wrote here, so please take a look at them again, and let me know if you find any errors. > Instead of constructing vmcs12_GUEST_CR0 completely from vmcs02_GUEST_CR0, > why not just updating bits which can be altered while keeping the rest bits from > vmcs12_GUEST_CR0? Say something like: > > vmcs12->guest_cr0 &= vmcs12->cr0_guest_host_mask; /* keep unchanged bits */ > vmcs12->guest_cr0 |= (vmcs_readl(GUEST_CR0) & vcpu->arch.cr0_guest_owned_bits) | > (vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask)) I guess I could do something like this, but do you think it's clearer? I don't. Behind all the details, my formula emphasises that MOST cr0 bits can be just copied from vmcs02 to vmcs12 as is - and we only have to do something strange for special bits - where L0 wanted to trap but L1 didn't. In your formula, it looks like there are 3 different cases instead of 2. In any case, your formula is definitely not more correct, because the formulas are in fact equivalent - let me prove: If, instead of taking the "unchanged bits" (as you call them) from vmcs12->guest_cr0, you take them from vmcs02->guest_cr0 (you can, because they couldn't have changed), you end up with *exactly* the same formula I used. Here is the proof: yourformula = (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask) | (vmcs_readl(GUEST_CR0) & vcpu->arch.cr0_guest_owned_bits) | (vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask)) Now because of the "unchanged bits", (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask) == (vmcs02->guest_cr0 & vmcs12->cr0_guest_host_mask) == (and note that vmcs02->guest_cr0 is vmcs_readl(GUEST_CR0)) so this in yourformula, it becomes yourformula = (vmcs_readl(GUEST_CR0) & vmcs12->cr0_guest_host_mask) | (vmcs_readl(GUEST_CR0) & vcpu->arch.cr0_guest_owned_bits) | (vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask)) or, simplifying yourformula = (vmcs_readl(GUEST_CR0) & (vmcs12->cr0_guest_host_mask | vcpu->arch.cr0_guest_owned_bits) | (vmcs_readl(CR0_READ_SHADOW) & ~( vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask)) now, using the name I used: unsigned long guest_cr0_bits = vcpu->arch.cr0_guest_owned_bits | vmcs12->cr0_guest_host_mask; you end up with yourforumula = (vmcs_readl(GUEST_CR0) & guest_cr0_bits) | (vmcs_readl(CR0_READ_SHADOW) & ~guest_cr0_bits ) Which is, believe it or not, exactly my formula :-) -- Nadav Har'El | Tuesday, May 24 2011, 20 Iyyar 5771 nyh@xxxxxxxxxxxxxxxxxxx |----------------------------------------- Phone +972-523-790466, ICQ 13349191 |From the Linux getopt(3) manpage: "BUGS: http://nadav.harel.org.il |This manpage is confusing." -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html