On Mon, Sep 02, 2013 at 11:06:53AM +0200, Jan Kiszka wrote:
> On 2013-09-02 10:21, Gleb Natapov wrote:
> > On Thu, Aug 08, 2013 at 04:26:28PM +0200, Jan Kiszka wrote:
> >> Likely a typo, but a fatal one as kvm_set_cr0 performs checks on the
> > Not a typo :) That is what Avi asked for during the initial nested VMX
> > review: http://markmail.org/message/hhidqyhbo2mrgxxc
>
> Yeah, should rephrase this.
>
> >
> > But there is at least one transition check that kvm_set_cr0() does that
> > should not be done during vmexit emulation, namely the CS.L bit check, so I
> > tend to agree that kvm_set_cr0() is not appropriate here, at least not as
> > it is.
>
> kvm_set_cr0() is for emulating explicit guest changes. It is not the
> proper interface for implicit, vendor-dependent changes like this one.
>
Agreed, the problem is that we do not have a proper interface for implicit
changes like this one (I do not see why it is vendor-dependent; SVM also
restores host state in a similar way).

> > But can we skip the other checks kvm_set_cr0() does? For instance,
> > what prevents us from loading CR0.PG = 1, EFER.LME = 1 and CR4.PAE = 0
> > during nested vmexit? What _should_ prevent it is the vmentry check from
> > 26.2.4:
> >
> > If the "host address-space size" VM-exit control is 1, the following
> > must hold:
> >  - Bit 5 of the CR4 field (corresponding to CR4.PAE) is 1.
> >
> > But I do not see that we do that check on vmentry.
> >
> > What about NW/CD bit checks, or reserved bits checks? 27.5.1 says:
> >   The following bits are not modified:
> >    For CR0, ET, CD, NW; bits 63:32 (on processors that support Intel 64
> >    architecture), 28:19, 17, and 15:6; and any bits that are fixed in
> >    VMX operation (see Section 23.8).
> >
> > But again, the current vmexit code does not emulate this properly and just
> > sets everything from host_cr0. vmentry should also preserve all those
> > bits, but it looks like it doesn't either.
> >
>
> Yes, there is surely more to improve.
> Do you think the lacking checks
> can cause troubles for L0, or is this just imprecise emulation that can
> be addressed separately?
>
The missing checks may cause L0 to fail guest entry, which will trigger an
internal error. If that is exploitable by L0 userspace it is a serious
problem; if only the L0 kernel can trigger it, then less so. I remember Avi
was concerned that KVM code may depend on all registers being consistent
and could be exploited otherwise. I cannot prove or disprove this theory :),
but if it is the case then even the L0 kernel case is problematic.

--
			Gleb.
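
For illustration only, the two SDM rules quoted above (the 26.2.4 CR4.PAE check and the 27.5.1 list of CR0 bits left unmodified on VM exit) could be sketched in standalone C roughly as below. This is not KVM code: every macro and function name here is made up for the example, and the "bits that are fixed in VMX operation" part of 27.5.1 is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values follow the architectural bit positions. */
#define SKETCH_CR4_PAE               (1ULL << 5)  /* CR4.PAE is bit 5 */
#define SKETCH_EXIT_HOST_ADDR_SPACE  (1U << 9)    /* "host address-space size" VM-exit control */

/*
 * CR0 bits that 27.5.1 says are not modified on VM exit:
 * ET (bit 4), bits 15:6, bit 17, bits 28:19, NW (bit 29), CD (bit 30),
 * and bits 63:32. Bits fixed in VMX operation are NOT modelled here.
 */
#define SKETCH_CR0_VMEXIT_PRESERVED  0xFFFFFFFF7FFAFFD0ULL

/* 26.2.4: if the control is 1, the host CR4 field must have PAE set. */
bool sketch_host_cr4_pae_ok(uint32_t vmexit_controls, uint64_t host_cr4_field)
{
    if (vmexit_controls & SKETCH_EXIT_HOST_ADDR_SPACE)
        return (host_cr4_field & SKETCH_CR4_PAE) != 0;
    return true;
}

/*
 * 27.5.1: take the preserved bits from the CR0 value in effect before the
 * exit and everything else from the host CR0 field.
 */
uint64_t sketch_vmexit_load_cr0(uint64_t current_cr0, uint64_t host_cr0_field)
{
    return (current_cr0 & SKETCH_CR0_VMEXIT_PRESERVED) |
           (host_cr0_field & ~SKETCH_CR0_VMEXIT_PRESERVED);
}
```

With something along these lines, a host state of CR4.PAE = 0 combined with the "host address-space size" control set would be rejected at nested vmentry, and CD/NW/ET would carry over on nested vmexit instead of being taken from host_cr0.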