On 2013-03-04 16:30, Nadav Har'El wrote:
> On Mon, Mar 04, 2013, Jan Kiszka wrote about "Re: [PATCH] KVM: nVMX: Fix setting of CR0 and CR4 in guest mode":
>>>>>>   if (is_guest_mode(vcpu)) {
>>>>>> -         /*
>>>>>> -          * We get here when L2 changed cr0 in a way that did not change
>>>>>> -          * any of L1's shadowed bits (see nested_vmx_exit_handled_cr),
>>>>>> -          * but did change L0 shadowed bits. This can currently happen
>>>>>> -          * with the TS bit: L0 may want to leave TS on (for lazy fpu
>>>>>> -          * loading) while pretending to allow the guest to change it.
>>>>>> -          */
>>>>> Can't say I understand this patch yet, but it looks like the comment is
>>>>> still valid. Why have you removed it?
>>>>
>>>> L0 allows L1 or L2 at most to own TS, the rest is host-owned. I think
>>>> the comment was always misleading.
>>>>
>>> I do not see how it is misleading. For everything but TS we will not get
>>> here (if L1 is kvm). For TS we will get here if L1 allows L2 to change
>>> it, but L0 does not.
>>
>> For everything *but guest-owned* we will get here, thus for most CR0
>> accesses (bit-wise, not regarding frequency).
>
> For most CR0 bits, L1 (at least, a KVM one) will shadow (trap) them, so
> we won't get to this point you modified at all... Instead,
> nested_vmx_exit_handled_cr() would notice that a shadowed-by-L1 bit
> was modified so an exit to L1 is required. We only get to that code
> you changed if a bit was modified that L1 did *not* want to trap, but L0 did.
> This is definitely not the bit-wise majority of the cases - unless you
> have an L1 that does not trap most of the CR0 bits.
>
> But I'm more worried about the actual code change :-) I didn't
> understand if there's a situation where the existing code did something
> wrong, or why it was wrong. Did you check the lazy-FPU-loading (TS bit)
> aspect of your new code? To effectively check this, what I had to do
> is to run on all of L0, L1, and L2, long runs of parallel "make" (make -j3) -
> concurrently.
> Even code which doesn't do floating-point calculations uses
> the FPU sometimes for its wide registers, so all these processes, guests
> and guest's guests, compete for the FPU, exercising very well this code
> path. If the TS bit is handled wrongly, some of these make processes
> will die, when one of the compilations dies of SIGSEGV (forgetting to
> set the FPU registers leads to some uninitialized pointers being used),
> so it's quite easy to exercise this.

I'm not focusing on hosting KVM but arbitrary guests. So I looked at it
generically, defining which bits L1 should effectively hand over to L0,
as real hardware would do when setting CR0/4. And that value was
wrongly calculated, breaking in practice when you allow unrestricted
guest mode, i.e. when L2 starts playing with PE and PG - just to name one
prominent scenario. By focusing on TS for KVM-on-KVM without
unrestricted guest mode, you weren't able to trigger this.

But if you can tell me where I may pass wrong values to
kvm_set_cr0/4, I'm all ears.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
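
For readers following the thread, the routing of an L2 CR0 write that Nadav describes can be sketched roughly as below. This is only an illustration under simplifying assumptions, not the kernel code: the function name route_l2_cr0_write, the enum cr0_route and the two mask parameters are hypothetical, and "changed" is taken as old value XOR new value rather than the comparison against the read shadow that real VMX hardware performs.

```c
#include <stdint.h>

/* Architectural CR0 bit positions (PE = bit 0, TS = bit 3, PG = bit 31). */
#define CR0_PE (1ULL << 0)
#define CR0_TS (1ULL << 3)
#define CR0_PG (1ULL << 31)

/* Hypothetical result type: who ends up handling the write. */
enum cr0_route {
    CR0_NO_EXIT,        /* bit is guest-owned, no one traps it */
    CR0_HANDLED_BY_L0,  /* L1 did not want the trap, but L0 did */
    CR0_EXIT_TO_L1      /* a bit shadowed by L1 was modified */
};

/*
 * Hypothetical helper: l1_mask holds the CR0 bits L1 shadows (traps)
 * for L2, l0_mask holds the bits L0 wants to trap for its own reasons
 * (e.g. TS for lazy FPU loading).
 */
static enum cr0_route route_l2_cr0_write(uint64_t old_cr0, uint64_t new_cr0,
                                         uint64_t l1_mask, uint64_t l0_mask)
{
    uint64_t changed = old_cr0 ^ new_cr0;  /* simplification, see lead-in */

    if (changed & l1_mask)
        return CR0_EXIT_TO_L1;     /* L1 asked to see this change */
    if (changed & l0_mask)
        return CR0_HANDLED_BY_L0;  /* the code path under discussion */
    return CR0_NO_EXIT;
}
```

With an L1 that traps PE and PG but leaves TS to L2, a TS flip lands in the L0-handled case, while an L2 that clears PE (Jan's unrestricted-guest scenario with such an L1) is routed to L1. If L1 traps neither, PE and PG changes also reach the L0 path, which is the situation Jan says the TS-only testing did not cover.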