Am 30.01.2012 14:23, schrieb Avi Kivity: > On 01/30/2012 01:05 PM, Kevin Wolf wrote: >> Am 30.01.2012 11:24, schrieb Avi Kivity: >>> On 01/27/2012 09:23 PM, Kevin Wolf wrote: >>>> Task switches can switch between Protected Mode and VM86. The current >>>> mode must be updated during the task switch emulation so that the new >>>> segment selectors are interpreted correctly and privilege checks >>>> succeed. >>>> >>>> VMX code calculates the CPL from the code segment selector and rflags, >>>> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL >>>> of the code segment instead, so we must be sure to give the right one >>>> when updating the selector. >>> >>> svm has vmcb_save_area::cpl; it's independent of CS.RPL. >> >> Right, but cpl in the VMCB is updated when you reload a segment (and I >> think only then), > > Gah, it's broken. It should be qualified by more - real mode should > keep cpl at zero, vm86 at 3. And setting rflags.vm86 should update cpl > as well. > > For example, live migration while in real mode with cs&3!=0 or in vm86 > mode with cs&3==0 would set the wrong cpl. Ah, right, I didn't think of this case. Not sure if I should fix it in this series. If we do the right fixes to the task switch, it won't be strictly needed for the bug I'm fixing. >> so it's closely related. >> >>>> ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); >>>> + >>>> + if (ctxt->mode == X86EMUL_MODE_REAL) >>>> + desc.dpl = 0; >>> >>> Can't happen. >> >> set_segment_selector is only called during task switches right now, so >> yes, this is impossible. Want me to drop the check? Or BUG()? > > BUG()s are dangerous in rarely used code since they can be exploited as > a DoS. So maybe a WARN_ON_ONCE(). Ok. > >>>> @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, >>>> return emulate_gp(ctxt, 0); >>>> ctxt->_eip = tss->eip; >>>> ctxt->eflags = tss->eflags | 2; >>>> + >>>> + /* >>>> + * If we're switching between Protected Mode and VM86, we need to make >>>> + * sure to update the mode before loading the segment descriptors so >>>> + * that the selectors are interpreted correctly. >>>> + * >>>> + * Need to get it to the vcpu struct immediately because it influences >>>> + * the CPL which is checked at least when loading the segment >>>> + * descriptors and when pushing an error code to the new kernel stack. >>>> + */ >>>> + if (ctxt->eflags & X86_EFLAGS_VM) >>>> + ctxt->mode = X86EMUL_MODE_VM86; >>>> + else >>>> + ctxt->mode = X86EMUL_MODE_PROT32; >>>> + >>> >>> Shouldn't this be done after the set_segment_selector() block? My >>> interpretation of the SDM is that if a fault happens while loading >>> descriptors the fault happens with old segment cache, that is, it needs >>> to be interpreted according to the old mode. >> >> This is closely related to my argument with Gleb whether CPL changes >> when rflags is reloaded or if it only happens when cs is reloaded. I >> argued that it makes more sense to couple it with cs, so I guess I have >> to agree with you mostly. >> >> The SDM says that any faults during loading the segment descriptors >> happen in the context of the new task, and the task switch is completed >> before an exception is generated. So it shouldn't make any difference to >> the guest what the exact point is at which we change CPL, it's an KVM >> internal decision. >> >> So what about this order: >> >> 1. Reload general purpose registers and eflags without updating mode >> or writing back rflags to the vcpu struct. >> >> 2. Load segment selectors without changing the DPL yet. > > (or changing anything in the segment cache) > >> >> 3. Load segment descriptors, and disable the privilege checks in >> load_segment_descriptor using a new flag. > > I don't like doing things that aren't directly traceable to the SDM. > Perhaps we should pass the cpl as a parameter to > load_segment_descriptor(). Or we should ->set_cpl() just before. I like the idea of having a ->set_cpl(). For SVM it should be trivial to implement. For VMX, I think instead of clearing VCPU_EXREG_CPL, vcpu_vmx->cpl should directly be updated in set_rflags and set_segment (and in the new set_cpl, obviously). Would that be enough or would we have to avoid clearing it in all other places as well? Where would it be initialised if it's not enough? >> For SVM, this updates >> the CPL when cs is reloaded. > > > >> >> 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be >> cleaned up in a follow-up patch, so that VMX uses set_segment >> to update the CPL, like SVM does. >> >> Would this match your interpretation? > > Not exactly. I claim that the cpl is a separate state from > cs.cpl/ss.rpl/cr0.pe/eflags.vm - it cannot be derived just from those. > On the transition from real mode to protected mode, you can have (say) > cs=0x13 and cpl=0. But even with cs=0x13 cs.dpl would still be 0 (in the segment cache) before cs is reloaded and you switch to a real protected mode segment, right? > Outside of mode switches, it's good to have set_segment() update the cpl > automatically. But inside mode switches it can screw up further checks > or aborts. Can you give a specific example of how it screws things up? >>>> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) >>>> { >>>> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); >>>> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { >>>> .set_idt = emulator_set_idt, >>>> .get_cr = emulator_get_cr, >>>> .set_cr = emulator_set_cr, >>>> + .set_rflags = emulator_set_rflags, >>>> .cpl = emulator_get_cpl, >>>> .get_dr = emulator_get_dr, >>>> .set_dr = emulator_set_dr, >>> >>> It would be good to switch the entire emulator to use ->set_rflags() >>> instead of sometimes letting the caller do it. >> >> If we change the CPL only with a cs reload, set_rflags can be dropped >> completely in the end. > > That's attractive, yes. > > Suppose load_segment_descriptor(..., VCPU_SREG_LDTR) fails. What should > the new cpl be? Does it matter? I think the one matching the new cs/eflags. The SDM says that the task switch is completed without further segment checks and then an exception is thrown. Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html